feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture

- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate)
  to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
catlog22
2026-03-24 16:54:48 +08:00
parent 54283e5dbb
commit 1e560ab8e8
334 changed files with 28996 additions and 35516 deletions


@@ -1,183 +1,85 @@
---
name: team-frontend-debug
argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"feature list or bug description\""
description: Frontend debugging team using Chrome DevTools MCP. Dual-mode feature-list testing or bug-report debugging. Triggers on "team-frontend-debug", "frontend debug".
allowed-tools: spawn_agent(*), wait_agent(*), send_input(*), close_agent(*), report_agent_job_result(*), request_user_input(*), Read(*), Write(*), Edit(*), Bash(*), Glob(*), Grep(*), mcp__chrome-devtools__*(*)
---
# Frontend Debug Team
## Auto Mode
When `--yes` or `-y`: Auto-confirm task decomposition, skip interactive validation, use defaults.
## Usage
Dual-mode frontend debugging: feature-list testing or bug-report debugging, powered by Chrome DevTools MCP.
```bash
$team-frontend-debug "Test features: login, dashboard, user profile at localhost:3000"
$team-frontend-debug "Bug: clicking save button on /settings causes white screen"
$team-frontend-debug -y "Test: 1. User registration 2. Email verification 3. Password reset"
$team-frontend-debug --continue "tfd-login-bug-20260308"
```
**Flags**:
- `-y, --yes`: Skip all confirmations (auto mode)
- `-c, --concurrency N`: Max concurrent agents within each wave (default: 2)
- `--continue`: Resume existing session
**Output Directory**: `.workflow/.csv-wave/{session-id}/`
**Core Output**: `tasks.csv` (master state) + `results.csv` (final) + `discoveries.ndjson` (shared exploration) + `context.md` (human-readable report)
---
## Overview
Dual-mode frontend debugging: feature-list testing or bug-report debugging, powered by Chrome DevTools MCP. Roles: tester (test-pipeline), reproducer (debug-pipeline), analyzer, fixer, verifier. Supports conditional skip (all tests pass -> no downstream tasks), iteration loops (analyzer requesting more evidence, verifier triggering re-fix), and Chrome DevTools-based browser interaction.
**Execution Model**: Hybrid -- CSV wave pipeline (primary) + individual agent spawn (secondary)
## Architecture
```
+-------------------------------------------------------------------+
| FRONTEND DEBUG WORKFLOW |
+-------------------------------------------------------------------+
| |
| Phase 0: Pre-Wave Interactive (Input Analysis) |
| +- Parse user input (feature list or bug report) |
| +- Detect mode: test-pipeline or debug-pipeline |
| +- Extract: base URL, features/steps, evidence plan |
| +- Output: refined requirements for decomposition |
| |
| Phase 1: Requirement -> CSV + Classification |
| +- Select pipeline (test or debug) |
| +- Build dependency graph from pipeline definition |
| +- Classify tasks: csv-wave | interactive (exec_mode) |
| +- Compute dependency waves (topological sort) |
| +- Generate tasks.csv with wave + exec_mode columns |
| +- User validates task breakdown (skip if -y) |
| |
| Phase 2: Wave Execution Engine (Extended) |
| +- For each wave (1..N): |
| | +- Execute pre-wave interactive tasks (if any) |
| | +- Build wave CSV (filter csv-wave tasks for this wave) |
| | +- Inject previous findings into prev_context column |
| | +- spawn_agents_on_csv(wave CSV) |
| | +- Execute post-wave interactive tasks (if any) |
| | +- Merge all results into master tasks.csv |
| | +- Conditional skip: TEST-001 with 0 issues -> done |
| | +- Iteration: ANALYZE needs more evidence -> REPRODUCE-002 |
| | +- Re-fix: VERIFY fails -> FIX-002 -> VERIFY-002 |
| +- discoveries.ndjson shared across all modes (append-only) |
| |
| Phase 3: Post-Wave Interactive (Completion Action) |
| +- Pipeline completion report with debug summary |
| +- Interactive completion choice (Archive/Keep/Export) |
| +- Final aggregation / report |
| |
| Phase 4: Results Aggregation |
| +- Export final results.csv |
| +- Generate context.md with all findings |
| +- Display summary: completed/failed/skipped per wave |
| +- Offer: view results | retry failed | done |
| |
+-------------------------------------------------------------------+
```
```
Skill(skill="team-frontend-debug", args="feature list or bug description")
|
SKILL.md (this file) = Router
|
+--------------+--------------+
| |
no --role flag --role <name>
| |
Coordinator Worker
roles/coordinator/role.md roles/<name>/role.md
|
+-- analyze input → select pipeline → dispatch → spawn → STOP
|
┌──────────────────────────┼──────────────────────┐
v v v
[test-pipeline] [debug-pipeline] [shared]
tester(DevTools) reproducer(DevTools) analyzer
fixer
verifier
```
---
## Pipeline Modes
| Input Pattern | Pipeline | Flow |
|---------------|----------|------|
| Feature list / function checklist / test items | `test-pipeline` | TEST -> ANALYZE -> FIX -> VERIFY |
| Bug report / error description / crash report | `debug-pipeline` | REPRODUCE -> ANALYZE -> FIX -> VERIFY |
### Pipeline Selection Keywords
| Keywords | Pipeline |
|----------|----------|
| feature, test, list, check, verify functions, validate | `test-pipeline` |
| bug, error, crash, broken, white screen, not working | `debug-pipeline` |
| performance, slow, latency, memory leak | `debug-pipeline` (perf dimension) |
| Ambiguous / unclear | request_user_input to clarify |
## Role Registry
| Role | Path | Prefix | Inner Loop |
|------|------|--------|------------|
| coordinator | [roles/coordinator/role.md](roles/coordinator/role.md) | — | — |
| tester | [roles/tester/role.md](roles/tester/role.md) | TEST-* | true |
| reproducer | [roles/reproducer/role.md](roles/reproducer/role.md) | REPRODUCE-* | false |
| analyzer | [roles/analyzer/role.md](roles/analyzer/role.md) | ANALYZE-* | false |
| fixer | [roles/fixer/role.md](roles/fixer/role.md) | FIX-* | true |
| verifier | [roles/verifier/role.md](roles/verifier/role.md) | VERIFY-* | false |
---
## Role Router
Parse `$ARGUMENTS`:
- Has `--role <name>` → Read `roles/<name>/role.md`, execute Phase 2-4
- No `--role` → Read `roles/coordinator/role.md`, execute entry router
## Task Classification Rules
Each task is classified by `exec_mode`:
| exec_mode | Mechanism | Criteria |
|-----------|-----------|----------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot, structured I/O, no multi-round interaction |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round, progress updates, inner loop |
**Classification Decision**:
| Task Property | Classification |
|---------------|---------------|
| Feature testing with inner loop (tester iterates over features) | `csv-wave` |
| Bug reproduction (single pass) | `csv-wave` |
| Root cause analysis (single pass) | `csv-wave` |
| Code fix implementation | `csv-wave` |
| Fix verification (single pass) | `csv-wave` |
| Conditional skip gate (evaluating TEST results) | `interactive` |
| Pipeline completion action | `interactive` |
## Shared Constants
- **Session prefix**: `TFD`
- **Session path**: `.workflow/.team/TFD-<slug>-<date>/`
- **CLI tools**: `ccw cli --mode analysis` (read-only), `ccw cli --mode write` (modifications)
- **Message bus**: `mcp__ccw-tools__team_msg(session_id=<session-id>, ...)`
## Workspace Resolution
Coordinator MUST resolve paths at Phase 2 before spawning workers:
1. Run `Bash({ command: "pwd" })` → capture `project_root` (absolute path)
2. `skill_root = <project_root>/.claude/skills/team-frontend-debug`
3. Store in `team-session.json`:
```json
{ "project_root": "/abs/path/to/project", "skill_root": "/abs/path/to/skill" }
```
4. All worker `role_spec` values MUST use `<skill_root>/roles/<role>/role.md` (absolute)
---
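In the pseudocode idiom used elsewhere in this file, the four steps reduce to the following sketch (`Bash` is assumed to return the command's stdout; `role` and `sessionFolder` come from the surrounding coordinator context):

```
// Sketch of Phase 2 path resolution (not verbatim coordinator code)
const projectRoot = Bash({ command: "pwd" }).trim()                    // step 1
const skillRoot = `${projectRoot}/.claude/skills/team-frontend-debug`  // step 2
Write(`${sessionFolder}/team-session.json`, JSON.stringify({           // step 3
  project_root: projectRoot,
  skill_root: skillRoot
}, null, 2))
const roleSpec = `${skillRoot}/roles/${role}/role.md`                  // step 4: always absolute
```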
## CSV Schema
### tasks.csv (Master State)
```csv
id,title,description,role,pipeline_mode,base_url,evidence_dimensions,deps,context_from,exec_mode,wave,status,findings,artifacts_produced,issues_count,verdict,error
"TEST-001","Feature testing","PURPOSE: Test all features from list | Success: All features tested with evidence","tester","test-pipeline","http://localhost:3000","screenshot;console;network","","","csv-wave","1","pending","","","","",""
"ANALYZE-001","Root cause analysis","PURPOSE: Analyze discovered issues | Success: RCA for each issue","analyzer","test-pipeline","","console;network","TEST-001","TEST-001","csv-wave","2","pending","","","","",""
```
**Columns**:
| Column | Phase | Description |
|--------|-------|-------------|
| `id` | Input | Unique task identifier (PREFIX-NNN: TEST, REPRODUCE, ANALYZE, FIX, VERIFY) |
| `title` | Input | Short task title |
| `description` | Input | Detailed task description with PURPOSE/TASK/CONTEXT/EXPECTED/CONSTRAINTS |
| `role` | Input | Role name: `tester`, `reproducer`, `analyzer`, `fixer`, `verifier` |
| `pipeline_mode` | Input | Pipeline: `test-pipeline` or `debug-pipeline` |
| `base_url` | Input | Target URL for browser-based tasks (empty for non-browser tasks) |
| `evidence_dimensions` | Input | Semicolon-separated evidence types: `screenshot`, `console`, `network`, `snapshot`, `performance` |
| `deps` | Input | Semicolon-separated dependency task IDs |
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
| `exec_mode` | Input | `csv-wave` or `interactive` |
| `wave` | Computed | Wave number (computed by topological sort, 1-based) |
| `status` | Output | `pending` -> `completed` / `failed` / `skipped` |
| `findings` | Output | Key discoveries or implementation notes (max 500 chars) |
| `artifacts_produced` | Output | Semicolon-separated paths of produced artifacts |
| `issues_count` | Output | Number of issues found (tester/analyzer), empty for others |
| `verdict` | Output | Verification verdict: `pass`, `pass_with_warnings`, `fail` (verifier only) |
| `error` | Output | Error message if failed (empty if success) |
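The wave engine below calls `parseCsv` and `toCsv` without defining them. A minimal sketch of what they must do (an assumption, not the skill's actual helpers; it handles quoted fields and escaped quotes, but not embedded newlines):

```javascript
// Split one CSV line, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = []
  let cur = '', inQuotes = false
  for (let i = 0; i < line.length; i++) {
    const ch = line[i]
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cur += '"'; i++ }
      else if (ch === '"') inQuotes = false
      else cur += ch
    } else if (ch === '"') inQuotes = true
    else if (ch === ',') { fields.push(cur); cur = '' }
    else cur += ch
  }
  fields.push(cur)
  return fields
}

// Parse CSV text into row objects keyed by the header row.
// Note: all values come back as strings (e.g. `wave` needs numeric coercion).
function parseCsv(text) {
  const lines = text.trim().split('\n')
  const header = splitCsvLine(lines[0])
  return lines.slice(1).map(line => {
    const cells = splitCsvLine(line)
    return Object.fromEntries(header.map((h, i) => [h, cells[i] ?? '']))
  })
}

// Serialize row objects back to CSV, quoting every field.
function toCsv(rows) {
  const header = Object.keys(rows[0])
  const esc = v => `"${String(v ?? '').replace(/"/g, '""')}"`
  return [header.join(','), ...rows.map(r => header.map(h => esc(r[h])).join(','))].join('\n')
}
```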
### Per-Wave CSV (Temporary)
Each wave generates a temporary `wave-{N}.csv` with extra `prev_context` column (csv-wave tasks only).
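`buildPrevContext`, used by the wave engine to fill this column, is not defined in this file. One plausible sketch, concatenating findings from the tasks listed in `context_from`:

```javascript
// Sketch of buildPrevContext (assumed helper, not defined in this file):
// collect findings/artifacts from the upstream tasks named in context_from.
function buildPrevContext(task, tasks) {
  const ids = (task.context_from || '').split(';').filter(Boolean)
  return ids.map(id => {
    const src = tasks.find(t => t.id === id)
    if (!src) return `${id}: (not found)`
    return `${id} [${src.status}]: ${src.findings || ''}` +
      (src.artifacts_produced ? ` | artifacts: ${src.artifacts_produced}` : '')
  }).join('\n')
}
```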
---
## Agent Registry (Interactive Agents)
| Agent | Role File | Pattern | Responsibility | Position |
|-------|-----------|---------|----------------|----------|
| Conditional Skip Gate | agents/conditional-skip-gate.md | 2.3 (send_input cycle) | Evaluate TEST results and skip downstream if no issues | post-wave |
| Iteration Handler | agents/iteration-handler.md | 2.3 (send_input cycle) | Handle analyzer's need_more_evidence request | post-wave |
| Completion Handler | agents/completion-handler.md | 2.3 (send_input cycle) | Handle pipeline completion action (Archive/Keep/Export) | standalone |
> **COMPACT PROTECTION**: Agent files are execution documents. When context compression occurs, **you MUST immediately `Read` the corresponding agent.md** to reload.
---
Absolute `role_spec` paths (see Workspace Resolution above) ensure workers always receive a resolvable path regardless of their working directory.
## Chrome DevTools MCP Tools
All browser inspection operations use Chrome DevTools MCP. Tester, reproducer, and verifier are the primary consumers.
| Tool | Purpose |
|------|---------|
@@ -197,625 +99,93 @@ All browser inspection operations use Chrome DevTools MCP. Tester, reproducer, a
| `mcp__chrome-devtools__wait_for` | Wait for element/text |
| `mcp__chrome-devtools__list_pages` | List open browser tabs |
| `mcp__chrome-devtools__select_page` | Switch active tab |
| `mcp__chrome-devtools__press_key` | Press keyboard keys |
---
## Output Artifacts
| File | Purpose | Lifecycle |
|------|---------|-----------|
| `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
| `wave-{N}.csv` | Per-wave input (temporary, csv-wave tasks only) | Created before wave, deleted after |
| `results.csv` | Final export of all task results | Created in Phase 4 |
| `discoveries.ndjson` | Shared exploration board (all agents, both modes) | Append-only, carries across waves |
| `context.md` | Human-readable execution report | Created in Phase 4 |
| `task-analysis.json` | Phase 0/1 output: mode, features/steps, dimensions | Created in Phase 1 |
| `role-instructions/` | Per-role instruction templates for CSV agents | Created in Phase 1 |
| `artifacts/` | All deliverables: test reports, RCA reports, fix changes, verification reports | Created by agents |
| `evidence/` | Screenshots, snapshots, network logs, performance traces | Created by tester/reproducer/verifier |
| `interactive/{id}-result.json` | Results from interactive tasks | Created per interactive task |
---
## Session Structure
```
.workflow/.csv-wave/{session-id}/
+-- tasks.csv            # Master state (all tasks, both modes)
+-- results.csv          # Final results export
+-- discoveries.ndjson   # Shared discovery board (all agents)
+-- context.md           # Human-readable report
+-- task-analysis.json   # Phase 1 analysis output
+-- wave-{N}.csv         # Temporary per-wave input (csv-wave only)
+-- role-instructions/   # Per-role instruction templates
|   +-- tester.md        # (test-pipeline)
|   +-- reproducer.md    # (debug-pipeline)
|   +-- analyzer.md
|   +-- fixer.md
|   +-- verifier.md
+-- artifacts/           # All deliverables
|   +-- TEST-001-report.md
|   +-- TEST-001-issues.json
|   +-- ANALYZE-001-rca.md
|   +-- FIX-001-changes.md
|   +-- VERIFY-001-report.md
+-- evidence/            # Browser evidence
|   +-- F-001-login-before.png
|   +-- F-001-login-after.png
|   +-- before-screenshot.png
|   +-- after-screenshot.png
|   +-- before-snapshot.txt
|   +-- after-snapshot.txt
|   +-- evidence-summary.json
+-- interactive/         # Interactive task artifacts
|   +-- {id}-result.json
+-- wisdom/              # Cross-task knowledge
    +-- learnings.md
```
---
## Worker Spawn Template
Coordinator spawns workers using this template:
```javascript
spawn_agent({
  agent_type: "team_worker",
  items: [
    { type: "text", text: `## Role Assignment
role: <role>
role_spec: <skill_root>/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
requirement: <task-description>
inner_loop: <true|false>
Read role_spec file (<skill_root>/roles/<role>/role.md) to load Phase 2-4 domain instructions.` },
    { type: "text", text: `## Task Context
task_id: <task-id>
title: <task-title>
description: <task-description>
pipeline_phase: <pipeline-phase>` },
    { type: "text", text: `## Upstream Context
<prev_context>` }
  ]
})
```
After spawning, use `wait_agent({ ids: [...], timeout_ms: 900000 })` to collect results, then `close_agent({ id })` each worker.
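Combining the template with the collection step, a coordinator dispatching one wave might look like the following sketch (written in this file's tool-call idiom; `waveTasks` and `skillRoot` are assumed to come from the surrounding phase logic, and this is not the verbatim coordinator implementation):

```
// Sketch: spawn all workers for a wave, then collect and release them
const ids = []
for (const task of waveTasks) {
  ids.push(spawn_agent({
    agent_type: "team_worker",
    items: [
      { type: "text", text: `## Role Assignment\nrole: ${task.role}\nrole_spec: ${skillRoot}/roles/${task.role}/role.md\ninner_loop: ${task.inner_loop}` },
      { type: "text", text: `## Task Context\ntask_id: ${task.id}\ntitle: ${task.title}\ndescription: ${task.description}` },
      { type: "text", text: `## Upstream Context\n${task.prev_context || ''}` }
    ]
  }))
}
const results = wait_agent({ ids, timeout_ms: 900000 })
for (const id of ids) close_agent({ id })
```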
## User Commands
| Command | Action |
|---------|--------|
| `check` / `status` | View execution status graph |
| `resume` / `continue` | Advance to next step |
| `revise <TASK-ID> [feedback]` | Revise specific task |
| `feedback <text>` | Inject feedback for revision |
| `retry <TASK-ID>` | Re-run a failed task |
## Completion Action
When pipeline completes, coordinator presents:
```javascript
request_user_input({
  questions: [{
    question: "Pipeline complete. What would you like to do?",
    header: "Completion",
    multiSelect: false,
    options: [
      { label: "Archive & Clean (Recommended)", description: "Archive session, clean up" },
      { label: "Keep Active", description: "Keep session for follow-up debugging" },
      { label: "Export Results", description: "Export debug report and patches" }
    ]
  }]
})
```
---
## Implementation
### Session Initialization
```javascript
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString()
const AUTO_YES = $ARGUMENTS.includes('--yes') || $ARGUMENTS.includes('-y')
const continueMode = $ARGUMENTS.includes('--continue')
const concurrencyMatch = $ARGUMENTS.match(/(?:--concurrency|-c)\s+(\d+)/)
const maxConcurrency = concurrencyMatch ? parseInt(concurrencyMatch[1]) : 2
const requirement = $ARGUMENTS
  .replace(/--yes|-y|--continue|--concurrency\s+\d+|-c\s+\d+/g, '')
  .trim()
const slug = requirement.toLowerCase()
  .replace(/[^a-z0-9\u4e00-\u9fa5]+/g, '-')
  .substring(0, 40)
const dateStr = getUtc8ISOString().substring(0, 10).replace(/-/g, '')
const sessionId = `tfd-${slug}-${dateStr}`
const sessionFolder = `.workflow/.csv-wave/${sessionId}`
Bash(`mkdir -p ${sessionFolder}/artifacts ${sessionFolder}/evidence ${sessionFolder}/role-instructions ${sessionFolder}/interactive ${sessionFolder}/wisdom`)
Write(`${sessionFolder}/discoveries.ndjson`, '')
Write(`${sessionFolder}/wisdom/learnings.md`, '# Debug Learnings\n')
```
---
## Specs Reference
- [specs/pipelines.md](specs/pipelines.md) — Pipeline definitions and task registry
- [specs/debug-tools.md](specs/debug-tools.md) — Chrome DevTools MCP usage patterns and evidence collection
---
### Phase 0: Pre-Wave Interactive (Input Analysis)
**Objective**: Parse user input, detect mode (test vs debug), extract parameters.
**Workflow**:
1. **Parse user input** from $ARGUMENTS
2. **Check for existing sessions** (continue mode):
- Scan `.workflow/.csv-wave/tfd-*/tasks.csv` for sessions with pending tasks
- If `--continue`: resume the specified or most recent session, skip to Phase 2
3. **Detect mode**:
| Input Pattern | Mode |
|---------------|------|
| Contains: feature, test, list, check, verify | `test-pipeline` |
| Contains: bug, error, crash, broken, not working | `debug-pipeline` |
| Ambiguous | request_user_input to clarify |
4. **Extract parameters by mode**:
**Test Mode**:
- `base_url`: URL in text or request_user_input
- `features`: Parse feature list (bullet points, numbered list, free text)
- Generate structured feature items with id, name, url
**Debug Mode**:
- `bug_description`: Bug description text
- `target_url`: URL in text or request_user_input
- `reproduction_steps`: Steps in text or request_user_input
- `evidence_plan`: Detect dimensions from keywords (UI, network, console, performance)
5. **Dimension Detection** (debug mode):
| Keywords | Dimension |
|----------|-----------|
| render, style, display, layout, CSS | screenshot, snapshot |
| request, API, network, timeout | network |
| error, crash, exception | console |
| slow, performance, lag, memory | performance |
| interaction, click, input, form | screenshot, console |
**Success Criteria**:
- Mode determined (test-pipeline or debug-pipeline)
- Base URL and features/steps extracted
- Evidence dimensions identified
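The keyword tables in steps 3 and 5 can be sketched as simple scans (the function names, abridged keyword lists, debug-before-test precedence, and fallback default are all assumptions, not the skill's actual code):

```javascript
// Step 3 sketch: pick a pipeline; debug keywords are checked first because
// a bug report often also contains the word "test". Returns null when ambiguous.
function detectPipeline(input) {
  const text = input.toLowerCase()
  const debugKw = ['bug', 'error', 'crash', 'broken', 'not working', 'white screen',
                   'performance', 'slow', 'latency', 'memory leak']
  const testKw = ['feature', 'test', 'list', 'check', 'verify', 'validate']
  if (debugKw.some(k => text.includes(k))) return 'debug-pipeline'
  if (testKw.some(k => text.includes(k))) return 'test-pipeline'
  return null // ambiguous -> request_user_input
}

// Step 5 sketch: map bug-description keywords to evidence dimensions.
function detectDimensions(desc) {
  const text = desc.toLowerCase()
  const rules = [
    { keywords: ['render', 'style', 'display', 'layout', 'css'], dims: ['screenshot', 'snapshot'] },
    { keywords: ['request', 'api', 'network', 'timeout'], dims: ['network'] },
    { keywords: ['error', 'crash', 'exception'], dims: ['console'] },
    { keywords: ['slow', 'performance', 'lag', 'memory'], dims: ['performance'] },
    { keywords: ['interaction', 'click', 'input', 'form'], dims: ['screenshot', 'console'] }
  ]
  const dims = new Set()
  for (const rule of rules)
    if (rule.keywords.some(k => text.includes(k))) rule.dims.forEach(d => dims.add(d))
  return dims.size ? [...dims] : ['screenshot', 'console'] // fallback default (assumption)
}
```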
---
### Phase 1: Requirement -> CSV + Classification
**Objective**: Build task dependency graph, generate tasks.csv and per-role instruction templates.
**Decomposition Rules**:
1. **Pipeline Definition**:
**Test Pipeline** (4 tasks, conditional):
```
TEST-001 -> [issues?] -> ANALYZE-001 -> FIX-001 -> VERIFY-001
|
+-- no issues -> Pipeline Complete (skip downstream)
```
**Debug Pipeline** (4 tasks, linear with iteration):
```
REPRODUCE-001 -> ANALYZE-001 -> FIX-001 -> VERIFY-001
^ |
| (if fail) |
+--- REPRODUCE-002 <-----+
```
2. **Task Description Template**: Every task uses PURPOSE/TASK/CONTEXT/EXPECTED/CONSTRAINTS format with session path, base URL, and upstream artifact references
3. **Role Instruction Generation**: Write per-role instruction templates to `role-instructions/{role}.md` using the base instruction template customized for each role
**Classification Rules**:
| Task Property | exec_mode |
|---------------|-----------|
| Feature testing (tester with inner loop) | `csv-wave` |
| Bug reproduction (single pass) | `csv-wave` |
| Root cause analysis (single pass) | `csv-wave` |
| Code fix (may need multiple passes) | `csv-wave` |
| Fix verification (single pass) | `csv-wave` |
| All standard pipeline tasks | `csv-wave` |
**Wave Computation**: Kahn's BFS topological sort with depth tracking.
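A sketch of that computation (assuming `deps` is the semicolon-separated column from the CSV schema; the cycle check doubles as the "no circular dependencies" gate in the success criteria below):

```javascript
// Sketch: assign each task a 1-based wave = longest dependency depth (Kahn's BFS).
// Throws on circular dependencies.
function computeWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]))
  const indegree = new Map(tasks.map(t => [t.id, 0]))
  const dependents = new Map(tasks.map(t => [t.id, []]))
  for (const t of tasks) {
    for (const dep of (t.deps || '').split(';').filter(Boolean)) {
      if (!byId.has(dep)) continue // unknown dep id: ignored in this sketch
      indegree.set(t.id, indegree.get(t.id) + 1)
      dependents.get(dep).push(t.id)
    }
  }
  // Tasks with no pending deps form wave 1; each traversed edge pushes depth down.
  const queue = tasks.filter(t => indegree.get(t.id) === 0).map(t => t.id)
  queue.forEach(id => { byId.get(id).wave = 1 })
  let visited = 0
  while (queue.length) {
    const id = queue.shift()
    visited++
    for (const next of dependents.get(id)) {
      byId.get(next).wave = Math.max(byId.get(next).wave || 0, byId.get(id).wave + 1)
      indegree.set(next, indegree.get(next) - 1)
      if (indegree.get(next) === 0) queue.push(next)
    }
  }
  if (visited !== tasks.length) throw new Error('Circular dependency detected')
  return tasks
}
```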
```javascript
// Generate per-role instruction templates
const roles = pipelineMode === 'test-pipeline'
? ['tester', 'analyzer', 'fixer', 'verifier']
: ['reproducer', 'analyzer', 'fixer', 'verifier']
for (const role of roles) {
const instruction = generateRoleInstruction(role, sessionFolder, pipelineMode)
Write(`${sessionFolder}/role-instructions/${role}.md`, instruction)
}
const tasks = buildTasksCsv(pipelineMode, requirement, sessionFolder, baseUrl, evidencePlan)
Write(`${sessionFolder}/tasks.csv`, toCsv(tasks))
Write(`${sessionFolder}/task-analysis.json`, JSON.stringify(analysisResult, null, 2))
```
**User Validation**: Display task breakdown (skip if AUTO_YES).
**Success Criteria**:
- tasks.csv created with valid schema and wave assignments
- Role instruction templates generated
- task-analysis.json written
- No circular dependencies
---
### Phase 2: Wave Execution Engine (Extended)
**Objective**: Execute tasks wave-by-wave with conditional skip, iteration loops, and re-fix cycles.
```javascript
const masterCsv = Read(`${sessionFolder}/tasks.csv`)
let tasks = parseCsv(masterCsv).map(t => ({ ...t, wave: parseInt(t.wave, 10) || 0 })) // CSV values are strings; wave must be numeric for the === comparisons below
let maxWave = Math.max(...tasks.map(t => t.wave))
let fixRound = 0
const MAX_FIX_ROUNDS = 3
const MAX_REPRODUCE_ROUNDS = 2
for (let wave = 1; wave <= maxWave; wave++) {
console.log(`\nWave ${wave}/${maxWave}`)
const waveTasks = tasks.filter(t => t.wave === wave && t.status === 'pending')
const csvTasks = waveTasks.filter(t => t.exec_mode === 'csv-wave')
const interactiveTasks = waveTasks.filter(t => t.exec_mode === 'interactive')
// Check dependencies -- skip tasks whose deps failed
for (const task of waveTasks) {
const depIds = (task.deps || '').split(';').filter(Boolean)
const depStatuses = depIds.map(id => tasks.find(t => t.id === id)?.status)
if (depStatuses.some(s => s === 'failed' || s === 'skipped')) {
task.status = 'skipped'
task.error = `Dependency failed: ${depIds.filter((id, i) =>
['failed','skipped'].includes(depStatuses[i])).join(', ')}`
}
}
// Execute pre-wave interactive tasks (if any)
for (const task of interactiveTasks.filter(t => t.status === 'pending')) {
// Determine agent file based on task type
const agentFile = task.id.includes('skip') ? 'agents/conditional-skip-gate.md'
: task.id.includes('iter') ? 'agents/iteration-handler.md'
: 'agents/completion-handler.md'
Read(agentFile)
const agent = spawn_agent({
message: `## TASK ASSIGNMENT\n\n### MANDATORY FIRST STEPS\n1. Read: ${agentFile}\n2. Read: ${sessionFolder}/discoveries.ndjson\n\nGoal: ${task.description}\nSession: ${sessionFolder}\n\n### Previous Context\n${buildPrevContext(task, tasks)}`
})
const result = wait({ ids: [agent], timeout_ms: 600000 })
if (result.timed_out) {
send_input({ id: agent, message: "Please finalize and output current findings." })
wait({ ids: [agent], timeout_ms: 120000 })
}
Write(`${sessionFolder}/interactive/${task.id}-result.json`, JSON.stringify({
task_id: task.id, status: "completed", findings: parseFindings(result),
timestamp: getUtc8ISOString()
}))
close_agent({ id: agent })
task.status = 'completed'
task.findings = parseFindings(result)
}
// Build prev_context for csv-wave tasks
const pendingCsvTasks = csvTasks.filter(t => t.status === 'pending')
for (const task of pendingCsvTasks) {
task.prev_context = buildPrevContext(task, tasks)
}
if (pendingCsvTasks.length > 0) {
Write(`${sessionFolder}/wave-${wave}.csv`, toCsv(pendingCsvTasks))
const waveInstruction = buildWaveInstruction(pendingCsvTasks, sessionFolder, wave)
spawn_agents_on_csv({
csv_path: `${sessionFolder}/wave-${wave}.csv`,
id_column: "id",
instruction: waveInstruction,
max_concurrency: maxConcurrency,
max_runtime_seconds: 1200,
output_csv_path: `${sessionFolder}/wave-${wave}-results.csv`,
output_schema: {
type: "object",
properties: {
id: { type: "string" },
status: { type: "string", enum: ["completed", "failed"] },
findings: { type: "string" },
artifacts_produced: { type: "string" },
issues_count: { type: "string" },
verdict: { type: "string" },
error: { type: "string" }
}
}
})
// Merge results into master CSV
const results = parseCsv(Read(`${sessionFolder}/wave-${wave}-results.csv`))
for (const r of results) {
const t = tasks.find(t => t.id === r.id)
if (t) Object.assign(t, r)
}
// Conditional Skip: TEST-001 with 0 issues
const testResult = results.find(r => r.id === 'TEST-001')
if (testResult && parseInt(testResult.issues_count || '0') === 0) {
// Skip all downstream tasks
tasks.filter(t => t.wave > wave && t.status === 'pending').forEach(t => {
t.status = 'skipped'
t.error = 'No issues found in testing -- skipped'
})
console.log('All features passed. No issues found. Pipeline complete.')
}
// Iteration: Analyzer needs more evidence
const analyzerResult = results.find(r => r.id.startsWith('ANALYZE') && r.findings?.includes('need_more_evidence'))
if (analyzerResult) {
const reproduceRound = tasks.filter(t => t.id.startsWith('REPRODUCE')).length
if (reproduceRound < MAX_REPRODUCE_ROUNDS) {
const newRepId = `REPRODUCE-${String(reproduceRound + 1).padStart(3, '0')}`
const newAnalyzeId = `ANALYZE-${String(tasks.filter(t => t.id.startsWith('ANALYZE')).length + 1).padStart(3, '0')}`
tasks.push({
id: newRepId, title: 'Supplemental evidence collection',
description: `PURPOSE: Collect additional evidence per Analyzer request | Success: Targeted evidence collected`,
role: 'reproducer', pipeline_mode: tasks[0].pipeline_mode,
base_url: tasks[0].base_url, evidence_dimensions: tasks[0].evidence_dimensions,
deps: '', context_from: analyzerResult.id,
exec_mode: 'csv-wave', wave: wave + 1, status: 'pending',
findings: '', artifacts_produced: '', issues_count: '', verdict: '', error: ''
})
tasks.push({
id: newAnalyzeId, title: 'Re-analysis with supplemental evidence',
description: `PURPOSE: Re-analyze with additional evidence | Success: Higher-confidence RCA`,
role: 'analyzer', pipeline_mode: tasks[0].pipeline_mode,
base_url: '', evidence_dimensions: '',
deps: newRepId, context_from: `${analyzerResult.id};${newRepId}`,
exec_mode: 'csv-wave', wave: wave + 2, status: 'pending',
findings: '', artifacts_produced: '', issues_count: '', verdict: '', error: ''
})
// Update FIX task deps
const fixTask = tasks.find(t => t.id === 'FIX-001' && t.status === 'pending')
if (fixTask) fixTask.deps = newAnalyzeId
}
}
// Re-fix: Verifier verdict = fail
const verifyResult = results.find(r => r.id.startsWith('VERIFY') && r.verdict === 'fail')
if (verifyResult && fixRound < MAX_FIX_ROUNDS) {
fixRound++
const newFixId = `FIX-${String(fixRound + 1).padStart(3, '0')}`
const newVerifyId = `VERIFY-${String(fixRound + 1).padStart(3, '0')}`
tasks.push({
id: newFixId, title: `Re-fix (round ${fixRound + 1})`,
description: `PURPOSE: Re-fix based on verification failure | Success: Issue resolved`,
role: 'fixer', pipeline_mode: tasks[0].pipeline_mode,
base_url: '', evidence_dimensions: '',
deps: verifyResult.id, context_from: verifyResult.id,
exec_mode: 'csv-wave', wave: wave + 1, status: 'pending',
findings: '', artifacts_produced: '', issues_count: '', verdict: '', error: ''
})
tasks.push({
id: newVerifyId, title: `Re-verify (round ${fixRound + 1})`,
description: `PURPOSE: Re-verify after fix | Success: Bug resolved`,
role: 'verifier', pipeline_mode: tasks[0].pipeline_mode,
base_url: tasks[0].base_url, evidence_dimensions: tasks[0].evidence_dimensions,
deps: newFixId, context_from: newFixId,
exec_mode: 'csv-wave', wave: wave + 2, status: 'pending',
findings: '', artifacts_produced: '', issues_count: '', verdict: '', error: ''
})
}
}
// Update master CSV
Write(`${sessionFolder}/tasks.csv`, toCsv(tasks))
// Cleanup temp files
Bash(`rm -f ${sessionFolder}/wave-${wave}.csv ${sessionFolder}/wave-${wave}-results.csv`)
// Recalculate maxWave (may have grown from iteration/re-fix)
maxWave = Math.max(maxWave, ...tasks.map(t => t.wave))
// Display wave summary
const completed = waveTasks.filter(t => t.status === 'completed').length
const failed = waveTasks.filter(t => t.status === 'failed').length
const skipped = waveTasks.filter(t => t.status === 'skipped').length
console.log(`Wave ${wave} Complete: ${completed} completed, ${failed} failed, ${skipped} skipped`)
}
```
**Success Criteria**:
- All waves executed in order
- Conditional skip handled (TEST with 0 issues)
- Iteration loops handled (analyzer need_more_evidence)
- Re-fix cycles handled (verifier fail verdict)
- discoveries.ndjson accumulated across all waves
- Max iteration/fix bounds respected
## Session Directory
```
.workflow/.team/TFD-<slug>-<date>/
├── team-session.json     # Session state + role registry
├── evidence/             # Screenshots, snapshots, network logs
├── artifacts/            # Test reports, RCA reports, patches, verification reports
├── wisdom/               # Cross-task debug knowledge
└── .msg/                 # Team message bus
```
---
### Phase 3: Post-Wave Interactive (Completion Action)
**Objective**: Pipeline completion report with debug summary.
```javascript
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const completed = tasks.filter(t => t.status === 'completed')
const pipelineMode = tasks[0]?.pipeline_mode
console.log(`
============================================
FRONTEND DEBUG COMPLETE
Pipeline: ${pipelineMode} | ${completed.length}/${tasks.length} tasks
Fix Rounds: ${fixRound}/${MAX_FIX_ROUNDS}
Session: ${sessionFolder}
Results:
${completed.map(t => ` [DONE] ${t.id} (${t.role}): ${t.findings?.substring(0, 80) || 'completed'}`).join('\n')}
============================================
`)
if (!AUTO_YES) {
request_user_input({
questions: [{
question: "Debug pipeline complete. Choose next action.",
header: "Done",
id: "completion",
options: [
{ label: "Archive (Recommended)", description: "Archive session, output final summary" },
{ label: "Keep Active", description: "Keep session for follow-up debugging" },
{ label: "Export Results", description: "Export debug report and patches" }
]
}]
})
}
```
**Success Criteria**:
- User informed of debug pipeline results
- Completion action taken
---
### Phase 4: Results Aggregation
**Objective**: Generate final results and human-readable report.
```javascript
Bash(`cp ${sessionFolder}/tasks.csv ${sessionFolder}/results.csv`)
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
let contextMd = `# Frontend Debug Report\n\n`
contextMd += `**Session**: ${sessionId}\n`
contextMd += `**Pipeline**: ${tasks[0]?.pipeline_mode}\n`
contextMd += `**Date**: ${getUtc8ISOString().substring(0, 10)}\n\n`
contextMd += `## Summary\n`
contextMd += `| Status | Count |\n|--------|-------|\n`
contextMd += `| Completed | ${tasks.filter(t => t.status === 'completed').length} |\n`
contextMd += `| Failed | ${tasks.filter(t => t.status === 'failed').length} |\n`
contextMd += `| Skipped | ${tasks.filter(t => t.status === 'skipped').length} |\n\n`
const maxWave = Math.max(...tasks.map(t => t.wave))
contextMd += `## Wave Execution\n\n`
for (let w = 1; w <= maxWave; w++) {
const waveTasks = tasks.filter(t => t.wave === w)
contextMd += `### Wave ${w}\n\n`
for (const t of waveTasks) {
const icon = t.status === 'completed' ? '[DONE]' : t.status === 'failed' ? '[FAIL]' : '[SKIP]'
contextMd += `${icon} **${t.title}** [${t.role}]`
if (t.verdict) contextMd += ` Verdict: ${t.verdict}`
if (t.issues_count) contextMd += ` Issues: ${t.issues_count}`
contextMd += ` ${t.findings || ''}\n\n`
}
}
// Debug-specific sections
const verifyTasks = tasks.filter(t => t.role === 'verifier' && t.verdict)
if (verifyTasks.length > 0) {
contextMd += `## Verification Results\n\n`
for (const v of verifyTasks) {
contextMd += `- **${v.id}**: ${v.verdict}\n`
}
}
Write(`${sessionFolder}/context.md`, contextMd)
console.log(`Results exported to: ${sessionFolder}/results.csv`)
console.log(`Report generated at: ${sessionFolder}/context.md`)
```
**Success Criteria**:
- results.csv exported
- context.md generated with debug summary
- Summary displayed to user
---
## Shared Discovery Board Protocol
All agents share a single `discoveries.ndjson` file.
**Format**: One JSON object per line (NDJSON):
```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"TEST-001","type":"feature_tested","data":{"feature":"F-001","name":"Login","result":"fail","issues":2}}
{"ts":"2026-03-08T10:05:00Z","worker":"REPRODUCE-001","type":"bug_reproduced","data":{"url":"/settings","steps":3,"console_errors":2,"network_failures":1}}
{"ts":"2026-03-08T10:10:00Z","worker":"ANALYZE-001","type":"root_cause_found","data":{"category":"TypeError","file":"src/components/Settings.tsx","line":142,"confidence":"high"}}
{"ts":"2026-03-08T10:15:00Z","worker":"FIX-001","type":"file_modified","data":{"file":"src/components/Settings.tsx","change":"Added null check","lines_added":3}}
{"ts":"2026-03-08T10:20:00Z","worker":"VERIFY-001","type":"verification_result","data":{"verdict":"pass","original_error_resolved":true,"new_errors":0}}
```
**Discovery Types**:
| Type | Data Schema | Description |
|------|-------------|-------------|
| `feature_tested` | `{feature, name, result, issues}` | Feature test result |
| `bug_reproduced` | `{url, steps, console_errors, network_failures}` | Bug reproduction result |
| `evidence_collected` | `{dimension, file, description}` | Evidence artifact saved |
| `root_cause_found` | `{category, file, line, confidence}` | Root cause identified |
| `file_modified` | `{file, change, lines_added}` | Code fix applied |
| `verification_result` | `{verdict, original_error_resolved, new_errors}` | Fix verification result |
| `issue_found` | `{file, line, severity, description}` | Issue discovered |
**Protocol**:
1. Agents MUST read discoveries.ndjson at start of execution
2. Agents MUST append relevant discoveries during execution
3. Agents MUST NOT modify or delete existing entries
4. Deduplication by `{type, data.file}` key
---
## Conditional Skip Logic
After TEST-001 completes, evaluate issues:
| Condition | Action |
|-----------|--------|
| `issues_count === 0` | Skip ANALYZE/FIX/VERIFY. Pipeline complete with all-pass. |
| Only low-severity warnings | request_user_input: fix warnings or complete |
| High/medium severity issues | Proceed with ANALYZE -> FIX -> VERIFY |
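The table above reduces to a small decision function (a sketch; issue objects are assumed to carry a `severity` field of `high | medium | low | warning`, matching TEST-001-issues.json):

```javascript
// Map TEST-001 issues to the skip decision in the table above.
function skipDecision(issues) {
  if (issues.length === 0) return 'all_pass';   // skip ANALYZE/FIX/VERIFY
  const severe = issues.filter(i => i.severity === 'high' || i.severity === 'medium');
  if (severe.length === 0) return 'ask_user';   // only warnings: request_user_input
  return 'proceed';                             // run ANALYZE -> FIX -> VERIFY
}
```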
---
## Iteration Rules
| Trigger | Condition | Action | Max |
|---------|-----------|--------|-----|
| Analyzer -> Reproducer | Confidence < 50% | Create REPRODUCE-002 -> ANALYZE-002 | 2 reproduction rounds |
| Verifier -> Fixer | Verdict = fail | Create FIX-002 -> VERIFY-002 | 3 fix rounds |
| Max iterations reached | Round >= max | Report to user for manual intervention | -- |
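A sketch of how the coordinator might enforce these bounds before dispatching another round (round counters are assumed to be tracked by the coordinator; task-id padding assumes rounds stay single-digit, which the bounds guarantee):

```javascript
const MAX_REPRO_ROUNDS = 2;
const MAX_FIX_ROUNDS = 3;

// Decide the next iteration step from a trigger and the current round count.
function nextIteration(trigger, round) {
  if (trigger === 'need_more_evidence') {
    return round < MAX_REPRO_ROUNDS
      ? { action: 'spawn', tasks: [`REPRODUCE-00${round + 1}`, `ANALYZE-00${round + 1}`] }
      : { action: 'escalate' }; // max reproduction rounds reached
  }
  if (trigger === 'verify_fail') {
    return round < MAX_FIX_ROUNDS
      ? { action: 'spawn', tasks: [`FIX-00${round + 1}`, `VERIFY-00${round + 1}`] }
      : { action: 'escalate' }; // max fix rounds reached
  }
  return { action: 'none' };
}
```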
---
## Error Handling
| Error | Resolution |
|-------|------------|
| Circular dependency | Detect in wave computation, abort with error message |
| CSV agent timeout | Mark as failed in results, continue with wave |
| CSV agent failed | Mark as failed, skip dependent tasks in later waves |
| Interactive agent timeout | Urge convergence via send_input, then close if still timed out |
| All agents in wave failed | Log error, offer retry or abort |
| CSV parse error | Validate CSV format before execution, show line number |
| discoveries.ndjson corrupt | Ignore malformed lines, continue with valid entries |

| Scenario | Resolution |
|----------|------------|
| All features pass test | Report success, pipeline completes without ANALYZE/FIX/VERIFY |
| Bug not reproducible | Reproducer reports failure, coordinator asks user for more details |
| Browser not available | Report error, suggest manual reproduction steps |
| Analysis inconclusive | Request more evidence via iteration loop |
| Fix introduces regression | Verifier reports fail, dispatch re-fix |
| Max iterations reached | Escalate to user for manual intervention |
| Continue mode: no session found | List available sessions, prompt user to select |
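The circular-dependency check mentioned above can be done with the same fixed-point pass that computes waves (a sketch; tasks are assumed to be parsed from tasks.csv into `{id, deps}` objects, and deps on unknown ids are treated as already satisfied):

```javascript
// Return true if the task graph contains a cycle (no wave can make progress).
function hasCycle(tasks) {
  const deps = new Map(tasks.map(t => [t.id, new Set(t.deps)]));
  const remaining = new Set(deps.keys());
  while (remaining.size > 0) {
    // Tasks whose remaining deps are all resolved form the next wave.
    const ready = [...remaining].filter(id =>
      [...deps.get(id)].every(d => !remaining.has(d)));
    if (ready.length === 0) return true; // no progress possible: cycle
    ready.forEach(id => remaining.delete(id));
  }
  return false;
}
```

When this returns true, the coordinator aborts with an error message instead of starting wave execution.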
---
## Core Rules
1. **Start Immediately**: First action is session initialization, then Phase 0/1
2. **Wave Order is Sacred**: Never execute wave N before wave N-1 completes and results are merged
3. **CSV is Source of Truth**: Master tasks.csv holds all state (both csv-wave and interactive)
4. **CSV First**: Default to csv-wave for tasks; only use interactive when interaction pattern requires it
5. **Context Propagation**: prev_context built from master CSV, not from memory
6. **Discovery Board is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
7. **Skip on Failure**: If a dependency failed, skip the dependent task
8. **Conditional Skip**: If TEST finds 0 issues, skip all downstream tasks
9. **Iteration Bounds**: Max 2 reproduction rounds, max 3 fix rounds
10. **Cleanup Temp Files**: Remove wave-{N}.csv after results are merged
11. **DO NOT STOP**: Continuous execution until all waves complete or all remaining tasks are skipped
---
## Coordinator Role Constraints (Main Agent)
**CRITICAL**: The coordinator (main agent executing this skill) is responsible for **orchestration only**, NOT implementation.
12. **Coordinator Does NOT Execute Code**: The main agent MUST NOT write, modify, or implement any code directly. All implementation work is delegated to spawned team agents. The coordinator only:
    - Spawns agents with task assignments
    - Waits for agent callbacks
    - Merges results and coordinates workflow
    - Manages workflow transitions between phases
13. **Patient Waiting is Mandatory**: Agent execution takes significant time (typically 10-30 minutes per phase, sometimes longer). The coordinator MUST:
    - Wait patiently for `wait_agent()` calls to complete
    - NOT skip workflow steps due to perceived delays
    - NOT assume agents have failed just because they're taking time
    - Trust the timeout mechanisms defined in the skill
14. **Use send_input for Clarification**: When agents need guidance or appear stuck, the coordinator MUST:
    - Use `send_input()` to ask questions or provide clarification
    - NOT skip the agent or move to the next phase prematurely
    - Give agents an opportunity to respond before escalating
    - Example: `send_input({ id: agent_id, message: "Please provide status update or clarify blockers" })`
15. **No Workflow Shortcuts**: The coordinator MUST NOT:
    - Skip phases or stages defined in the workflow
    - Bypass required approval or review steps
    - Execute dependent tasks before prerequisites complete
    - Assume task completion without explicit agent callback
    - Make up or fabricate agent results
16. **Respect Long-Running Processes**: This is a complex multi-agent workflow that requires patience:
    - Total execution time may range from 30-90 minutes or longer
    - Each phase may take 10-30 minutes depending on complexity
    - The coordinator must remain active and attentive throughout the entire process
    - Do not terminate or skip steps due to time concerns
---
## Command Errors
| Scenario | Resolution |
|----------|------------|
| Unknown command | Error with available command list |
| Role not found | Error with role registry |

@@ -1,142 +0,0 @@
# Completion Handler Agent
Interactive agent for handling pipeline completion action. Presents debug summary and offers Archive/Keep/Export choices.
## Identity
- **Type**: `interactive`
- **Role File**: `agents/completion-handler.md`
- **Responsibility**: Present debug pipeline results, handle completion choice, execute cleanup or export
## Boundaries
### MUST
- Load role definition via MANDATORY FIRST STEPS pattern
- Read all task results from master CSV
- Present debug summary (reproduction, RCA, fix, verification)
- Wait for user choice before acting
- Produce structured output following template
### MUST NOT
- Skip the MANDATORY FIRST STEPS role loading
- Delete session files without user approval
- Modify task artifacts
- Produce unstructured output
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Read` | built-in | Load task results and artifacts |
| `request_user_input` | built-in | Get user completion choice |
| `Write` | built-in | Store completion result |
| `Bash` | built-in | Execute archive/export operations |
---
## Execution
### Phase 1: Results Loading
**Objective**: Load all task results and build debug summary
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| tasks.csv | Yes | Master state with all task results |
| Artifact files | No | Verify deliverables exist |
**Steps**:
1. Read master tasks.csv
2. Parse all completed tasks and their artifacts
3. Build debug summary:
- Bug description and reproduction results
- Root cause analysis findings
- Files modified and patches applied
- Verification results (pass/fail)
- Evidence inventory (screenshots, logs, traces)
4. Calculate pipeline statistics
**Output**: Debug summary ready for user
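Step 4 can be sketched as a small aggregation over the parsed CSV rows (assuming each row carries a `status` of `completed | failed | skipped`):

```javascript
// Compute pipeline statistics for the debug summary.
function pipelineStats(tasks) {
  const count = s => tasks.filter(t => t.status === s).length;
  return {
    total: tasks.length,
    completed: count('completed'),
    failed: count('failed'),
    skipped: count('skipped'),
  };
}
```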
---
### Phase 2: Completion Choice
**Objective**: Present debug results and get user action
**Steps**:
1. Display pipeline summary with debug details
2. Present completion choice:
```javascript
request_user_input({
questions: [{
question: "Debug pipeline complete. What would you like to do?",
header: "Completion",
id: "completion_action",
options: [
{ label: "Archive & Clean (Recommended)", description: "Archive session, output final summary" },
{ label: "Keep Active", description: "Keep session for follow-up debugging" },
{ label: "Export Results", description: "Export debug report and patches" }
]
}]
})
```
3. Handle response:
| Response | Action |
|----------|--------|
| Archive & Clean | Mark session completed, output final summary |
| Keep Active | Mark session paused, keep all evidence/artifacts |
| Export Results | Copy RCA report, fix changes, verification report to project directory |
**Output**: Completion action result
---
## Structured Output Template
```
## Summary
- Pipeline mode: <test-pipeline|debug-pipeline>
- Tasks completed: <count>/<total>
- Fix rounds: <count>/<max>
- Final verdict: <pass|pass_with_warnings|fail>
## Debug Summary
- Bug: <description>
- Root cause: <category at file:line>
- Fix: <description of changes>
- Verification: <pass/fail>
## Evidence Inventory
- Screenshots: <count>
- Console logs: <captured/not captured>
- Network logs: <captured/not captured>
- Performance trace: <captured/not captured>
## Action Taken
- Choice: <archive|keep|export>
- Session status: <completed|paused|exported>
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| tasks.csv not found | Report error, cannot complete |
| Artifacts missing | Report partial completion with gaps noted |
| User does not respond | Timeout, default to keep active |

@@ -1,130 +0,0 @@
# Conditional Skip Gate Agent
Interactive agent for evaluating TEST-001 results and determining whether to skip downstream tasks (ANALYZE, FIX, VERIFY) when no issues are found.
## Identity
- **Type**: `interactive`
- **Role File**: `agents/conditional-skip-gate.md`
- **Responsibility**: Read TEST results, evaluate issue severity, decide skip/proceed
## Boundaries
### MUST
- Load role definition via MANDATORY FIRST STEPS pattern
- Read the TEST-001 issues JSON
- Evaluate issue count and severity distribution
- Apply conditional skip logic
- Present decision to user when only warnings exist
- Produce structured output following template
### MUST NOT
- Skip the MANDATORY FIRST STEPS role loading
- Auto-skip when high/medium issues exist
- Modify test artifacts directly
- Produce unstructured output
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Read` | built-in | Load test results and issues |
| `request_user_input` | built-in | Get user decision on warnings |
| `Write` | built-in | Store gate decision result |
---
## Execution
### Phase 1: Load Test Results
**Objective**: Load TEST-001 issues and evaluate severity
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| TEST-001-issues.json | Yes | Discovered issues with severity |
| TEST-001-report.md | No | Full test report |
**Steps**:
1. Extract session path from task assignment
2. Read TEST-001-issues.json
3. Parse issues array
4. Count by severity: high, medium, low, warning
**Output**: Issue severity distribution
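Step 4 amounts to a single pass over the issues array (a sketch; severities outside the known set are tallied under `other` rather than dropped):

```javascript
// Count TEST-001 issues by severity for the gate decision.
function severityCounts(issues) {
  const counts = { high: 0, medium: 0, low: 0, warning: 0, other: 0 };
  for (const i of issues) {
    counts[i.severity in counts ? i.severity : 'other'] += 1;
  }
  return counts;
}
```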
---
### Phase 2: Skip Decision
**Objective**: Apply conditional skip logic
**Steps**:
1. Evaluate issues:
| Condition | Action |
|-----------|--------|
| `issues.length === 0` | Skip all downstream. Report "all_pass". |
| Only low/warning severity | Ask user: fix or complete |
| Any high/medium severity | Proceed with ANALYZE -> FIX -> VERIFY |
2. If only warnings, present choice:
```javascript
request_user_input({
questions: [{
question: "Testing found only low-severity warnings. How would you like to proceed?",
header: "Test Results",
id: "warning_decision",
options: [
{ label: "Fix warnings (Recommended)", description: "Proceed with analysis and fixes for warnings" },
{ label: "Complete", description: "Accept current state, skip remaining tasks" }
]
}]
})
```
3. Handle response and record decision
**Output**: Skip/proceed directive
---
## Structured Output Template
```
## Summary
- Test report evaluated: TEST-001
- Issues found: <total>
- High: <count>, Medium: <count>, Low: <count>, Warning: <count>
- Decision: <all_pass|skip_warnings|proceed>
## Findings
- All features tested: <count>
- Pass rate: <percentage>
## Decision Details
- Action: <skip-downstream|proceed-with-fixes>
- Downstream tasks affected: ANALYZE-001, FIX-001, VERIFY-001
- User choice: <if applicable>
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| TEST-001-issues.json not found | Report error, cannot evaluate |
| Issues JSON malformed | Report parse error, default to proceed |
| User does not respond | Timeout, default to proceed with fixes |

@@ -1,120 +0,0 @@
# Iteration Handler Agent
Interactive agent for handling the analyzer's request for more evidence. Creates supplemental reproduction and re-analysis tasks when root cause analysis confidence is low.
## Identity
- **Type**: `interactive`
- **Role File**: `agents/iteration-handler.md`
- **Responsibility**: Parse analyzer evidence request, create REPRODUCE-002 + ANALYZE-002 tasks, update dependency chain
## Boundaries
### MUST
- Load role definition via MANDATORY FIRST STEPS pattern
- Read the analyzer's need_more_evidence request
- Parse specific evidence dimensions and actions requested
- Create supplemental reproduction task description
- Create re-analysis task description
- Update FIX dependency to point to new ANALYZE task
- Produce structured output following template
### MUST NOT
- Skip the MANDATORY FIRST STEPS role loading
- Ignore the analyzer's specific requests
- Create tasks beyond iteration bounds (max 2 reproduction rounds)
- Modify existing task artifacts
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Read` | built-in | Load analyzer output and session state |
| `Write` | built-in | Store iteration handler result |
---
## Execution
### Phase 1: Parse Evidence Request
**Objective**: Understand what additional evidence the analyzer needs
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| Analyzer findings | Yes | Contains need_more_evidence with specifics |
| Session state | No | Current iteration count |
**Steps**:
1. Extract session path from task assignment
2. Read analyzer's findings or RCA report (partial)
3. Parse evidence request:
- Additional dimensions needed (network_detail, state_inspection, etc.)
- Specific actions (capture request body, evaluate React state, etc.)
4. Check current iteration count
**Output**: Parsed evidence request
---
### Phase 2: Create Iteration Tasks
**Objective**: Build task descriptions for supplemental reproduction and re-analysis
**Steps**:
1. Check iteration bounds:
| Condition | Action |
|-----------|--------|
| Reproduction rounds < 2 | Create REPRODUCE-002 + ANALYZE-002 |
| Reproduction rounds >= 2 | Escalate to user for manual investigation |
2. Build REPRODUCE-002 description with specific evidence requests from analyzer
3. Build ANALYZE-002 description that loads both original and supplemental evidence
4. Record new tasks and dependency updates
**Output**: Task descriptions for dynamic wave extension
---
## Structured Output Template
```
## Summary
- Analyzer evidence request processed
- Iteration round: <current>/<max>
- Action: <create-reproduction|escalate>
## Evidence Request
- Dimensions needed: <list>
- Specific actions: <list>
## Tasks Created
- REPRODUCE-002: <description summary>
- ANALYZE-002: <description summary>
## Dependency Updates
- FIX-001 deps updated: ANALYZE-001 -> ANALYZE-002
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Evidence request unclear | Use all default dimensions |
| Max iterations reached | Escalate to user |
| Session state missing | Default to iteration round 1 |

@@ -1,272 +0,0 @@
# Agent Instruction Template -- Team Frontend Debug
Base instruction template for CSV wave agents. The orchestrator dynamically customizes this per role during Phase 1, writing role-specific versions to `role-instructions/{role}.md`.
## Purpose
| Phase | Usage |
|-------|-------|
| Phase 1 | Coordinator generates per-role instruction from this template |
| Phase 2 | Injected as `instruction` parameter to `spawn_agents_on_csv` |
---
## Base Instruction Template
```markdown
## TASK ASSIGNMENT -- Team Frontend Debug
### MANDATORY FIRST STEPS
1. Read shared discoveries: <session-folder>/discoveries.ndjson (if exists, skip if not)
2. Read project context: .workflow/project-tech.json (if exists)
---
## Your Task
**Task ID**: {id}
**Title**: {title}
**Role**: {role}
**Pipeline Mode**: {pipeline_mode}
**Base URL**: {base_url}
**Evidence Dimensions**: {evidence_dimensions}
### Task Description
{description}
### Previous Tasks' Findings (Context)
{prev_context}
---
## Execution Protocol
1. **Read discoveries**: Load <session-folder>/discoveries.ndjson for shared exploration findings
2. **Use context**: Apply previous tasks' findings from prev_context above
3. **Execute task**: Follow role-specific instructions below
4. **Share discoveries**: Append exploration findings to shared board:
```bash
echo '{"ts":"<ISO8601>","worker":"{id}","type":"<type>","data":{...}}' >> <session-folder>/discoveries.ndjson
```
5. **Report result**: Return JSON via report_agent_job_result
### Discovery Types to Share
- `feature_tested`: {feature, name, result, issues} -- Feature test result
- `bug_reproduced`: {url, steps, console_errors, network_failures} -- Bug reproduction outcome
- `evidence_collected`: {dimension, file, description} -- Evidence artifact saved
- `root_cause_found`: {category, file, line, confidence} -- Root cause identified
- `file_modified`: {file, change, lines_added} -- Code fix applied
- `verification_result`: {verdict, original_error_resolved, new_errors} -- Verification outcome
- `issue_found`: {file, line, severity, description} -- Issue discovered
---
## Output (report_agent_job_result)
Return JSON:
{
"id": "{id}",
"status": "completed" | "failed",
"findings": "Key discoveries and implementation notes (max 500 chars)",
"artifacts_produced": "semicolon-separated paths of produced files",
"issues_count": "",
"verdict": "",
"error": ""
}
```
---
## Role-Specific Customization
The coordinator generates per-role instruction variants during Phase 1.
### For Tester Role (test-pipeline)
```
3. **Execute**:
- Parse feature list from task description
- For each feature:
a. Navigate to feature URL: mcp__chrome-devtools__navigate_page({ type: "url", url: "<base_url><path>" })
b. Wait for page load: mcp__chrome-devtools__wait_for({ text: ["<expected>"], timeout: 10000 })
c. Explore page structure: mcp__chrome-devtools__take_snapshot()
d. Generate test scenarios from UI elements if not predefined
e. Capture baseline: take_screenshot (before), list_console_messages
f. Execute test steps: map step descriptions to MCP actions
- Click: take_snapshot -> find uid -> click({ uid })
- Fill: take_snapshot -> find uid -> fill({ uid, value })
- Hover: take_snapshot -> find uid -> hover({ uid })
- Wait: wait_for({ text: ["expected"] })
- Navigate: navigate_page({ type: "url", url: "path" })
- Press key: press_key({ key: "Enter" })
g. Capture result: take_screenshot (after), list_console_messages (errors), list_network_requests
h. Evaluate: console errors? network failures? expected text present? visual issues?
i. Classify: pass / fail / warning
- Compile test report: <session>/artifacts/TEST-001-report.md
- Compile issues list: <session>/artifacts/TEST-001-issues.json
- Set issues_count in output
```
### For Reproducer Role (debug-pipeline)
```
3. **Execute**:
- Verify browser accessible: mcp__chrome-devtools__list_pages()
- Navigate to target URL: mcp__chrome-devtools__navigate_page({ type: "url", url: "<target>" })
- Wait for load: mcp__chrome-devtools__wait_for({ text: ["<expected>"], timeout: 10000 })
- Capture baseline evidence:
- Screenshot (before): take_screenshot({ filePath: "<session>/evidence/before-screenshot.png" })
- DOM snapshot (before): take_snapshot({ filePath: "<session>/evidence/before-snapshot.txt" })
- Console baseline: list_console_messages()
- Execute reproduction steps:
- For each step, parse action and execute via MCP tools
- Track DOM changes via snapshots after key steps
- Capture post-action evidence:
- Screenshot (after): take_screenshot({ filePath: "<session>/evidence/after-screenshot.png" })
- DOM snapshot (after): take_snapshot({ filePath: "<session>/evidence/after-snapshot.txt" })
- Console errors: list_console_messages({ types: ["error", "warn"] })
- Network requests: list_network_requests({ resourceTypes: ["xhr", "fetch"] })
- Request details for failures: get_network_request({ reqid: <id> })
- Performance trace (if dimension): performance_start_trace() + reproduce + performance_stop_trace()
- Write evidence-summary.json to <session>/evidence/
```
### For Analyzer Role
```
3. **Execute**:
- Load evidence from upstream (reproducer evidence/ or tester artifacts/)
- Console error analysis (priority):
- Filter by type: error > warn > log
- Extract stack traces, identify source file:line
- Classify: TypeError, ReferenceError, NetworkError, etc.
- Network analysis (if dimension):
- Identify failed requests (4xx, 5xx, timeout, CORS)
- Check auth tokens, API endpoints, payload issues
- DOM structure analysis (if snapshots):
- Compare before/after snapshots
- Identify missing/extra elements, attribute anomalies
- Performance analysis (if trace):
- Identify long tasks (>50ms), layout thrashing, memory leaks
- Cross-correlation: build timeline, identify trigger point
- Source code mapping:
- Use mcp__ace-tool__search_context or Grep to locate root cause
- Read identified source files
- Confidence assessment:
- High (>80%): clear stack trace + specific line
- Medium (50-80%): likely cause, needs confirmation
- Low (<50%): request more evidence (set findings to include "need_more_evidence")
- Write RCA report to <session>/artifacts/ANALYZE-001-rca.md
- Set issues_count in output
```
### For Fixer Role
```
3. **Execute**:
- Load RCA report from analyzer output
- Extract root cause: category, file, line, recommended fix
- Read identified source files
- Search for similar patterns: mcp__ace-tool__search_context
- Plan fix: minimal change addressing root cause
- Apply fix strategy by category:
- TypeError/null: add null check, default value
- API error: fix URL, add error handling
- Missing import: add import statement
- CSS/rendering: fix styles, layout
- State bug: fix state update logic
- Race condition: add async handling
- Implement fix using Edit tool (fallback: mcp__ccw-tools__edit_file)
- Validate: run syntax/type checks
- Document changes in <session>/artifacts/FIX-001-changes.md
```
### For Verifier Role
```
3. **Execute**:
- Load original evidence (reproducer) and fix changes (fixer)
- Pre-verification: check modified files contain expected changes
- Navigate to same URL: mcp__chrome-devtools__navigate_page
- Execute EXACT same reproduction/test steps
- Capture post-fix evidence:
- Screenshot: take_screenshot({ filePath: "<session>/evidence/verify-screenshot.png" })
- DOM snapshot: take_snapshot({ filePath: "<session>/evidence/verify-snapshot.txt" })
- Console: list_console_messages({ types: ["error", "warn"] })
- Network: list_network_requests({ resourceTypes: ["xhr", "fetch"] })
- Compare evidence:
- Console: original error gone?
- Network: failed request now succeeds?
- Visual: expected rendering achieved?
- New errors: any regression?
- Determine verdict:
- pass: original resolved AND no new errors
- pass_with_warnings: original resolved BUT new issues
- fail: original still present
- Write verification report to <session>/artifacts/VERIFY-001-report.md
- Set verdict in output
```
---
## Chrome DevTools MCP Reference
### Common Patterns
**Navigate and Wait**:
```
mcp__chrome-devtools__navigate_page({ type: "url", url: "<url>" })
mcp__chrome-devtools__wait_for({ text: ["<expected>"], timeout: 10000 })
```
**Find Element and Interact**:
```
mcp__chrome-devtools__take_snapshot() // Get uids
mcp__chrome-devtools__click({ uid: "<uid>" })
mcp__chrome-devtools__fill({ uid: "<uid>", value: "<value>" })
```
**Capture Evidence**:
```
mcp__chrome-devtools__take_screenshot({ filePath: "<path>" })
mcp__chrome-devtools__list_console_messages({ types: ["error", "warn"] })
mcp__chrome-devtools__list_network_requests({ resourceTypes: ["xhr", "fetch"] })
```
**Debug API Error**:
```
mcp__chrome-devtools__list_network_requests() // Find request
mcp__chrome-devtools__get_network_request({ reqid: <id> }) // Inspect details
```
---
## Quality Requirements
All agents must verify before reporting complete:
| Requirement | Criteria |
|-------------|----------|
| Files produced | Verify all claimed artifacts exist via Read |
| Evidence captured | All planned dimensions have evidence files |
| Findings accuracy | Findings reflect actual observations |
| Discovery sharing | At least 1 discovery shared to board |
| Error reporting | Non-empty error field if status is failed |
| Verdict set | verifier role sets verdict field |
| Issues count set | tester/analyzer roles set issues_count field |
---
## Placeholder Reference
| Placeholder | Resolved By | When |
|-------------|------------|------|
| `<session-folder>` | Skill designer (Phase 1) | Literal path baked into instruction |
| `{id}` | spawn_agents_on_csv | Runtime from CSV row |
| `{title}` | spawn_agents_on_csv | Runtime from CSV row |
| `{description}` | spawn_agents_on_csv | Runtime from CSV row |
| `{role}` | spawn_agents_on_csv | Runtime from CSV row |
| `{pipeline_mode}` | spawn_agents_on_csv | Runtime from CSV row |
| `{base_url}` | spawn_agents_on_csv | Runtime from CSV row |
| `{evidence_dimensions}` | spawn_agents_on_csv | Runtime from CSV row |
| `{prev_context}` | spawn_agents_on_csv | Runtime from CSV row |

@@ -0,0 +1,208 @@
---
role: analyzer
prefix: ANALYZE
inner_loop: false
message_types:
success: rca_ready
iteration: need_more_evidence
error: error
---
# Analyzer
Root cause analysis from debug evidence.
## Identity
- Tag: [analyzer] | Prefix: ANALYZE-*
- Responsibility: Analyze evidence artifacts, identify root cause, produce RCA report
## Boundaries
### MUST
- Load ALL evidence from reproducer before analysis
- Correlate findings across multiple evidence types
- Identify specific file:line location when possible
- Request supplemental evidence if analysis is inconclusive
- Produce structured RCA report
### MUST NOT
- Modify source code or project files
- Skip loading upstream evidence
- Guess root cause without evidence support
- Proceed with low-confidence RCA (request more evidence instead)
## Phase 2: Load Evidence
1. Load debug specs: Run `ccw spec load --category debug` for known issues, workarounds, and root-cause notes
2. Read upstream artifacts via team_msg(operation="get_state", role="reproducer")
3. Extract evidence paths from reproducer's state_update ref
4. Load evidence-summary.json from session evidence/
5. Load all evidence files:
   - Read screenshot files (visual inspection)
   - Read DOM snapshots (structural analysis)
   - Parse console error messages
   - Parse network request logs
   - Read performance trace if available
6. Load wisdom/ for any prior debug knowledge
## Phase 3: Root Cause Analysis
### Step 3.1: Console Error Analysis
Priority analysis — most bugs have console evidence:
1. Filter console messages by type: error > warn > log
2. For each error:
- Extract error message and stack trace
- Identify source file and line number from stack
- Classify: TypeError, ReferenceError, SyntaxError, NetworkError, CustomError
3. Map errors to reproduction steps (correlation by timing)
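Steps 2 and 3 can be sketched as a classifier over a console message and its stack trace (an illustration only; the stack format assumed here is V8's `at fn (file:line:col)`, and `classifyError` is a hypothetical helper name):

```javascript
// Extract error category and source location from a console error.
function classifyError(message, stack) {
  const typeMatch = message.match(/^(\w*Error)\b/);           // TypeError, ReferenceError, ...
  const frame = (stack || '').match(/\(?([^\s()]+):(\d+):(\d+)\)?/); // first stack frame
  return {
    category: typeMatch ? typeMatch[1] : 'CustomError',
    file: frame ? frame[1] : null,
    line: frame ? Number(frame[2]) : null,
  };
}
```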
### Step 3.2: Network Analysis
If network evidence collected:
1. Identify failed requests (4xx, 5xx, timeout, CORS)
2. For each failed request:
- Request URL, method, headers
- Response status, body (if captured)
- Timing information
3. Check for:
- Missing authentication tokens
- Incorrect API endpoints
- CORS policy violations
- Request/response payload issues
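Step 1 reduces to a filter over the parsed network log (a sketch; request objects are assumed to look like `{url, status, timedOut, corsError}` after parsing, which is an assumption about the log shape, not the MCP tool's exact schema):

```javascript
// Identify failed requests: 4xx/5xx, timeouts, and CORS violations.
function failedRequests(requests) {
  return requests.filter(r => r.timedOut || r.corsError || r.status >= 400);
}
```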
### Step 3.3: DOM Structure Analysis
If snapshots collected:
1. Compare before/after snapshots
2. Identify:
- Missing or extra elements
- Incorrect attributes or content
- Accessibility tree anomalies
- State-dependent rendering issues
### Step 3.4: Performance Analysis
If performance trace collected:
1. Identify long tasks (>50ms)
2. Check for:
- JavaScript execution bottlenecks
- Layout thrashing
- Excessive re-renders
- Memory leaks (growing heap)
- Large resource loads
### Step 3.5: Cross-Correlation
Combine findings from all dimensions:
1. Build timeline of events leading to bug
2. Identify the earliest trigger point
3. Trace from trigger to visible symptom
4. Determine if issue is:
- Frontend code bug (logic error, missing null check, etc.)
- Backend/API issue (wrong data, missing endpoint)
- Configuration issue (env vars, build config)
- Race condition / timing issue
### Step 3.6: Source Code Mapping
Use codebase search to locate root cause:
```
mcp__ace-tool__search_context({
project_root_path: "<project-root>",
query: "<error message or function name from stack trace>"
})
```
Read identified source files to confirm root cause location.
### Step 3.7: Confidence Assessment
| Confidence | Criteria | Action |
|------------|----------|--------|
| High (>80%) | Stack trace points to specific line + error is clear | Proceed with RCA |
| Medium (50-80%) | Likely cause identified but needs confirmation | Proceed with caveats |
| Low (<50%) | Multiple possible causes, insufficient evidence | Request more evidence |
If Low confidence: send `need_more_evidence` message with specific requests.
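The table maps directly to a decision function (confidence expressed as a 0-1 fraction; function name is illustrative):

```javascript
// Map RCA confidence to the action column of the table above.
function confidenceAction(score) {
  if (score > 0.8) return 'proceed';
  if (score >= 0.5) return 'proceed_with_caveats';
  return 'need_more_evidence';
}
```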
## Phase 4: RCA Report
Write `<session>/artifacts/ANALYZE-001-rca.md`:
```markdown
# Root Cause Analysis Report
## Bug Summary
- **Description**: <bug description>
- **URL**: <target url>
- **Reproduction**: <success/partial/failed>
## Root Cause
- **Category**: <JS Error | Network | Rendering | Performance | State>
- **Confidence**: <High | Medium | Low>
- **Source File**: <file path>
- **Source Line**: <line number>
- **Root Cause**: <detailed explanation>
## Evidence Chain
1. <evidence 1 -> finding>
2. <evidence 2 -> finding>
3. <correlation -> root cause>
## Fix Recommendation
- **Approach**: <description of recommended fix>
- **Files to modify**: <list>
- **Risk level**: <Low | Medium | High>
- **Estimated scope**: <lines of code / number of files>
## Additional Observations
- <any related issues found>
- <potential regression risks>
```
Send state_update:
```json
{
"status": "task_complete",
"task_id": "ANALYZE-001",
"ref": "<session>/artifacts/ANALYZE-001-rca.md",
"key_findings": ["Root cause: <summary>", "Location: <file:line>"],
"decisions": ["Recommended fix: <approach>"],
"verification": "self-validated"
}
```
## Iteration Protocol
When evidence is insufficient (confidence < 50%):
1. Send state_update with `need_more_evidence: true`:
```json
{
"status": "need_more_evidence",
"task_id": "ANALYZE-001",
"ref": null,
"key_findings": ["Partial analysis: <what we know>"],
"decisions": [],
"evidence_request": {
"dimensions": ["network_detail", "state_inspection"],
"specific_actions": ["Capture request body for /api/users", "Evaluate React state after click"]
}
}
```
2. Coordinator creates REPRODUCE-002 + ANALYZE-002
3. ANALYZE-002 loads both original and supplemental evidence
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Evidence files missing | Report with available data, note gaps |
| No clear root cause | Request supplemental evidence via iteration |
| Multiple possible causes | Rank by likelihood, report top 3 |
| Source code not found | Report with best available location info |

# Analyze Input
Parse user input -> detect mode (feature-test vs bug-report) -> build dependency graph -> assign roles.
**CONSTRAINT**: Text-level analysis only. NO source code reading, NO codebase exploration.
## Step 1: Detect Input Mode
```
if input contains: 功能, feature, 清单, list, 测试, test, 完成, done, 验收
→ mode = "test-pipeline"
elif input contains: bug, 错误, 报错, crash, 问题, 不工作, 白屏, 异常
→ mode = "debug-pipeline"
else
→ request_user_input to clarify
```
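The router above can be sketched as a keyword match; the keyword sets here are abridged placeholders, not the full lists from this spec.

```python
# Sketch of the mode router; keyword sets are abridged for illustration.
TEST_KEYWORDS = ["feature", "list", "test", "done", "功能", "清单", "测试"]
DEBUG_KEYWORDS = ["bug", "crash", "错误", "报错", "白屏"]

def detect_mode(user_input):
    """Return the pipeline mode, or None when clarification is needed."""
    text = user_input.lower()
    if any(k in text for k in TEST_KEYWORDS):
        return "test-pipeline"
    if any(k in text for k in DEBUG_KEYWORDS):
        return "debug-pipeline"
    return None  # ambiguous -> request_user_input
```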
```
request_user_input({
questions: [{
question: "Confirm the debugging mode",
header: "Mode",
multiSelect: false,
options: [
{ label: "Feature testing", description: "Test each feature on the list, discover and fix issues" },
{ label: "Bug fix", description: "Reproduce, analyze, and fix a known bug" }
]
}]
})
```
---
## Mode A: Feature Test (test-pipeline)
### Parse Feature List
Extract from user input:
| Field | Source | Required |
|-------|--------|----------|
| base_url | URL in text or request_user_input | Yes |
| features | Feature list (bullet points, numbered list, or free text) | Yes |
| test_depth | User preference or default "standard" | Auto |
Parse features into structured format:
```json
[
{ "id": "F-001", "name": "User login", "url": "/login", "description": "..." },
{ "id": "F-002", "name": "Data list", "url": "/dashboard", "description": "..." }
]
```
If base_url missing:
```
request_user_input({
questions: [{
question: "Provide the application's base URL",
header: "Base URL",
multiSelect: false,
options: [
{ label: "localhost:3000", description: "Local dev server" },
{ label: "localhost:5173", description: "Vite default port" },
{ label: "Custom", description: "Custom URL" }
]
}]
})
```
### Complexity Scoring (Test Mode)
| Factor | Points |
|--------|--------|
| Per feature | +1 |
| Features > 5 | +2 |
| Features > 10 | +3 |
| Cross-page workflows | +1 |
Results: 1-3 Low, 4-6 Medium, 7+ High
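The scoring table can be sketched as below; treating the "> 5" and "> 10" bonuses as cumulative is an interpretation of the table, and the function name is illustrative.

```python
def score_test_mode(num_features, cross_page_workflows=0):
    """Complexity score per the table above (bonuses assumed cumulative)."""
    score = num_features                  # +1 per feature
    if num_features > 5:
        score += 2
    if num_features > 10:
        score += 3
    score += cross_page_workflows         # +1 per cross-page workflow
    if score <= 3:
        level = "Low"
    elif score <= 6:
        level = "Medium"
    else:
        level = "High"
    return score, level
```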
### Output (Test Mode)
```json
{
"mode": "test-pipeline",
"base_url": "<url>",
"features": [
{ "id": "F-001", "name": "<name>", "url": "<path>", "description": "<desc>" }
],
"pipeline_type": "test-pipeline",
"dependency_graph": {
"TEST-001": { "role": "tester", "blockedBy": [], "priority": "P0" },
"ANALYZE-001": { "role": "analyzer", "blockedBy": ["TEST-001"], "priority": "P0", "conditional": true },
"FIX-001": { "role": "fixer", "blockedBy": ["ANALYZE-001"], "priority": "P0", "conditional": true },
"VERIFY-001": { "role": "verifier", "blockedBy": ["FIX-001"], "priority": "P0", "conditional": true }
},
"roles": [
{ "name": "tester", "prefix": "TEST", "inner_loop": true },
{ "name": "analyzer", "prefix": "ANALYZE", "inner_loop": false },
{ "name": "fixer", "prefix": "FIX", "inner_loop": true },
{ "name": "verifier", "prefix": "VERIFY", "inner_loop": false }
],
"complexity": { "score": 0, "level": "Low|Medium|High" }
}
```
---
## Mode B: Bug Report (debug-pipeline)
### Parse Bug Report
Extract from user input:
| Field | Source | Required |
|-------|--------|----------|
| bug_description | User text | Yes |
| target_url | URL in text or request_user_input | Yes |
| reproduction_steps | Steps in text or request_user_input | Yes |
| expected_behavior | User description | Recommended |
| actual_behavior | User description | Recommended |
| severity | User indication or auto-assess | Auto |
### Debug Dimension Detection
| Keywords | Dimension | Evidence Needed |
|----------|-----------|-----------------|
| 渲染, 样式, 显示, 布局, CSS | UI/Rendering | screenshot, snapshot |
| 请求, API, 接口, 网络, 超时 | Network | network_requests |
| 错误, 报错, 异常, crash | JavaScript Error | console_messages |
| 慢, 卡顿, 性能, 加载 | Performance | performance_trace |
| 状态, 数据, 更新, 不同步 | State Management | console + snapshot |
| 交互, 点击, 输入, 表单 | User Interaction | click/fill + screenshot |
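The dimension table above can be sketched as a keyword scan; the keyword sets and the fallback dimension are abridged assumptions, not the full lists.

```python
# Sketch of dimension detection; keyword sets abridged for illustration.
DIMENSION_KEYWORDS = {
    "ui_rendering": ["样式", "布局", "css", "render", "layout", "style"],
    "network": ["api", "请求", "网络", "timeout", "network"],
    "javascript_error": ["错误", "报错", "crash", "exception", "error"],
    "performance": ["慢", "卡顿", "性能", "slow", "lag"],
}

def detect_dimensions(bug_text):
    """Return matched debug dimensions, defaulting to a JS-error probe."""
    text = bug_text.lower()
    hits = [dim for dim, keywords in DIMENSION_KEYWORDS.items()
            if any(k in text for k in keywords)]
    return hits or ["javascript_error"]  # assumed default when nothing matches
```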
### Complexity Scoring (Debug Mode)
| Factor | Points |
|--------|--------|
| Single dimension (e.g., JS error only) | 1 |
| Multi-dimension (UI + Network) | +1 per extra |
| Intermittent / hard to reproduce | +2 |
| Performance profiling needed | +1 |
Results: 1-2 Low, 3-4 Medium, 5+ High
### Output (Debug Mode)
```json
{
"mode": "debug-pipeline",
"bug_description": "<original>",
"target_url": "<url>",
"reproduction_steps": ["step 1", "step 2"],
"dimensions": ["ui_rendering", "javascript_error"],
"evidence_plan": {
"screenshot": true, "snapshot": true,
"console": true, "network": true, "performance": false
},
"pipeline_type": "debug-pipeline",
"dependency_graph": {
"REPRODUCE-001": { "role": "reproducer", "blockedBy": [], "priority": "P0" },
"ANALYZE-001": { "role": "analyzer", "blockedBy": ["REPRODUCE-001"], "priority": "P0" },
"FIX-001": { "role": "fixer", "blockedBy": ["ANALYZE-001"], "priority": "P0" },
"VERIFY-001": { "role": "verifier", "blockedBy": ["FIX-001"], "priority": "P0" }
},
"roles": [
{ "name": "reproducer", "prefix": "REPRODUCE", "inner_loop": false },
{ "name": "analyzer", "prefix": "ANALYZE", "inner_loop": false },
{ "name": "fixer", "prefix": "FIX", "inner_loop": true },
{ "name": "verifier", "prefix": "VERIFY", "inner_loop": false }
],
"complexity": { "score": 0, "level": "Low|Medium|High" }
}
```

# Dispatch Debug Tasks
Create task chains from dependency graph with proper blockedBy relationships.
## Workflow
1. Read task-analysis.json -> extract pipeline_type and dependency_graph
2. Read specs/pipelines.md -> get task registry for selected pipeline
3. Topological sort tasks (respect blockedBy)
4. Validate all owners exist in role registry (SKILL.md)
5. For each task (in order):
- Build JSON entry with structured description (see template below)
- Set blockedBy and owner fields in the entry
6. Write all entries to `<session>/tasks.json`
7. Update team-session.json with pipeline.tasks_total
8. Validate chain (no orphans, no cycles, all refs valid)
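Step 3 (topological sort with cycle detection) can be sketched as repeated removal of dependency-free tasks; tasks here are the minimal `{"id", "blockedBy"}` shape from tasks.json.

```python
def topo_sort(tasks):
    """Order task ids so each task follows its blockedBy dependencies.

    Raises ValueError when a dependency cycle is detected.
    """
    graph = {t["id"]: set(t.get("blockedBy", [])) for t in tasks}
    ordered = []
    while graph:
        ready = [tid for tid, deps in graph.items() if not deps]
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(graph)}")
        for tid in ready:
            ordered.append(tid)
            del graph[tid]
        for deps in graph.values():
            deps.difference_update(ready)
    return ordered
```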
## Task Description Template
Each task is a JSON entry in the tasks array:
```json
{
"id": "<TASK-ID>",
"subject": "<TASK-ID>",
"description": "PURPOSE: <goal> | Success: <criteria>\nTASK:\n - <step 1>\n - <step 2>\nCONTEXT:\n - Session: <session-folder>\n - Base URL / Bug URL: <url>\n - Upstream artifacts: <list>\nEXPECTED: <artifact path> + <quality criteria>\nCONSTRAINTS: <scope limits>\n---\nInnerLoop: <true|false>\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/<role>/role.md",
"status": "pending",
"owner": "<role>",
"blockedBy": ["<dependency-list>"]
}
```
---
## Test Pipeline Tasks (mode: test-pipeline)
### TEST-001: Feature Testing
```json
{
"id": "TEST-001",
"subject": "TEST-001",
"description": "PURPOSE: Test all features from feature list and discover issues | Success: All features tested with pass/fail results\nTASK:\n - Parse feature list from task description\n - For each feature: navigate to URL, explore page, generate test scenarios\n - Execute test scenarios using Chrome DevTools MCP (click, fill, hover, etc.)\n - Capture evidence: screenshots, console logs, network requests\n - Classify results: pass / fail / warning\n - Compile test report with discovered issues\nCONTEXT:\n - Session: <session-folder>\n - Base URL: <base-url>\n - Features: <feature-list-from-task-analysis>\nEXPECTED: <session>/artifacts/TEST-001-report.md + <session>/artifacts/TEST-001-issues.json\nCONSTRAINTS: Use Chrome DevTools MCP only | Do not modify any code | Test all listed features\n---\nInnerLoop: true\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/tester/role.md",
"status": "pending",
"owner": "tester",
"blockedBy": []
}
```
### ANALYZE-001 (Test Mode): Analyze Discovered Issues
```json
{
"id": "ANALYZE-001",
"subject": "ANALYZE-001",
"description": "PURPOSE: Analyze issues discovered by tester to identify root causes | Success: RCA for each discovered issue\nTASK:\n - Load test report and issues list from TEST-001\n - For each high/medium severity issue: analyze evidence, identify root cause\n - Correlate console errors, network failures, DOM anomalies to source code\n - Produce consolidated RCA report covering all issues\nCONTEXT:\n - Session: <session-folder>\n - Upstream: <session>/artifacts/TEST-001-issues.json\n - Test evidence: <session>/evidence/\nEXPECTED: <session>/artifacts/ANALYZE-001-rca.md with root causes for all issues\nCONSTRAINTS: Read-only analysis | Skip low-severity warnings unless user requests\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/analyzer/role.md",
"status": "pending",
"owner": "analyzer",
"blockedBy": ["TEST-001"]
}
```
**Conditional**: If TEST-001 reports zero issues -> skip ANALYZE-001, FIX-001, VERIFY-001. Pipeline completes.
### FIX-001 (Test Mode): Fix All Issues
```json
{
"id": "FIX-001",
"subject": "FIX-001",
"description": "PURPOSE: Fix all identified issues from RCA | Success: All high/medium issues resolved\nTASK:\n - Load consolidated RCA report from ANALYZE-001\n - For each root cause: locate code, implement fix\n - Run syntax/type check after all modifications\n - Document all changes\nCONTEXT:\n - Session: <session-folder>\n - Upstream: <session>/artifacts/ANALYZE-001-rca.md\nEXPECTED: Modified source files + <session>/artifacts/FIX-001-changes.md\nCONSTRAINTS: Minimal changes per issue | Follow existing code style\n---\nInnerLoop: true\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/fixer/role.md",
"status": "pending",
"owner": "fixer",
"blockedBy": ["ANALYZE-001"]
}
```
### VERIFY-001 (Test Mode): Re-Test After Fix
```json
{
"id": "VERIFY-001",
"subject": "VERIFY-001",
"description": "PURPOSE: Re-run failed test scenarios to verify fixes | Success: Previously failed scenarios now pass\nTASK:\n - Load original test report (failed scenarios only)\n - Re-execute failed scenarios using Chrome DevTools MCP\n - Capture evidence and compare with original\n - Report pass/fail per scenario\nCONTEXT:\n - Session: <session-folder>\n - Original test report: <session>/artifacts/TEST-001-report.md\n - Fix changes: <session>/artifacts/FIX-001-changes.md\n - Failed features: <from TEST-001-issues.json>\nEXPECTED: <session>/artifacts/VERIFY-001-report.md with pass/fail per previously-failed scenario\nCONSTRAINTS: Only re-test failed scenarios | Use Chrome DevTools MCP only\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/verifier/role.md",
"status": "pending",
"owner": "verifier",
"blockedBy": ["FIX-001"]
}
```
---
## Debug Pipeline Tasks (mode: debug-pipeline)
### REPRODUCE-001: Evidence Collection
```json
{
"id": "REPRODUCE-001",
"subject": "REPRODUCE-001",
"description": "PURPOSE: Reproduce reported bug and collect debug evidence | Success: Bug reproduced with evidence artifacts\nTASK:\n - Navigate to target URL\n - Execute reproduction steps using Chrome DevTools MCP\n - Capture evidence: screenshots, DOM snapshots, console logs, network requests\n - If performance dimension: run performance trace\n - Package all evidence into session evidence/ directory\nCONTEXT:\n - Session: <session-folder>\n - Bug URL: <target-url>\n - Steps: <reproduction-steps>\n - Evidence plan: <from task-analysis.json>\nEXPECTED: <session>/evidence/ directory with all captures + reproduction report\nCONSTRAINTS: Use Chrome DevTools MCP only | Do not modify any code\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/reproducer/role.md",
"status": "pending",
"owner": "reproducer",
"blockedBy": []
}
```
### ANALYZE-001 (Debug Mode): Root Cause Analysis
```json
{
"id": "ANALYZE-001",
"subject": "ANALYZE-001",
"description": "PURPOSE: Analyze evidence to identify root cause | Success: RCA report with specific file:line location\nTASK:\n - Load evidence from REPRODUCE-001\n - Analyze console errors and stack traces\n - Analyze failed/abnormal network requests\n - Compare DOM snapshot against expected structure\n - Correlate findings to source code location\nCONTEXT:\n - Session: <session-folder>\n - Upstream: <session>/evidence/\n - Bug description: <bug-description>\nEXPECTED: <session>/artifacts/ANALYZE-001-rca.md with root cause, file:line, fix recommendation\nCONSTRAINTS: Read-only analysis | Request more evidence if inconclusive\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/analyzer/role.md",
"status": "pending",
"owner": "analyzer",
"blockedBy": ["REPRODUCE-001"]
}
```
### FIX-001 (Debug Mode): Code Fix
```json
{
"id": "FIX-001",
"subject": "FIX-001",
"description": "PURPOSE: Fix the identified bug | Success: Code changes that resolve the root cause\nTASK:\n - Load RCA report from ANALYZE-001\n - Locate the problematic code\n - Implement fix following existing code patterns\n - Run syntax/type check on modified files\nCONTEXT:\n - Session: <session-folder>\n - Upstream: <session>/artifacts/ANALYZE-001-rca.md\nEXPECTED: Modified source files + <session>/artifacts/FIX-001-changes.md\nCONSTRAINTS: Minimal changes | Follow existing code style | No breaking changes\n---\nInnerLoop: true\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/fixer/role.md",
"status": "pending",
"owner": "fixer",
"blockedBy": ["ANALYZE-001"]
}
```
### VERIFY-001 (Debug Mode): Fix Verification
```json
{
"id": "VERIFY-001",
"subject": "VERIFY-001",
"description": "PURPOSE: Verify bug is fixed | Success: Original bug no longer reproduces\nTASK:\n - Navigate to same URL as REPRODUCE-001\n - Execute same reproduction steps\n - Capture evidence and compare with original\n - Confirm bug is resolved and no regressions\nCONTEXT:\n - Session: <session-folder>\n - Original evidence: <session>/evidence/\n - Fix changes: <session>/artifacts/FIX-001-changes.md\nEXPECTED: <session>/artifacts/VERIFY-001-report.md with pass/fail verdict\nCONSTRAINTS: Use Chrome DevTools MCP only | Same steps as reproduction\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/verifier/role.md",
"status": "pending",
"owner": "verifier",
"blockedBy": ["FIX-001"]
}
```
---
## Dynamic Iteration Tasks
### REPRODUCE-002 (Debug Mode): Supplemental Evidence
Created when Analyzer requests more evidence -- add new entry to `<session>/tasks.json`:
```json
{
"id": "REPRODUCE-002",
"subject": "REPRODUCE-002",
"description": "PURPOSE: Collect additional evidence per Analyzer request | Success: Targeted evidence collected\nTASK: <specific evidence requests from Analyzer>\nCONTEXT: Session + Analyzer request\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/reproducer/role.md",
"status": "pending",
"owner": "reproducer",
"blockedBy": []
}
```
### FIX-002 (Either Mode): Re-Fix After Failed Verification
Created when Verifier reports fail -- add new entry to `<session>/tasks.json`:
```json
{
"id": "FIX-002",
"subject": "FIX-002",
"description": "PURPOSE: Re-fix based on verification failure feedback | Success: Issue resolved\nTASK: Review VERIFY-001 failure details, apply corrective fix\nCONTEXT: Session + VERIFY-001-report.md\n---\nInnerLoop: true\nRoleSpec: ~ or <project>/.codex/skills/team-frontend-debug/roles/fixer/role.md",
"status": "pending",
"owner": "fixer",
"blockedBy": ["VERIFY-001"]
}
```
## Conditional Skip Rules
| Condition | Action |
|-----------|--------|
| test-pipeline + TEST-001 finds 0 issues | Skip ANALYZE/FIX/VERIFY -> pipeline complete |
| test-pipeline + TEST-001 finds only warnings | request_user_input: fix warnings or complete |
| debug-pipeline + REPRODUCE-001 cannot reproduce | request_user_input: retry with more info or abort |
## InnerLoop Flag Rules
- true: tester (iterates over features), fixer (may need multiple fix passes)
- false: reproducer, analyzer, verifier (single-pass tasks)
## Dependency Validation
- No orphan tasks (all tasks have valid owner)
- No circular dependencies
- All blockedBy references exist
- Session reference in every task description
- RoleSpec reference in every task description
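The owner and blockedBy checks above can be sketched as below (cycle detection is covered by the topological-sort step of the dispatch workflow); the task shape mirrors tasks.json entries.

```python
def validate_chain(tasks, known_roles):
    """Return a list of violations; an empty list means the chain is valid."""
    ids = {t["id"] for t in tasks}
    errors = []
    for t in tasks:
        if t["owner"] not in known_roles:
            errors.append(f"{t['id']}: unknown owner '{t['owner']}'")
        for dep in t.get("blockedBy", []):
            if dep not in ids:
                errors.append(f"{t['id']}: blockedBy references missing task '{dep}'")
    return errors
```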

# Monitor Pipeline
Event-driven pipeline coordination. Beat model: coordinator wake -> process -> spawn -> STOP.
## Constants
- SPAWN_MODE: background
- ONE_STEP_PER_INVOCATION: true
- FAST_ADVANCE_AWARE: true
- WORKER_AGENT: team-worker
## Handler Router
| Source | Handler |
|--------|---------|
| Message contains [role-name] | handleCallback |
| "need_more_evidence" | handleIteration |
| "check" or "status" | handleCheck |
| "resume" or "continue" | handleResume |
| All tasks completed | handleComplete |
| Default | handleSpawnNext |
## handleCallback
Worker completed. Process and advance.
1. Find matching worker by role in message
2. Check if progress update (inner loop) or final completion
3. Progress -> update session state, STOP
4. Completion -> mark task done (read `<session>/tasks.json`, set status to "completed", write back), remove from active_workers
5. Check for special conditions:
- **TEST-001 with 0 issues** -> skip ANALYZE/FIX/VERIFY (mark as completed in tasks.json), handleComplete
- **TEST-001 with only warnings** -> request_user_input: fix warnings or complete
- **TEST-001 with high/medium issues** -> proceed to ANALYZE-001
- ANALYZE-001 with `need_more_evidence: true` -> handleIteration
- VERIFY-001 with `verdict: fail` -> re-dispatch FIX (add FIX-002 entry to tasks.json blocked by VERIFY-001)
- VERIFY-001 with `verdict: pass` -> handleComplete
6. -> handleSpawnNext
## handleIteration
Analyzer needs more evidence. Create supplemental reproduction task.
1. Parse Analyzer's evidence request (dimensions, specific actions)
2. Add REPRODUCE-002 entry to `<session>/tasks.json`:
- Set owner to "reproducer" (no blockedBy -- can start immediately)
3. Add ANALYZE-002 entry to `<session>/tasks.json`:
- Set blockedBy: ["REPRODUCE-002"]
- Update FIX-001 entry to add ANALYZE-002 to its blockedBy
4. Write updated tasks.json
5. -> handleSpawnNext
## handleCheck
Read-only status report, then STOP.
Output:
```
[coordinator] Debug Pipeline Status
[coordinator] Bug: <bug-description-summary>
[coordinator] Progress: <done>/<total> (<pct>%)
[coordinator] Active: <workers with elapsed time>
[coordinator] Ready: <pending tasks with resolved deps>
[coordinator] Evidence: <list of collected evidence types>
[coordinator] Commands: 'resume' to advance | 'check' to refresh
```
## handleResume
1. No active workers -> handleSpawnNext
2. Has active -> check each status
- completed -> mark done
- in_progress -> still running
3. Some completed -> handleSpawnNext
4. All running -> report status, STOP
## handleSpawnNext
Find ready tasks, spawn workers, STOP.
1. Collect: completedSubjects, inProgressSubjects, readySubjects (from tasks.json)
2. No ready + work in progress -> report waiting, STOP
3. No ready + nothing in progress -> handleComplete
4. Has ready -> for each:
a. Check if inner loop role with active worker -> skip (worker picks up)
b. Standard spawn:
- Update task status to "in_progress" in tasks.json
- team_msg log -> task_unblocked
- Spawn team-worker:
```
spawn_agent({
agent_type: "team_worker",
items: [{
description: "Spawn <role> worker for <task-id>",
team_name: "frontend-debug",
name: "<role>",
prompt: `## Role Assignment
role: <role>
role_spec: ~ or <project>/.codex/skills/team-frontend-debug/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
team_name: frontend-debug
requirement: <task-description>
inner_loop: <true|false>
Read role_spec file to load Phase 2-4 domain instructions.
Execute built-in Phase 1 -> role-spec Phase 2-4 -> built-in Phase 5.`
}]
})
```
- Add to active_workers
5. Update session, output summary, STOP
6. Use `wait_agent({ ids: [<spawned-agent-ids>] })` to wait for callbacks. Workers use `report_agent_job_result()` to send results back.
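The readiness check in step 1 can be sketched as below: a task is spawnable when it is pending and every blockedBy dependency is completed (skipped tasks count, since they are marked completed in tasks.json).

```python
def ready_task_ids(tasks):
    """Return ids of pending tasks whose dependencies are all completed."""
    done = {t["id"] for t in tasks if t["status"] == "completed"}
    return [t["id"] for t in tasks
            if t["status"] == "pending"
            and all(dep in done for dep in t.get("blockedBy", []))]
```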
## handleComplete
Pipeline done. Generate debug report and completion action.
1. Generate debug summary:
- Bug description and reproduction results
- Root cause analysis (from ANALYZE artifacts)
- Code changes applied (from FIX artifacts)
- Verification verdict (from VERIFY artifacts)
- Evidence inventory (screenshots, logs, traces)
2. Read session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean (status=completed, remove/archive session folder)
- auto_keep -> Keep Active (status=paused)
## handleAdapt
Not typically needed for debug pipeline. If Analyzer identifies a dimension not covered:
1. Parse gap description
2. Check if reproducer can cover it -> add to evidence plan
3. Create supplemental REPRODUCE task entry in tasks.json
## Fast-Advance Reconciliation
On every coordinator wake:
1. Read team_msg entries with type="fast_advance"
2. Sync active_workers with spawned successors
3. No duplicate spawns

# Coordinator Role
Orchestrate team-frontend-debug: analyze -> dispatch -> spawn -> monitor -> report.
## Identity
- Name: coordinator | Tag: [coordinator]
- Responsibility: Analyze bug report -> Create team -> Dispatch debug tasks -> Monitor progress -> Report results
## Boundaries
### MUST
- Parse bug report description (text-level only, no codebase reading)
- Create team and spawn team-worker agents in background
- Dispatch tasks with proper dependency chains
- Monitor progress via callbacks and route messages
- Maintain session state (team-session.json)
- Handle iteration loops (analyzer requesting more evidence)
- Execute completion action when pipeline finishes
### MUST NOT
- Read source code or explore codebase (delegate to workers)
- Execute debug/fix work directly
- Modify task output artifacts
- Spawn workers with general-purpose agent (MUST use team-worker)
- Generate more than 5 worker roles
## Command Execution Protocol
When coordinator needs to execute a specific phase:
1. Read `commands/<command>.md`
2. Follow the workflow defined in the command
3. Commands are inline execution guides, NOT separate agents
4. Execute synchronously, complete before proceeding
## Entry Router
| Detection | Condition | Handler |
|-----------|-----------|---------|
| Worker callback | Message contains [role-name] | -> handleCallback (monitor.md) |
| Status check | Args contain "check" or "status" | -> handleCheck (monitor.md) |
| Manual resume | Args contain "resume" or "continue" | -> handleResume (monitor.md) |
| Iteration request | Message contains "need_more_evidence" | -> handleIteration (monitor.md) |
| Pipeline complete | All tasks completed | -> handleComplete (monitor.md) |
| Interrupted session | Active session in .workflow/.team/TFD-* | -> Phase 0 |
| New session | None of above | -> Phase 1 |
For callback/check/resume/iteration/complete: load commands/monitor.md, execute handler, STOP.
## Phase 0: Session Resume Check
1. Scan .workflow/.team/TFD-*/team-session.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile:
a. Audit tasks.json, reset in_progress->pending
b. Rebuild team workers
c. Kick first ready task
4. Multiple -> request_user_input for selection
## Phase 1: Requirement Clarification
TEXT-LEVEL ONLY. No source code reading.
1. Parse user input -> detect mode:
- Feature list -> **test-pipeline**
- Bug report -> **debug-pipeline**
- Ambiguous -> request_user_input to clarify
2. Extract relevant info based on mode:
- Test mode: base URL, feature list
- Debug mode: bug description, URL, reproduction steps
3. Clarify if ambiguous (request_user_input)
4. Delegate to @commands/analyze.md
5. Output: task-analysis.json
6. CRITICAL: Always proceed to Phase 2, never skip team workflow
## Phase 2: Create Team + Initialize Session
1. Resolve workspace paths (MUST do first):
- `project_root` = result of `Bash({ command: "pwd" })`
- `skill_root` = `<project_root>/.codex/skills/team-frontend-debug`
2. Generate session ID: TFD-<slug>-<date>
3. Create session folder structure:
```
.workflow/.team/TFD-<slug>-<date>/
├── team-session.json
├── evidence/
├── artifacts/
├── wisdom/
└── .msg/
```
4. Initialize session folder (replaces TeamCreate)
5. Read specs/pipelines.md -> select pipeline (default: debug-pipeline)
6. Register roles in team-session.json
7. Initialize pipeline via team_msg state_update
8. Write team-session.json
## Phase 3: Create Task Chain
Delegate to @commands/dispatch.md:
1. Read dependency graph from task-analysis.json
2. Read specs/pipelines.md for debug-pipeline task registry
3. Topological sort tasks
4. Build tasks array as JSON entries in `<session>/tasks.json`; set deps via `blockedBy` field in each entry
5. Update team-session.json
## Phase 4: Spawn-and-Stop
Delegate to @commands/monitor.md#handleSpawnNext:
1. Find ready tasks (pending + blockedBy resolved)
2. Spawn team-worker agents (see SKILL.md Spawn Template)
3. Output status summary
4. STOP
## Phase 5: Report + Completion Action
1. Generate debug summary:
- Bug description and reproduction results
- Root cause analysis findings
- Files modified and patches applied
- Verification results (pass/fail)
2. Execute completion action per session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean
## Error Handling
| Error | Resolution |
|-------|------------|
| Bug report too vague | request_user_input for URL, steps, expected behavior |
| Session corruption | Attempt recovery, fallback to manual |
| Worker crash | Reset task to pending, respawn |
| Dependency cycle | Detect in analysis, halt |
| Browser unavailable | Report to user, suggest manual steps |

---
role: fixer
prefix: FIX
inner_loop: true
message_types:
success: fix_complete
progress: fix_progress
error: error
---
# Fixer
Code fix implementation based on root cause analysis.
## Identity
- Tag: [fixer] | Prefix: FIX-*
- Responsibility: Implement code fixes based on RCA report, validate with syntax checks
## Boundaries
### MUST
- Read RCA report before any code changes
- Locate exact source code to modify
- Follow existing code patterns and style
- Run syntax/type check after modifications
- Document all changes made
### MUST NOT
- Skip reading the RCA report
- Make changes unrelated to the identified root cause
- Introduce new dependencies without justification
- Skip syntax validation after changes
- Make breaking changes to public APIs
## Phase 2: Parse RCA + Plan Fix
1. Read upstream artifacts via team_msg(operation="get_state", role="analyzer")
2. Extract RCA report path from analyzer's state_update ref
3. Load RCA report: `<session>/artifacts/ANALYZE-001-rca.md`
4. Extract:
- Root cause category and description
- Source file(s) and line(s)
- Recommended fix approach
- Risk level
5. Read identified source files to understand context
6. Search for similar patterns in codebase:
```
mcp__ace-tool__search_context({
project_root_path: "<project-root>",
query: "<function/component name from RCA>"
})
```
7. Plan fix approach:
- Minimal change that addresses root cause
- Consistent with existing code patterns
- No side effects on other functionality
## Phase 3: Implement Fix
### Fix Strategy by Category
| Category | Typical Fix | Tools |
|----------|-------------|-------|
| TypeError / null | Add null check, default value | Edit |
| API Error | Fix URL, add error handling | Edit |
| Missing import | Add import statement | Edit |
| CSS/Rendering | Fix styles, layout properties | Edit |
| State bug | Fix state update logic | Edit |
| Race condition | Add proper async handling | Edit |
| Performance | Optimize render, memoize | Edit |
### Implementation Steps
1. Read the target file(s)
2. Apply minimal code changes using Edit tool
3. If Edit fails, use mcp__ccw-tools__edit_file as fallback
4. For each modified file:
- Keep changes minimal and focused
- Preserve existing code style (indentation, naming)
- Add inline comment only if fix is non-obvious
### Syntax Validation
After all changes:
```
mcp__ide__getDiagnostics({ uri: "file://<modified-file>" })
```
If diagnostics show errors:
- Fix syntax/type errors
- Re-validate
- Max 3 fix iterations for syntax issues
## Phase 4: Document Changes + Report
Write `<session>/artifacts/FIX-001-changes.md`:
```markdown
# Fix Report
## Root Cause Reference
- RCA: <session>/artifacts/ANALYZE-001-rca.md
- Category: <category>
- Source: <file:line>
## Changes Applied
### <file-path>
- **Line(s)**: <line numbers>
- **Change**: <description of what was changed>
- **Reason**: <why this change fixes the root cause>
## Validation
- Syntax check: <pass/fail>
- Type check: <pass/fail>
- Diagnostics: <clean / N warnings>
## Files Modified
- <file1.ts>
- <file2.tsx>
## Risk Assessment
- Breaking changes: <none / description>
- Side effects: <none / potential>
- Rollback: <how to revert>
```
Send state_update:
```json
{
"status": "task_complete",
"task_id": "FIX-001",
"ref": "<session>/artifacts/FIX-001-changes.md",
"key_findings": ["Fixed <root-cause-summary>", "Modified N files"],
"decisions": ["Applied <fix-approach>"],
"files_modified": ["path/to/file1.ts", "path/to/file2.tsx"],
"verification": "self-validated"
}
```
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Source file not found | Search codebase, report if not found |
| RCA location incorrect | Use ACE search to find correct location |
| Syntax errors after fix | Iterate fix (max 3 attempts) |
| Fix too complex | Report complexity, suggest manual intervention |
| Multiple files need changes | Apply all changes, validate each |

---
role: reproducer
prefix: REPRODUCE
inner_loop: false
message_types:
success: evidence_ready
error: error
---
# Reproducer
Bug reproduction and evidence collection using Chrome DevTools MCP.
## Identity
- Tag: [reproducer] | Prefix: REPRODUCE-*
- Responsibility: Reproduce bug in browser, collect structured debug evidence
## Boundaries
### MUST
- Navigate to target URL using Chrome DevTools MCP
- Execute reproduction steps precisely
- Collect ALL evidence types specified in evidence plan
- Save evidence to session evidence/ directory
- Report reproduction success/failure with evidence paths
### MUST NOT
- Modify source code or any project files
- Make architectural decisions or suggest fixes
- Skip evidence collection for any planned dimension
- Navigate away from target URL without completing steps
## Phase 2: Prepare Reproduction
1. Read upstream artifacts via team_msg(operation="get_state")
2. Extract from task description:
- Session folder path
- Target URL
- Reproduction steps (ordered list)
- Evidence plan (which dimensions to capture)
3. Verify browser is accessible:
```
mcp__chrome-devtools__list_pages()
```
4. If no pages available, report error to coordinator
## Phase 3: Execute Reproduction + Collect Evidence
### Step 3.1: Navigate to Target
```
mcp__chrome-devtools__navigate_page({ type: "url", url: "<target-url>" })
```
Wait for page load:
```
mcp__chrome-devtools__wait_for({ text: ["<expected-element>"], timeout: 10000 })
```
### Step 3.2: Capture Baseline Evidence
Before executing steps, capture baseline state:
| Evidence Type | Tool | Save To |
|---------------|------|---------|
| Screenshot (before) | `take_screenshot({ filePath: "<session>/evidence/before-screenshot.png" })` | evidence/ |
| DOM Snapshot (before) | `take_snapshot({ filePath: "<session>/evidence/before-snapshot.txt" })` | evidence/ |
| Console messages | `list_console_messages()` | In-memory for comparison |
### Step 3.3: Execute Reproduction Steps
For each reproduction step:
1. Parse action type from step description:
| Action | Tool |
|--------|------|
| Click element | `mcp__chrome-devtools__click({ uid: "<uid>" })` |
| Fill input | `mcp__chrome-devtools__fill({ uid: "<uid>", value: "<value>" })` |
| Hover element | `mcp__chrome-devtools__hover({ uid: "<uid>" })` |
| Press key | `mcp__chrome-devtools__press_key({ key: "<key>" })` |
| Wait for element | `mcp__chrome-devtools__wait_for({ text: ["<text>"] })` |
| Run script | `mcp__chrome-devtools__evaluate_script({ function: "<js>" })` |
2. After each step, take snapshot to track DOM changes if needed
3. If step involves finding an element by text/role:
- First `take_snapshot()` to get current DOM with uids
- Find target uid from snapshot
- Execute action with uid
### Step 3.4: Capture Post-Action Evidence
After all steps executed:
| Evidence | Tool | Condition |
|----------|------|-----------|
| Screenshot (after) | `take_screenshot({ filePath: "<session>/evidence/after-screenshot.png" })` | Always |
| DOM Snapshot (after) | `take_snapshot({ filePath: "<session>/evidence/after-snapshot.txt" })` | Always |
| Console Errors | `list_console_messages({ types: ["error", "warn"] })` | Always |
| All Console Logs | `list_console_messages()` | If console dimension |
| Network Requests | `list_network_requests()` | If network dimension |
| XHR/Fetch Requests | `list_network_requests({ resourceTypes: ["xhr", "fetch"] })` | If network dimension (failures are filtered from the result) |
| Request Details | `get_network_request({ reqid: <id> })` | For failed/suspicious requests |
| Performance Trace | `performance_start_trace()` + reproduce + `performance_stop_trace()` | If performance dimension |
### Step 3.5: Save Evidence Summary
Write `<session>/evidence/evidence-summary.json`:
```json
{
"reproduction_success": true,
"target_url": "<url>",
"steps_executed": ["step1", "step2"],
"evidence_collected": {
"screenshots": ["before-screenshot.png", "after-screenshot.png"],
"snapshots": ["before-snapshot.txt", "after-snapshot.txt"],
"console_errors": [{ "type": "error", "text": "..." }],
"network_failures": [{ "url": "...", "status": 500, "method": "GET" }],
"performance_trace": "trace.json"
},
"observations": ["Error X appeared after step 3", "Network request Y failed"]
}
```
## Phase 4: Report
1. Write evidence summary to session evidence/
2. Send state_update:
```json
{
"status": "task_complete",
"task_id": "REPRODUCE-001",
"ref": "<session>/evidence/evidence-summary.json",
"key_findings": ["Bug reproduced successfully", "3 console errors captured", "1 failed API request"],
"decisions": [],
"verification": "self-validated"
}
```
3. Report: reproduction result, evidence inventory, key observations
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Page fails to load | Retry once, then report navigation error |
| Element not found | Take snapshot, search alternative selectors, report if still not found |
| Bug not reproduced | Report with evidence of non-reproduction, suggest step refinement |
| Browser disconnected | Report error to coordinator |
| Timeout during wait | Capture current state, report partial reproduction |


@@ -0,0 +1,231 @@
---
role: tester
prefix: TEST
inner_loop: true
message_types:
success: test_complete
progress: test_progress
error: error
---
# Tester
Feature-driven testing using Chrome DevTools MCP. Proactively discover bugs from feature list.
## Identity
- Tag: [tester] | Prefix: TEST-*
- Responsibility: Parse feature list → generate test scenarios → execute in browser → report discovered issues
## Boundaries
### MUST
- Parse feature list into testable scenarios
- Navigate to each feature's page using Chrome DevTools MCP
- Execute test scenarios with user interaction simulation
- Capture evidence for each test (screenshot, console, network)
- Classify results: pass / fail / warning
- Report all discovered issues with evidence
### MUST NOT
- Modify source code or project files
- Skip features in the list
- Report pass without actually testing
- Make assumptions about expected behavior without evidence
## Phase 2: Parse Feature List + Plan Tests
1. Read upstream artifacts via team_msg(operation="get_state")
2. Extract from task description:
- Session folder path
- Feature list (structured or free-text)
- Base URL for the application
3. Parse each feature into test items:
```json
{
"features": [
{
"id": "F-001",
"name": "User login",
"url": "/login",
"scenarios": [
        { "name": "Successful login", "steps": ["Fill username", "Fill password", "Click login"], "expected": "Redirect to home page" },
        { "name": "Login with empty password", "steps": ["Fill username", "Click login"], "expected": "Show password-required hint" }
]
}
]
}
```
4. If feature descriptions lack detail, use page exploration to generate scenarios:
- Navigate to feature URL
- Take snapshot to discover interactive elements
- Generate scenarios from available UI elements (forms, buttons, links)
## Phase 3: Execute Tests
### Inner Loop: Process One Feature at a Time
For each feature in the list:
#### Step 3.1: Navigate to Feature Page
```
mcp__chrome-devtools__navigate_page({ type: "url", url: "<base-url><feature-url>" })
mcp__chrome-devtools__wait_for({ text: ["<expected-element>"], timeout: 10000 })
```
#### Step 3.2: Explore Page Structure
```
mcp__chrome-devtools__take_snapshot()
```
Parse snapshot to identify:
- Interactive elements (buttons, inputs, links, selects)
- Form fields and their labels
- Navigation elements
- Dynamic content areas
If no predefined scenarios, generate test scenarios from discovered elements.
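The element-to-scenario step above can be sketched as a small helper. The snapshot line format (`uid=12 button "Submit"`) is an illustrative assumption; the real snapshot layout may differ, so the regex would need adjusting.

```javascript
// Sketch: derive minimal test scenarios from a text DOM snapshot.
// Assumed line format: uid=<id> <role> "<accessible name>" (hypothetical).
function generateScenarios(snapshotText) {
  const scenarios = [];
  for (const line of snapshotText.split("\n")) {
    const m = line.match(/uid=(\S+)\s+(button|link|textbox)\s+"([^"]*)"/);
    if (!m) continue;
    const [, uid, role, label] = m;
    if (role === "button" || role === "link") {
      // Clickable element: check it can be activated without console errors
      scenarios.push({ name: `Click ${label}`, steps: [{ action: "click", uid }], expected: "no console errors" });
    } else if (role === "textbox") {
      // Input element: check it accepts a test value
      scenarios.push({ name: `Fill ${label}`, steps: [{ action: "fill", uid, value: "test" }], expected: "value accepted" });
    }
  }
  return scenarios;
}
```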
#### Step 3.3: Execute Each Scenario
For each scenario:
1. **Capture baseline**:
```
mcp__chrome-devtools__take_screenshot({ filePath: "<session>/evidence/F-<id>-<scenario>-before.png" })
mcp__chrome-devtools__list_console_messages() // baseline errors
```
2. **Execute steps**:
- Map step descriptions to MCP actions:
| Step Pattern | MCP Action |
|-------------|------------|
| 点击/click XX | `take_snapshot` → find uid → `click({ uid })` |
| 填写/输入/fill XX with YY | `take_snapshot` → find uid → `fill({ uid, value })` |
| 悬停/hover XX | `take_snapshot` → find uid → `hover({ uid })` |
| 等待/wait XX | `wait_for({ text: ["XX"] })` |
| 导航/navigate to XX | `navigate_page({ type: "url", url: "XX" })` |
| 按键/press XX | `press_key({ key: "XX" })` |
| 滚动/scroll | `evaluate_script({ function: "() => window.scrollBy(0, 500)" })` |
3. **Capture result**:
```
mcp__chrome-devtools__take_screenshot({ filePath: "<session>/evidence/F-<id>-<scenario>-after.png" })
mcp__chrome-devtools__list_console_messages({ types: ["error", "warn"] })
mcp__chrome-devtools__list_network_requests({ resourceTypes: ["xhr", "fetch"] })
```
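The step-pattern mapping above can be sketched as a parser. The regexes and the returned action shape are illustrative assumptions, not a fixed spec; element lookup by uid happens in a separate snapshot pass.

```javascript
// Sketch: map a natural-language step (Chinese or English) to an MCP action.
function parseStep(step) {
  const rules = [
    { re: /(?:点击|click)\s*(.+)/i, build: (m) => ({ action: "click", target: m[1] }) },
    { re: /(?:填写|输入|fill)\s*(.+?)\s+(?:with|为)\s*(.+)/i, build: (m) => ({ action: "fill", target: m[1], value: m[2] }) },
    { re: /(?:悬停|hover)\s*(.+)/i, build: (m) => ({ action: "hover", target: m[1] }) },
    { re: /(?:等待|wait)\s*(.+)/i, build: (m) => ({ action: "wait_for", text: m[1] }) },
    { re: /(?:按键|press)\s*(.+)/i, build: (m) => ({ action: "press_key", key: m[1] }) },
  ];
  for (const { re, build } of rules) {
    const m = step.match(re);
    if (m) return build(m);
  }
  return { action: "unknown", raw: step }; // fall through: needs manual mapping
}
```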
#### Step 3.4: Evaluate Scenario Result
| Check | Pass Condition | Fail Condition |
|-------|---------------|----------------|
| Console errors | No new errors after action | New Error/TypeError/ReferenceError |
| Network requests | All 2xx responses | Any 4xx/5xx response |
| Expected text | Expected text appears on page | Expected text not found |
| Visual state | Page renders without broken layout | Blank area, overflow, missing elements |
| Page responsive | Actions complete within timeout | Timeout or page freeze |
Classify result:
```
pass: All checks pass
fail: Console error OR network failure OR expected behavior not met
warning: Deprecation warnings OR slow response (>3s) OR minor visual issue
```
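The classification rules above can be written as a small helper. The input fields are assumptions about how evidence from the MCP calls is aggregated per scenario; the 3s threshold follows the rule above.

```javascript
// Sketch of the pass/fail/warning classification.
function classifyScenario({ newConsoleErrors, failedRequests, expectedTextFound,
                            deprecationWarnings, responseMs }) {
  // fail: console error OR network failure OR expected behavior not met
  if (newConsoleErrors > 0 || failedRequests > 0 || !expectedTextFound) return "fail";
  // warning: deprecation warnings OR slow response (>3s)
  if (deprecationWarnings > 0 || responseMs > 3000) return "warning";
  return "pass";
}
```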
#### Step 3.5: Report Progress (Inner Loop)
After each feature, send progress via state_update:
```json
{
"status": "in_progress",
"task_id": "TEST-001",
"progress": "3/5 features tested",
"issues_found": 2
}
```
## Phase 4: Test Report
Write `<session>/artifacts/TEST-001-report.md`:
```markdown
# Test Report
## Summary
- **Features tested**: N
- **Passed**: X
- **Failed**: Y
- **Warnings**: Z
- **Test date**: <timestamp>
- **Base URL**: <url>
## Results by Feature
### F-001: <feature-name> — PASS/FAIL/WARNING
**Scenarios:**
| # | Scenario | Result | Issue |
|---|----------|--------|-------|
| 1 | <scenario-name> | PASS | — |
| 2 | <scenario-name> | FAIL | Console TypeError at step 3 |
**Evidence:**
- Screenshot (before): evidence/F-001-scenario1-before.png
- Screenshot (after): evidence/F-001-scenario1-after.png
- Console errors: [list]
- Network failures: [list]
### F-002: ...
## Discovered Issues
| ID | Feature | Severity | Description | Evidence |
|----|---------|----------|-------------|----------|
| BUG-001 | F-001 | High | TypeError on login submit | Console error + screenshot |
| BUG-002 | F-003 | Medium | API returns 500 on save | Network log |
| BUG-003 | F-005 | Low | Deprecation warning in console | Console warning |
```
Write `<session>/artifacts/TEST-001-issues.json`:
```json
{
"issues": [
{
"id": "BUG-001",
"feature": "F-001",
      "feature_name": "User login",
"severity": "high",
      "description": "Console reports a TypeError after clicking the login button",
"category": "javascript_error",
"evidence": {
"console_errors": ["TypeError: Cannot read property 'token' of undefined"],
"screenshot": "evidence/F-001-login-after.png",
"network_failures": []
},
      "reproduction_steps": ["Navigate to /login", "Fill username: admin", "Fill password: test", "Click the login button"]
}
]
}
```
Send state_update:
```json
{
"status": "task_complete",
"task_id": "TEST-001",
"ref": "<session>/artifacts/TEST-001-report.md",
"key_findings": ["Tested N features", "Found X issues (Y high, Z medium)"],
"decisions": [],
"verification": "tested",
"issues_ref": "<session>/artifacts/TEST-001-issues.json"
}
```
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Feature URL not accessible | Log as failed, continue to next feature |
| Element not found for action | Take snapshot, search alternatives, skip scenario if not found |
| Page crash during test | Capture console, reload, continue next scenario |
| All features pass | Report success, no downstream ANALYZE needed |
| Timeout during interaction | Capture current state, mark as warning, continue |


@@ -0,0 +1,172 @@
---
role: verifier
prefix: VERIFY
inner_loop: false
message_types:
success: verification_result
error: error
---
# Verifier
Fix verification using Chrome DevTools MCP to confirm bug resolution.
## Identity
- Tag: [verifier] | Prefix: VERIFY-*
- Responsibility: Re-execute reproduction steps after fix, verify bug is resolved
## Boundaries
### MUST
- Execute EXACT same reproduction steps as Reproducer
- Capture same evidence types for comparison
- Compare before/after evidence objectively
- Report clear pass/fail verdict
### MUST NOT
- Modify source code or project files
- Skip any reproduction step
- Report pass without evidence comparison
- Make subjective judgments without evidence
## Phase 2: Load Context
1. Read upstream artifacts via team_msg(operation="get_state")
2. Load from multiple upstream roles:
- Reproducer: evidence-summary.json (original evidence + steps)
- Fixer: FIX-001-changes.md (what was changed)
3. Extract:
- Target URL
- Reproduction steps (exact same sequence)
- Original evidence for comparison
- Expected behavior (from bug report)
- Files modified by fixer
## Phase 3: Execute Verification
### Step 3.1: Pre-Verification Check
Verify the fix was applied before re-testing:
- Check that the modified files exist and contain the expected changes
- If testing against a dev server, ensure it has picked up the changes (hot reload completed or server restarted)
### Step 3.2: Navigate to Target
Navigate to the same URL as the Reproducer (the reproduction steps themselves are executed in Step 3.4):
```
mcp__chrome-devtools__navigate_page({ type: "url", url: "<target-url>" })
mcp__chrome-devtools__wait_for({ text: ["<expected-element>"], timeout: 10000 })
```
### Step 3.3: Capture Post-Fix Evidence
Capture same evidence types as original reproduction:
| Evidence | Tool | Save To |
|----------|------|---------|
| Screenshot | `take_screenshot({ filePath: "<session>/evidence/verify-screenshot.png" })` | evidence/ |
| DOM Snapshot | `take_snapshot({ filePath: "<session>/evidence/verify-snapshot.txt" })` | evidence/ |
| Console Messages | `list_console_messages({ types: ["error", "warn"] })` | In-memory |
| Network Requests | `list_network_requests({ resourceTypes: ["xhr", "fetch"] })` | In-memory |
### Step 3.4: Execute Reproduction Steps
For each step from original reproduction:
1. Execute same action (click, fill, hover, etc.)
2. Observe result
3. Note any differences from original reproduction
### Step 3.5: Capture Final State
After all steps:
- Screenshot of final state
- Console messages (check for new errors)
- Network requests (check for new failures)
## Phase 4: Compare and Report
### Comparison Criteria
| Dimension | Pass | Fail |
|-----------|------|------|
| Console Errors | Original error no longer appears | Original error still present |
| Network | Failed request now succeeds | Request still fails |
| Visual | Expected rendering achieved | Bug still visible |
| DOM | Expected structure present | Structure still wrong |
| New Errors | No new errors introduced | New errors detected |
### Verdict Logic
```
if original_error_resolved AND no_new_errors:
verdict = "pass"
elif original_error_resolved AND has_new_errors:
verdict = "pass_with_warnings" # bug fixed but new issues
else:
verdict = "fail"
```
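The verdict pseudocode above, written out as a runnable helper. Error lists are assumed to be normalized console-error message strings captured before and after the fix.

```javascript
// Sketch: compute the verification verdict from before/after console errors.
function computeVerdict(originalErrors, currentErrors) {
  // Original bug is resolved if none of its errors reappear
  const resolved = originalErrors.every((e) => !currentErrors.includes(e));
  // Any error not present before the fix counts as a regression
  const newErrors = currentErrors.filter((e) => !originalErrors.includes(e));
  if (resolved && newErrors.length === 0) return "pass";
  if (resolved) return "pass_with_warnings"; // bug fixed but new issues appeared
  return "fail";
}
```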
### Write Verification Report
Write `<session>/artifacts/VERIFY-001-report.md`:
```markdown
# Verification Report
## Verdict: <PASS / PASS_WITH_WARNINGS / FAIL>
## Bug Status
- **Original bug**: <resolved / still present>
- **Reproduction steps**: <all executed / partial>
## Evidence Comparison
### Console Errors
- **Before fix**: <N errors>
- <error 1>
- <error 2>
- **After fix**: <N errors>
- <error 1 if any>
- **Resolution**: <original errors cleared / still present>
### Network Requests
- **Before fix**: <N failed requests>
- **After fix**: <N failed requests>
- **Resolution**: <requests now succeed / still failing>
### Visual Comparison
- **Before fix**: <description or screenshot ref>
- **After fix**: <description or screenshot ref>
- **Resolution**: <visual bug fixed / still present>
## Regression Check
- **New console errors**: <none / list>
- **New network failures**: <none / list>
- **Visual regressions**: <none / description>
## Files Verified
- <file1.ts> — changes confirmed applied
- <file2.tsx> — changes confirmed applied
```
Send state_update:
```json
{
"status": "task_complete",
"task_id": "VERIFY-001",
"ref": "<session>/artifacts/VERIFY-001-report.md",
"key_findings": ["Verdict: <PASS/FAIL>", "Original bug: <resolved/present>"],
"decisions": [],
"verification": "tested",
"verdict": "<pass|pass_with_warnings|fail>"
}
```
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Page fails to load | Retry once, report if still fails |
| Fix not applied | Report to coordinator, suggest re-fix |
| New errors detected | Report pass_with_warnings with details |
| Bug still present | Report fail with evidence comparison |
| Partial reproduction | Report with completed steps, note gaps |


@@ -1,198 +0,0 @@
# Team Frontend Debug -- CSV Schema
## Master CSV: tasks.csv
### Column Definitions
#### Input Columns (Set by Decomposer)
| Column | Type | Required | Description | Example |
|--------|------|----------|-------------|---------|
| `id` | string | Yes | Unique task identifier (PREFIX-NNN) | `"TEST-001"` |
| `title` | string | Yes | Short task title | `"Feature testing"` |
| `description` | string | Yes | Detailed task description (self-contained) with PURPOSE/TASK/CONTEXT/EXPECTED/CONSTRAINTS | `"PURPOSE: Test all features from list..."` |
| `role` | enum | Yes | Worker role: `tester`, `reproducer`, `analyzer`, `fixer`, `verifier` | `"tester"` |
| `pipeline_mode` | enum | Yes | Pipeline mode: `test-pipeline` or `debug-pipeline` | `"test-pipeline"` |
| `base_url` | string | No | Target URL for browser-based tasks | `"http://localhost:3000"` |
| `evidence_dimensions` | string | No | Semicolon-separated evidence types to collect | `"screenshot;console;network"` |
| `deps` | string | No | Semicolon-separated dependency task IDs | `"TEST-001"` |
| `context_from` | string | No | Semicolon-separated task IDs for context | `"TEST-001"` |
| `exec_mode` | enum | Yes | Execution mechanism: `csv-wave` or `interactive` | `"csv-wave"` |
#### Computed Columns (Set by Wave Engine)
| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `wave` | integer | Wave number (1-based, from topological sort) | `2` |
| `prev_context` | string | Aggregated findings from context_from tasks (per-wave CSV only) | `"[TEST-001] Found 3 issues: 2 high, 1 medium..."` |
#### Output Columns (Set by Agent)
| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `status` | enum | `pending` -> `completed` / `failed` / `skipped` | `"completed"` |
| `findings` | string | Key discoveries (max 500 chars) | `"Tested 5 features: 3 pass, 2 fail. BUG-001: TypeError on login. BUG-002: API 500 on save."` |
| `artifacts_produced` | string | Semicolon-separated paths of produced artifacts | `"artifacts/TEST-001-report.md;artifacts/TEST-001-issues.json"` |
| `issues_count` | string | Number of issues found (tester/analyzer only, empty for others) | `"2"` |
| `verdict` | string | Verification verdict: `pass`, `pass_with_warnings`, `fail` (verifier only) | `"pass"` |
| `error` | string | Error message if failed | `""` |
---
### exec_mode Values
| Value | Mechanism | Description |
|-------|-----------|-------------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot batch execution within wave |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round individual execution |
Interactive tasks appear in master CSV for dependency tracking but are NOT included in wave-{N}.csv files.
---
### Role Prefixes
| Role | Prefix | Pipeline | Inner Loop |
|------|--------|----------|------------|
| tester | TEST | test-pipeline | Yes (iterates over features) |
| reproducer | REPRODUCE | debug-pipeline | No |
| analyzer | ANALYZE | both | No |
| fixer | FIX | both | Yes (may need multiple fix passes) |
| verifier | VERIFY | both | No |
---
### Example Data (Test Pipeline)
```csv
id,title,description,role,pipeline_mode,base_url,evidence_dimensions,deps,context_from,exec_mode,wave,status,findings,artifacts_produced,issues_count,verdict,error
"TEST-001","Feature testing","PURPOSE: Test all features from feature list and discover issues | Success: All features tested with pass/fail results\nTASK:\n- Parse feature list\n- Navigate to each feature URL using Chrome DevTools\n- Execute test scenarios (click, fill, hover)\n- Capture evidence: screenshots, console logs, network requests\n- Classify results: pass/fail/warning\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-login-test-20260308\n- Base URL: http://localhost:3000\n- Features: Login, Dashboard, Profile\nEXPECTED: artifacts/TEST-001-report.md + artifacts/TEST-001-issues.json\nCONSTRAINTS: Chrome DevTools MCP only | No code modifications","tester","test-pipeline","http://localhost:3000","screenshot;console;network","","","csv-wave","1","pending","","","","",""
"ANALYZE-001","Root cause analysis","PURPOSE: Analyze discovered issues to identify root causes | Success: RCA for each high/medium issue\nTASK:\n- Load test report and issues list\n- Analyze console errors, network failures, DOM anomalies\n- Map to source code locations\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-login-test-20260308\n- Upstream: artifacts/TEST-001-issues.json\nEXPECTED: artifacts/ANALYZE-001-rca.md","analyzer","test-pipeline","","console;network","TEST-001","TEST-001","csv-wave","2","pending","","","","",""
"FIX-001","Fix all issues","PURPOSE: Fix identified issues | Success: All high/medium issues resolved\nTASK:\n- Load RCA report\n- Locate and fix each root cause\n- Run syntax/type checks\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-login-test-20260308\n- Upstream: artifacts/ANALYZE-001-rca.md\nEXPECTED: Modified source files + artifacts/FIX-001-changes.md","fixer","test-pipeline","","","ANALYZE-001","ANALYZE-001","csv-wave","3","pending","","","","",""
"VERIFY-001","Verify fixes","PURPOSE: Re-test failed scenarios to verify fixes | Success: Previously failed scenarios now pass\nTASK:\n- Re-execute failed test scenarios\n- Capture evidence and compare\n- Report pass/fail per scenario\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-login-test-20260308\n- Original: artifacts/TEST-001-report.md\n- Fix: artifacts/FIX-001-changes.md\nEXPECTED: artifacts/VERIFY-001-report.md","verifier","test-pipeline","http://localhost:3000","screenshot;console;network","FIX-001","FIX-001;TEST-001","csv-wave","4","pending","","","","",""
```
### Example Data (Debug Pipeline)
```csv
id,title,description,role,pipeline_mode,base_url,evidence_dimensions,deps,context_from,exec_mode,wave,status,findings,artifacts_produced,issues_count,verdict,error
"REPRODUCE-001","Bug reproduction","PURPOSE: Reproduce bug and collect evidence | Success: Bug reproduced with artifacts\nTASK:\n- Navigate to target URL\n- Execute reproduction steps\n- Capture screenshots, snapshots, console logs, network\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-save-crash-20260308\n- Bug URL: http://localhost:3000/settings\n- Steps: 1. Click save 2. Observe white screen\nEXPECTED: evidence/ directory with all captures","reproducer","debug-pipeline","http://localhost:3000/settings","screenshot;console;network;snapshot","","","csv-wave","1","pending","","","","",""
"ANALYZE-001","Root cause analysis","PURPOSE: Analyze evidence to find root cause | Success: RCA with file:line location\nTASK:\n- Load evidence from reproducer\n- Analyze console errors and stack traces\n- Map to source code\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-save-crash-20260308\n- Upstream: evidence/\nEXPECTED: artifacts/ANALYZE-001-rca.md","analyzer","debug-pipeline","","","REPRODUCE-001","REPRODUCE-001","csv-wave","2","pending","","","","",""
"FIX-001","Code fix","PURPOSE: Fix the identified bug | Success: Root cause resolved\nTASK:\n- Load RCA report\n- Implement fix\n- Validate syntax\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-save-crash-20260308\n- Upstream: artifacts/ANALYZE-001-rca.md\nEXPECTED: Modified files + artifacts/FIX-001-changes.md","fixer","debug-pipeline","","","ANALYZE-001","ANALYZE-001","csv-wave","3","pending","","","","",""
"VERIFY-001","Fix verification","PURPOSE: Verify bug is fixed | Success: Original bug no longer reproduces\nTASK:\n- Same reproduction steps as REPRODUCE-001\n- Capture evidence and compare\n- Confirm resolution\nCONTEXT:\n- Session: .workflow/.csv-wave/tfd-save-crash-20260308\n- Original: evidence/\n- Fix: artifacts/FIX-001-changes.md\nEXPECTED: artifacts/VERIFY-001-report.md","verifier","debug-pipeline","http://localhost:3000/settings","screenshot;console;network;snapshot","FIX-001","FIX-001;REPRODUCE-001","csv-wave","4","pending","","","","",""
```
---
### Column Lifecycle
```
Decomposer (Phase 1) Wave Engine (Phase 2) Agent (Execution)
--------------------- -------------------- -----------------
id ----------> id ----------> id
title ----------> title ----------> (reads)
description ----------> description ----------> (reads)
role ----------> role ----------> (reads)
pipeline_mode ---------> pipeline_mode ---------> (reads)
base_url ----------> base_url ----------> (reads)
evidence_dimensions ---> evidence_dimensions ---> (reads)
deps ----------> deps ----------> (reads)
context_from ---------> context_from ---------> (reads)
exec_mode ----------> exec_mode ----------> (reads)
wave ----------> (reads)
prev_context ----------> (reads)
status
findings
artifacts_produced
issues_count
verdict
error
```
---
## Output Schema (JSON)
Agent output via `report_agent_job_result` (csv-wave tasks):
Tester output:
```json
{
"id": "TEST-001",
"status": "completed",
"findings": "Tested 5 features: 3 pass, 2 fail. BUG-001: TypeError on login submit. BUG-002: API 500 on profile save.",
"artifacts_produced": "artifacts/TEST-001-report.md;artifacts/TEST-001-issues.json",
"issues_count": "2",
"verdict": "",
"error": ""
}
```
Verifier output:
```json
{
"id": "VERIFY-001",
"status": "completed",
"findings": "Original bug resolved. Login error no longer appears. No new console errors. No new network failures.",
"artifacts_produced": "artifacts/VERIFY-001-report.md",
"issues_count": "",
"verdict": "pass",
"error": ""
}
```
Interactive tasks output via structured text or JSON written to `interactive/{id}-result.json`.
---
## Discovery Types
| Type | Dedup Key | Data Schema | Description |
|------|-----------|-------------|-------------|
| `feature_tested` | `data.feature` | `{feature, name, result, issues}` | Feature test result |
| `bug_reproduced` | `data.url` | `{url, steps, console_errors, network_failures}` | Bug reproduction result |
| `evidence_collected` | `data.dimension+data.file` | `{dimension, file, description}` | Evidence artifact saved |
| `root_cause_found` | `data.file+data.line` | `{category, file, line, confidence}` | Root cause identified |
| `file_modified` | `data.file` | `{file, change, lines_added}` | Code fix applied |
| `verification_result` | `data.verdict` | `{verdict, original_error_resolved, new_errors}` | Fix verification |
| `issue_found` | `data.file+data.line` | `{file, line, severity, description}` | Issue discovered |
### Discovery NDJSON Format
```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"TEST-001","type":"feature_tested","data":{"feature":"F-001","name":"Login","result":"fail","issues":1}}
{"ts":"2026-03-08T10:05:00Z","worker":"REPRODUCE-001","type":"bug_reproduced","data":{"url":"/settings","steps":3,"console_errors":2,"network_failures":0}}
{"ts":"2026-03-08T10:10:00Z","worker":"ANALYZE-001","type":"root_cause_found","data":{"category":"TypeError","file":"src/components/Settings.tsx","line":142,"confidence":"high"}}
{"ts":"2026-03-08T10:15:00Z","worker":"FIX-001","type":"file_modified","data":{"file":"src/components/Settings.tsx","change":"Added null check for user object","lines_added":3}}
```
> Both csv-wave and interactive agents read/write the same discoveries.ndjson file.
---
## Cross-Mechanism Context Flow
| Source | Target | Mechanism |
|--------|--------|-----------|
| CSV task findings | Interactive task | Injected via spawn message or send_input |
| Interactive task result | CSV task prev_context | Read from interactive/{id}-result.json |
| Any agent discovery | Any agent | Shared via discoveries.ndjson |
---
## Validation Rules
| Rule | Check | Error |
|------|-------|-------|
| Unique IDs | No duplicate `id` values | "Duplicate task ID: {id}" |
| Valid deps | All dep IDs exist in tasks | "Unknown dependency: {dep_id}" |
| No self-deps | Task cannot depend on itself | "Self-dependency: {id}" |
| No circular deps | Topological sort completes | "Circular dependency detected involving: {ids}" |
| context_from valid | All context IDs exist and in earlier waves | "Invalid context_from: {id}" |
| exec_mode valid | Value is `csv-wave` or `interactive` | "Invalid exec_mode: {value}" |
| Description non-empty | Every task has description | "Empty description for task: {id}" |
| Status enum | status in {pending, completed, failed, skipped} | "Invalid status: {status}" |
| Role valid | role in {tester, reproducer, analyzer, fixer, verifier} | "Invalid role: {role}" |
| Pipeline mode valid | pipeline_mode in {test-pipeline, debug-pipeline} | "Invalid pipeline_mode: {mode}" |
| Verdict valid | verdict in {pass, pass_with_warnings, fail, ""} | "Invalid verdict: {verdict}" |
| Base URL for browser tasks | tester/reproducer/verifier have non-empty base_url | "Missing base_url for browser task: {id}" |


@@ -0,0 +1,215 @@
# Chrome DevTools MCP Usage Patterns
Reference for debug tool usage across all roles. Reproducer and Verifier are primary consumers.
## 1. Navigation & Page Control
### Navigate to URL
```
mcp__chrome-devtools__navigate_page({ type: "url", url: "http://localhost:3000/page" })
```
### Wait for Page Load
```
mcp__chrome-devtools__wait_for({ text: ["Expected Text"], timeout: 10000 })
```
### Reload Page
```
mcp__chrome-devtools__navigate_page({ type: "reload" })
```
### List Open Pages
```
mcp__chrome-devtools__list_pages()
```
### Select Page
```
mcp__chrome-devtools__select_page({ pageId: 0 })
```
## 2. User Interaction Simulation
### Click Element
```
// First take snapshot to find uid
mcp__chrome-devtools__take_snapshot()
// Then click by uid
mcp__chrome-devtools__click({ uid: "<uid-from-snapshot>" })
```
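The "find uid from snapshot" step can be sketched as a text search. The snapshot line format assumed here (`uid=3 button "Save"`) is hypothetical; adapt the regex to the actual snapshot output.

```javascript
// Sketch: find an element uid in a text snapshot by accessible name.
function findUid(snapshotText, name) {
  for (const line of snapshotText.split("\n")) {
    const m = line.match(/uid=(\S+).*"([^"]*)"/); // uid, then quoted accessible name
    if (m && m[2].includes(name)) return m[1];
  }
  return null; // not found: re-snapshot or try an alternative selector
}
```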
### Fill Input
```
mcp__chrome-devtools__fill({ uid: "<uid>", value: "input text" })
```
### Fill Multiple Fields
```
mcp__chrome-devtools__fill_form({
elements: [
{ uid: "<uid1>", value: "value1" },
{ uid: "<uid2>", value: "value2" }
]
})
```
### Hover Element
```
mcp__chrome-devtools__hover({ uid: "<uid>" })
```
### Press Key
```
mcp__chrome-devtools__press_key({ key: "Enter" })
mcp__chrome-devtools__press_key({ key: "Control+A" })
```
### Type Text
```
mcp__chrome-devtools__type_text({ text: "typed content", submitKey: "Enter" })
```
## 3. Evidence Collection
### Screenshot
```
// Full viewport
mcp__chrome-devtools__take_screenshot({ filePath: "<session>/evidence/screenshot.png" })
// Full page
mcp__chrome-devtools__take_screenshot({ filePath: "<path>", fullPage: true })
// Specific element
mcp__chrome-devtools__take_screenshot({ uid: "<uid>", filePath: "<path>" })
```
### DOM/A11y Snapshot
```
// Standard snapshot
mcp__chrome-devtools__take_snapshot()
// Verbose (all a11y info)
mcp__chrome-devtools__take_snapshot({ verbose: true })
// Save to file
mcp__chrome-devtools__take_snapshot({ filePath: "<session>/evidence/snapshot.txt" })
```
### Console Messages
```
// All messages
mcp__chrome-devtools__list_console_messages()
// Errors and warnings only
mcp__chrome-devtools__list_console_messages({ types: ["error", "warn"] })
// Get specific message detail
mcp__chrome-devtools__get_console_message({ msgid: 5 })
```
### Network Requests
```
// All requests
mcp__chrome-devtools__list_network_requests()
// XHR/Fetch only (API calls)
mcp__chrome-devtools__list_network_requests({ resourceTypes: ["xhr", "fetch"] })
// Get request detail (headers, body, response)
mcp__chrome-devtools__get_network_request({ reqid: 3 })
// Save response to file
mcp__chrome-devtools__get_network_request({ reqid: 3, responseFilePath: "<path>" })
```
### Performance Trace
```
// Start trace (auto-reload and auto-stop)
mcp__chrome-devtools__performance_start_trace({ reload: true, autoStop: true })
// Start manual trace
mcp__chrome-devtools__performance_start_trace({ reload: false, autoStop: false })
// Stop and save
mcp__chrome-devtools__performance_stop_trace({ filePath: "<session>/evidence/trace.json" })
```
## 4. Script Execution
### Evaluate JavaScript
```
// Get page title
mcp__chrome-devtools__evaluate_script({ function: "() => document.title" })
// Get element state
mcp__chrome-devtools__evaluate_script({
function: "(el) => ({ text: el.innerText, classes: el.className })",
args: ["<uid>"]
})
// Check React state (if applicable)
mcp__chrome-devtools__evaluate_script({
function: "() => { const fiber = document.querySelector('#root')._reactRootContainer; return fiber ? 'React detected' : 'No React'; }"
})
// Get computed styles
mcp__chrome-devtools__evaluate_script({
function: "(el) => JSON.stringify(window.getComputedStyle(el))",
args: ["<uid>"]
})
```
## 5. Common Debug Patterns
### Pattern: Reproduce Click Bug
```
1. navigate_page → target URL
2. wait_for → page loaded
3. take_snapshot → find target element uid
4. take_screenshot → before state
5. list_console_messages → baseline errors
6. click → target element
7. wait_for → expected result (or timeout)
8. take_screenshot → after state
9. list_console_messages → new errors
10. list_network_requests → triggered requests
```
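The steps above map to a concrete call sequence like the following. The URL, `uid` values, and expected text are placeholders, and exact parameter names may differ slightly across tool versions:

```
mcp__chrome-devtools__navigate_page({ url: "https://app.example.com/page" })
mcp__chrome-devtools__wait_for({ text: "Dashboard" })
mcp__chrome-devtools__take_snapshot()                                      // note uid of target element
mcp__chrome-devtools__take_screenshot({ filePath: "<session>/evidence/before.png" })
mcp__chrome-devtools__list_console_messages({ types: ["error", "warn"] })  // baseline
mcp__chrome-devtools__click({ uid: "<uid>" })
mcp__chrome-devtools__wait_for({ text: "<expected result>" })
mcp__chrome-devtools__take_screenshot({ filePath: "<session>/evidence/after.png" })
mcp__chrome-devtools__list_console_messages({ types: ["error", "warn"] })  // diff vs baseline
mcp__chrome-devtools__list_network_requests({ resourceTypes: ["xhr", "fetch"] })
```

Comparing the before/after screenshots and the baseline vs. post-click console output is what turns the sequence into evidence.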
### Pattern: Debug API Error
```
1. navigate_page → target URL
2. wait_for → page loaded
3. take_snapshot → find trigger element
4. click/fill → trigger API call
5. list_network_requests → find the API request
6. get_network_request → inspect headers, body, response
7. list_console_messages → check for error handling
```
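A sketch of the same flow as concrete calls (the form URL, element uids, and fill value are illustrative):

```
mcp__chrome-devtools__navigate_page({ url: "https://app.example.com/form" })
mcp__chrome-devtools__wait_for({ text: "Submit" })
mcp__chrome-devtools__take_snapshot()                                    // find input + submit uids
mcp__chrome-devtools__fill({ uid: "<input-uid>", value: "test@example.com" })
mcp__chrome-devtools__click({ uid: "<submit-uid>" })
mcp__chrome-devtools__list_network_requests({ resourceTypes: ["xhr", "fetch"] })
mcp__chrome-devtools__get_network_request({ reqid: <reqid> })            // status, headers, body
mcp__chrome-devtools__list_console_messages({ types: ["error"] })        // unhandled rejections?
```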
### Pattern: Debug Performance Issue
```
1. navigate_page → target URL (set URL first)
2. performance_start_trace → start recording with reload
3. (auto-stop after page loads)
4. Read trace results → identify long tasks, bottlenecks
```
### Pattern: Debug Visual/CSS Issue
```
1. navigate_page → target URL
2. take_screenshot → capture current visual state
3. take_snapshot({ verbose: true }) → full a11y tree with styles
4. evaluate_script → get computed styles of problematic element
5. Compare expected vs actual styles
```
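Step 5 (comparing expected vs. actual) can be done in plain JavaScript once the computed styles have been pulled out with `evaluate_script`. A minimal sketch — the function name and the expected-style shape are illustrative, not part of the MCP API:

```javascript
// Compare a map of expected CSS properties against actual computed styles.
// Returns only the properties that differ, for inclusion in the evidence report.
function diffStyles(expected, actual) {
  const mismatches = {};
  for (const [prop, want] of Object.entries(expected)) {
    const got = actual[prop];
    if (got !== want) {
      mismatches[prop] = { expected: want, actual: got };
    }
  }
  return mismatches;
}

// Example: a button renders with the wrong text color
const expected = { color: "rgb(255, 255, 255)", "padding-left": "16px" };
const actual   = { color: "rgb(0, 0, 0)",       "padding-left": "16px", display: "flex" };
console.log(diffStyles(expected, actual));
// → { color: { expected: "rgb(255, 255, 255)", actual: "rgb(0, 0, 0)" } }
```

Properties present in `actual` but not listed in `expected` (like `display` above) are ignored, so the report stays focused on the styles the scenario actually asserts.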
## 6. Error Handling
| Error | Meaning | Resolution |
|-------|---------|------------|
| "No page selected" | No browser tab active | list_pages → select_page |
| "Element not found" | uid is stale | take_snapshot → get new uid |
| "Navigation timeout" | Page didn't load | Check URL, retry with longer timeout |
| "Evaluation failed" | JS error in script | Check script syntax, page context |
| "No trace recording" | stop_trace without start | Ensure start_trace was called first |

@@ -0,0 +1,94 @@
# Pipeline Definitions
## 1. Pipeline Selection Criteria
| Keywords | Pipeline |
|----------|----------|
| 功能, feature, 清单, list, 测试, test, 完成, done, 验收 | `test-pipeline` |
| bug, 错误, 报错, crash, 问题, 不工作, 白屏, 异常 | `debug-pipeline` |
| performance, 性能, slow, 慢, latency, memory | `debug-pipeline` (perf dimension) |
| Ambiguous / unclear | request_user_input |
## 2. Test Pipeline (Feature-List Driven)
**4 tasks, linear with conditional skip**
```
TEST-001 → [issues found?] → ANALYZE-001 → FIX-001 → VERIFY-001
    |
    └─ no issues → Pipeline Complete (skip ANALYZE/FIX/VERIFY)
```
| Task | Role | Description | Conditional |
|------|------|-------------|-------------|
| TEST-001 | tester | Test all features, discover issues | Always |
| ANALYZE-001 | analyzer | Analyze discovered issues, produce RCA | Skip if 0 issues |
| FIX-001 | fixer | Fix all identified root causes | Skip if 0 issues |
| VERIFY-001 | verifier | Re-test failed scenarios to verify fixes | Skip if 0 issues |
### Conditional Skip Logic
After TEST-001 completes, coordinator reads `TEST-001-issues.json`:
- `issues.length === 0` → All pass. Skip downstream tasks, report success.
- `issues.filter(i => i.severity !== "low").length === 0` → Only warnings. request_user_input: fix or complete.
- `issues.filter(i => i.severity === "high" || i.severity === "medium").length > 0` → Proceed with ANALYZE → FIX → VERIFY.
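The three branches above can be sketched as a small decision function the coordinator runs over the parsed issues array. The return labels are illustrative; the severity values assume the `low`/`medium`/`high` scale used throughout this file:

```javascript
// Decide the next pipeline step from TEST-001's issue list.
// Returns "complete", "ask_user", or "proceed".
function decideAfterTest(issues) {
  if (issues.length === 0) return "complete";      // all pass: skip downstream tasks
  const nonLow = issues.filter(i => i.severity !== "low");
  if (nonLow.length === 0) return "ask_user";      // warnings only: request_user_input
  return "proceed";                                // high/medium: ANALYZE → FIX → VERIFY
}

console.log(decideAfterTest([]));                     // "complete"
console.log(decideAfterTest([{ severity: "low" }]));  // "ask_user"
console.log(decideAfterTest([{ severity: "high" }])); // "proceed"
```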
### Re-Fix Iteration
If VERIFY-001 reports failures:
- Create FIX-002 (blockedBy: VERIFY-001) → VERIFY-002 (blockedBy: FIX-002)
- Max 3 fix iterations
## 3. Debug Pipeline (Bug-Report Driven)
**4 tasks, linear with iteration support**
```
REPRODUCE-001 → ANALYZE-001 → FIX-001 → VERIFY-001
      ↑                                     |
      |               (if fail)             |
      +------- REPRODUCE-002 ←--------------+
```
| Task | Role | Description |
|------|------|-------------|
| REPRODUCE-001 | reproducer | Reproduce bug, collect evidence |
| ANALYZE-001 | analyzer | Analyze evidence, produce RCA report |
| FIX-001 | fixer | Implement code fix based on RCA |
| VERIFY-001 | verifier | Verify fix with same reproduction steps |
### Iteration Rules
- **Analyzer → Reproducer**: If Analyzer confidence < 50%, creates REPRODUCE-002 → ANALYZE-002
- **Verifier → Fixer**: If Verifier verdict = fail, creates FIX-002 → VERIFY-002
### Maximum Iterations
- Max reproduction iterations: 2
- Max fix iterations: 3
- After max iterations: report to user for manual intervention
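A guard like the following keeps the coordinator from looping past the limits. The constants mirror the rules above; the function name and task-id formatting are assumptions based on the registry in the next section:

```javascript
const MAX_REPRODUCE_ITERATIONS = 2;
const MAX_FIX_ITERATIONS = 3;

// Returns the next task id (e.g. "FIX-002"), or null when the cap is reached
// and the pipeline should escalate via request_user_input.
function nextIteration(role, currentIteration) {
  const max = role === "reproducer" ? MAX_REPRODUCE_ITERATIONS : MAX_FIX_ITERATIONS;
  if (currentIteration >= max) return null; // manual intervention required
  const prefix = role === "reproducer" ? "REPRODUCE" : "FIX";
  return `${prefix}-${String(currentIteration + 1).padStart(3, "0")}`;
}

console.log(nextIteration("fixer", 1));      // "FIX-002"
console.log(nextIteration("reproducer", 2)); // null → report to user
```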
## 4. Task Metadata Registry
| Task ID | Role | Pipeline | Depends On | Priority |
|---------|------|----------|------------|----------|
| TEST-001 | tester | test | - | P0 |
| REPRODUCE-001 | reproducer | debug | - | P0 |
| ANALYZE-001 | analyzer | both | TEST-001 or REPRODUCE-001 | P0 |
| FIX-001 | fixer | both | ANALYZE-001 | P0 |
| VERIFY-001 | verifier | both | FIX-001 | P0 |
| REPRODUCE-002 | reproducer | debug | (dynamic) | P0 |
| ANALYZE-002 | analyzer | debug | REPRODUCE-002 | P0 |
| FIX-002 | fixer | both | VERIFY-001 | P0 |
| VERIFY-002 | verifier | both | FIX-002 | P0 |
## 5. Evidence Types Registry
| Dimension | Evidence | MCP Tool | Collector Roles |
|-----------|----------|----------|----------------|
| Visual | Screenshots | take_screenshot | tester, reproducer, verifier |
| DOM | A11y snapshots | take_snapshot | tester, reproducer, verifier |
| Console | Error/warn messages | list_console_messages | tester, reproducer, verifier |
| Network | API requests/responses | list/get_network_request | tester, reproducer, verifier |
| Performance | Trace recording | performance_start/stop_trace | reproducer, verifier |
| Interaction | User actions | click/fill/hover | tester, reproducer, verifier |