feat: enhance wave pipeline skills with rich task fields and cross-file consistency

- Add 7 new CSV columns (test, acceptance_criteria, scope, hints,
  execution_directives, tests_passed, acceptance_met) to tasks.csv
  schema across all 3 pipeline skills
- Create .codex/skills/wave-plan-pipeline as Codex version of
  workflow-wave-plan with spawn_agents_on_csv calling conventions
- Align instruction templates with MANDATORY FIRST STEPS and 11-step
  execution protocol across all files
- Standardize context.md reports with Waves metric and Dependencies row
- Unify Discovery Board protocol with Dedup Key table and test_command
- Add Best Practices and Usage Recommendations to workflow-wave-plan
This commit is contained in:
catlog22
2026-02-28 16:02:05 +08:00
parent 3788ba1268
commit ab65caec45
3 changed files with 1384 additions and 49 deletions


@@ -86,13 +86,20 @@ Two context channels:
| Column | Type | Writer | Description |
|--------|------|--------|-------------|
| `id` | string | Planner | T1, T2, ... |
| `title` | string | Planner | Task title |
| `description` | string | Planner | Self-contained task description — what to implement |
| `test` | string | Planner | Test cases: what tests to write and how to verify (unit/integration/edge) |
| `acceptance_criteria` | string | Planner | Measurable conditions that define "done" |
| `scope` | string | Planner | Target file/directory glob — constrains agent write area, prevents cross-task file conflicts |
| `hints` | string | Planner | Implementation tips + reference files. Format: `tips text \|\| file1;file2`. Either part is optional |
| `execution_directives` | string | Planner | Execution constraints: commands to run for verification, tool restrictions |
| `deps` | string | Planner | Dependency task IDs: T1;T2 |
| `context_from` | string | Planner | Context source IDs: **E1;E2;T1** |
| `wave` | integer | Wave Engine | Wave number (computed from deps) |
| `status` | enum | Agent | pending / completed / failed / skipped |
| `findings` | string | Agent | Execution findings (max 500 chars) |
| `files_modified` | string | Agent | Files modified (semicolon-separated) |
| `tests_passed` | boolean | Agent | Whether all defined test cases passed (true/false) |
| `acceptance_met` | string | Agent | Summary of which acceptance criteria were met/unmet |
| `error` | string | Agent | Error if failed |
**context_from prefix convention**: `E*` → explore.csv lookup, `T*` → tasks.csv lookup.
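For illustration, a minimal sketch of how a prompt builder could resolve `context_from` IDs against the two CSVs (hypothetical helper, not part of the skill; assumes both files are already parsed into row objects exposing `id` and `findings`):

```javascript
// Hypothetical helper — demonstrates the E*/T* prefix rule only.
function resolveContext(contextFrom, exploreRows, taskRows) {
  if (!contextFrom) return ''
  return contextFrom.split(';').filter(Boolean).map(id => {
    // E* → explore.csv lookup, T* → tasks.csv lookup
    const source = id.startsWith('E') ? exploreRows : taskRows
    const row = source.find(r => r.id === id)
    return row ? `[${id}] ${row.findings}` : `[${id}] (no findings)`
  }).join('\n')
}
```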
@@ -261,12 +268,19 @@ function buildExplorePrompt(row, requirement, sessionFolder) {
**Requirement**: ${requirement}
**Focus**: ${row.focus}
### MANDATORY FIRST STEPS
1. Read shared discoveries: ${sessionFolder}/discoveries.ndjson (if exists, skip if not)
2. Read project context: .workflow/project-tech.json (if exists)
---
## Instructions
Explore the codebase from the **${row.angle}** perspective:
1. Discover relevant files, modules, and patterns
2. Identify integration points and dependencies
3. Note constraints, risks, and conventions
4. Find existing patterns to follow
5. Share discoveries: append findings to ${sessionFolder}/discoveries.ndjson
## Output
Write findings to: ${sessionFolder}/explore-results/${row.id}.json
@@ -354,22 +368,27 @@ Decompose into execution tasks based on synthesized exploration:
// 5. Prefer parallel (minimize deps)
// 6. Use exploration findings: key_files → target files, patterns → references,
//    integration_points → dependency relationships, constraints → included in description
// 7. Each task MUST include: test (how to verify), acceptance_criteria (what defines done)
// 8. scope must not overlap between tasks in the same wave
// 9. hints = implementation tips + reference files (format: tips || file1;file2)
// 10. execution_directives = commands to run for verification, tool restrictions

const tasks = []
// Claude decomposes requirement using exploration synthesis
// Example:
// tasks.push({ id: 'T1', title: 'Setup types', description: '...', test: 'Verify types compile', acceptance_criteria: 'All interfaces exported', scope: 'src/types/**', hints: 'Follow existing type patterns || src/types/index.ts', execution_directives: 'tsc --noEmit', deps: '', context_from: 'E1;E2' })
// tasks.push({ id: 'T2', title: 'Implement core', description: '...', test: 'Unit test: core logic', acceptance_criteria: 'All functions pass tests', scope: 'src/core/**', hints: 'Reuse BaseService || src/services/Base.ts', execution_directives: 'npm test -- --grep core', deps: 'T1', context_from: 'E1;E2;T1' })
// tasks.push({ id: 'T3', title: 'Add tests', description: '...', test: 'Integration test suite', acceptance_criteria: '>80% coverage', scope: 'tests/**', hints: 'Follow existing test patterns || tests/auth.test.ts', execution_directives: 'npm test', deps: 'T2', context_from: 'E3;T2' })

// Compute waves
const waves = computeWaves(tasks)
tasks.forEach(t => { t.wave = waves[t.id] })

// Write tasks.csv
const header = 'id,title,description,test,acceptance_criteria,scope,hints,execution_directives,deps,context_from,wave,status,findings,files_modified,tests_passed,acceptance_met,error'
const rows = tasks.map(t =>
  [t.id, escCSV(t.title), escCSV(t.description), escCSV(t.test), escCSV(t.acceptance_criteria), escCSV(t.scope), escCSV(t.hints), escCSV(t.execution_directives), t.deps, t.context_from, t.wave, 'pending', '', '', '', '', '']
    .map(v => `"${v}"`).join(',')
)
Write(`${sessionFolder}/tasks.csv`, [header, ...rows].join('\n'))
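// Note: escCSV and computeWaves are helpers assumed above but not shown in
// this excerpt. Minimal sketches (actual implementations may differ):
function escCSV(s) { return String(s ?? '').replace(/"/g, '""') } // CSV quote-doubling

// Wave = 1 + max(wave of deps); iterate until stable. If values never
// stabilize, deps are cyclic → abort (see Error Handling).
function computeWaves(tasks) {
  const waves = {}
  for (let pass = 0; pass <= tasks.length; pass++) {
    let changed = false
    for (const t of tasks) {
      const deps = (t.deps || '').split(';').filter(Boolean)
      const wave = deps.length ? Math.max(...deps.map(d => waves[d] ?? 0)) + 1 : 1
      if (waves[t.id] !== wave) { waves[t.id] = wave; changed = true }
    }
    if (!changed) return waves
  }
  throw new Error('Circular dependency in deps — abort wave computation')
}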
@@ -478,6 +497,8 @@ for (let wave = 1; wave <= maxWave; wave++) {
row.files_modified = Array.isArray(result.files_modified)
  ? result.files_modified.join(';')
  : (result.files_modified || '')
row.tests_passed = String(result.tests_passed ?? '')
row.acceptance_met = result.acceptance_met || ''
row.error = result.error || ''
} else {
  row.status = 'completed'
@@ -533,33 +554,69 @@ function buildExecutePrompt(row, requirement, sessionFolder) {
**ID**: ${row.id}
**Goal**: ${requirement}
**Scope**: ${row.scope || 'Not specified'}

## Description
${row.description}

### Implementation Hints & Reference Files
${row.hints || 'None'}
> Format: \`tips text || file1;file2\`. Read ALL reference files (after ||) before starting. Apply tips (before ||) as guidance.

### Execution Directives
${row.execution_directives || 'None'}
> Commands to run for verification, tool restrictions, or environment requirements.

### Test Cases
${row.test || 'None specified'}

### Acceptance Criteria
${row.acceptance_criteria || 'None specified'}

## Previous Context (from exploration and predecessor tasks)
${row._prev_context}

### MANDATORY FIRST STEPS
1. Read shared discoveries: ${sessionFolder}/discoveries.ndjson (if exists, skip if not)
2. Read project context: .workflow/project-tech.json (if exists)

---

## Execution Protocol

1. **Read references**: Parse hints — read all files listed after \`||\` to understand existing patterns
2. **Read discoveries**: Load ${sessionFolder}/discoveries.ndjson for shared exploration findings
3. **Use context**: Apply previous tasks' findings from prev_context above
4. **Stay in scope**: ONLY create/modify files within ${row.scope || 'project'} — do NOT touch files outside this boundary
5. **Apply hints**: Follow implementation tips from hints (before \`||\`)
6. **Implement**: Execute changes described in the task description
7. **Write tests**: Implement the test cases defined above
8. **Run directives**: Execute commands from execution_directives to verify your work
9. **Verify acceptance**: Ensure all acceptance criteria are met before reporting completion
10. **Share discoveries**: Append exploration findings to shared board:
\`\`\`bash
echo '{"ts":"<ISO>","worker":"${row.id}","type":"<type>","data":{...}}' >> ${sessionFolder}/discoveries.ndjson
\`\`\`
11. **Report result**: Write JSON to output file

## Output
Write results to: ${sessionFolder}/task-results/${row.id}.json
{
  "status": "completed" | "failed",
  "findings": "What was done (max 500 chars)",
  "files_modified": ["file1.ts", "file2.ts"],
  "tests_passed": true | false,
  "acceptance_met": "Summary of which acceptance criteria were met/unmet",
  "error": ""
}

**IMPORTANT**: Set status to "completed" ONLY if:
- All test cases pass
- All acceptance criteria are met

Otherwise set status to "failed" with details in error field.`
}
```
@@ -610,11 +667,30 @@ Key files: ${e.key_files || 'none'}`).join('\n\n')}
## Task Results

${finalTasks.map(t => `### ${t.id}: ${t.title} (${t.status})

| Field | Value |
|-------|-------|
| Wave | ${t.wave} |
| Scope | ${t.scope || 'none'} |
| Dependencies | ${t.deps || 'none'} |
| Context From | ${t.context_from || 'none'} |
| Tests Passed | ${t.tests_passed || 'N/A'} |
| Acceptance Met | ${t.acceptance_met || 'N/A'} |
| Error | ${t.error || 'none'} |

**Description**: ${t.description}
**Test Cases**: ${t.test || 'N/A'}
**Acceptance Criteria**: ${t.acceptance_criteria || 'N/A'}
**Hints**: ${t.hints || 'N/A'}
**Execution Directives**: ${t.execution_directives || 'N/A'}
**Findings**: ${t.findings || 'N/A'}
**Files Modified**: ${t.files_modified || 'none'}`).join('\n\n---\n\n')}

## All Modified Files
@@ -739,13 +815,34 @@ function truncate(s, max) {
Shared `discoveries.ndjson` — append-only NDJSON accessible to all agents across all phases.

**Lifecycle**:
- Created by the first agent to write a discovery
- Carries over across all phases and waves — never cleared
- Agents append via `echo '...' >> discoveries.ndjson`

**Format**: NDJSON, each line is a self-contained JSON object:

```jsonl
{"ts":"...","worker":"E1","type":"code_pattern","data":{"name":"repo-pattern","file":"src/repos/Base.ts"}}
{"ts":"...","worker":"T2","type":"integration_point","data":{"file":"src/auth/index.ts","exports":["auth"]}}
```

**Discovery Types**:

| type | Dedup Key | Description |
|------|-----------|-------------|
| `code_pattern` | `data.name` | Reusable code pattern found |
| `integration_point` | `data.file` | Module connection point |
| `convention` | singleton | Code style conventions |
| `blocker` | `data.issue` | Blocking issue encountered |
| `tech_stack` | singleton | Project technology stack |
| `test_command` | singleton | Test commands discovered |

**Protocol Rules**:
1. Read board before own exploration → skip covered areas
2. Write discoveries immediately via `echo >>` → don't batch
3. Deduplicate — check existing entries; skip if same type + dedup key exists
4. Append-only — never modify or delete existing lines
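A minimal sketch of a dedup-aware append following these rules (hypothetical helper, assuming Node's `fs`; the agents themselves use plain `echo >>`):

```javascript
const fs = require('fs')

// Check the board for an entry with the same type + dedup key before appending.
function appendDiscovery(boardPath, entry, dedupKey) {
  const lines = fs.existsSync(boardPath)
    ? fs.readFileSync(boardPath, 'utf8').split('\n').filter(Boolean)
    : []
  const duplicate = lines.some(line => {
    try {
      const d = JSON.parse(line)
      // Singleton types (convention, tech_stack, test_command) match on type alone
      return d.type === entry.type && (!dedupKey || d.data?.[dedupKey] === entry.data?.[dedupKey])
    } catch { return false } // malformed lines are ignored, per Error Handling
  })
  if (!duplicate) fs.appendFileSync(boardPath, JSON.stringify(entry) + '\n')
}
```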
---
@@ -754,11 +851,12 @@ Shared `discoveries.ndjson` — append-only NDJSON accessible to all agents acro
| Error | Resolution |
|-------|------------|
| Explore agent failure | Mark as failed in explore.csv, exclude from planning |
| All explores failed | Fallback: plan directly from requirement without exploration |
| Execute agent failure | Mark as failed, skip dependents (cascade) |
| Agent timeout | Mark as failed in results, continue with wave |
| Circular dependency | Abort wave computation, report cycle |
| CSV parse error | Validate CSV format before execution, show line number |
| discoveries.ndjson corrupt | Ignore malformed lines, continue with valid entries |
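The cascade rule for execute-agent failures could look like this (illustrative sketch; the helper name and shape are assumptions, not the skill's actual code):

```javascript
// Mark every pending task that transitively depends on a failed task as skipped.
function cascadeSkip(tasks, failedId) {
  const blocked = new Set([failedId])
  let grew = true
  while (grew) {
    grew = false
    for (const t of tasks) {
      const deps = (t.deps || '').split(';').filter(Boolean)
      if (!blocked.has(t.id) && deps.some(d => blocked.has(d))) {
        blocked.add(t.id)
        if (t.status === 'pending') t.status = 'skipped'
        grew = true
      }
    }
  }
  return tasks
}
```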
---
@@ -772,3 +870,27 @@ Shared `discoveries.ndjson` — append-only NDJSON accessible to all agents acro
6. **Discovery Board Append-Only**: Never clear or modify discoveries.ndjson
7. **Explore Before Execute**: Phase 2 completes before Phase 4 starts
8. **DO NOT STOP**: Continuous execution until all waves complete or remaining skipped
---
## Best Practices
1. **Exploration Angles**: 1 for simple, 3-4 for complex; avoid redundant angles
2. **Context Linking**: Link every task to at least one explore row (E*) — exploration was done for a reason
3. **Task Granularity**: 3-10 tasks optimal; too many = overhead, too few = no parallelism
4. **Minimize Cross-Wave Deps**: More tasks in wave 1 = more parallelism
5. **Specific Descriptions**: Agent sees only its CSV row + prev_context — make description self-contained
6. **Non-Overlapping Scopes**: Same-wave tasks must not write to the same files
7. **Context From ≠ Deps**: `deps` = execution order constraint; `context_from` = information flow
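To make practice 7 concrete, a small hypothetical excerpt (output columns omitted): T2 waits for T1 and reads its findings; T3 also waits for T1 (ordering only, e.g. to avoid file conflicts) but needs none of its context.

```csv
id,title,deps,context_from,wave
T1,Define schema,,E1,1
T2,Implement API,T1,E1;T1,2
T3,Migrate old callers,T1,,2
```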
---
## Usage Recommendations
| Scenario | Recommended Approach |
|----------|---------------------|
| Complex feature (unclear architecture) | `workflow:wave-plan` — explore first, then plan |
| Simple known-pattern task | `$csv-wave-pipeline` — skip exploration, direct execution |
| Independent parallel tasks | `$csv-wave-pipeline -c 8` — single wave, max parallelism |
| Diamond dependency (A→B,C→D) | `workflow:wave-plan` — 3 waves with context propagation |
| Unknown codebase | `workflow:wave-plan` — exploration phase is essential |


@@ -73,11 +73,11 @@ Wave-based batch execution using `spawn_agents_on_csv` with **cross-wave context
### tasks.csv (Master State)

```csv
id,title,description,test,acceptance_criteria,scope,hints,execution_directives,deps,context_from,wave,status,findings,files_modified,tests_passed,acceptance_met,error
"1","Setup auth module","Create auth directory structure and base files","Verify directory exists and base files export expected interfaces","auth/ dir created; index.ts and types.ts export AuthProvider interface","src/auth/**","Follow monorepo module pattern || package.json;src/shared/types.ts","","","","1","","","","","",""
"2","Implement OAuth","Add OAuth provider integration with Google and GitHub","Unit test: mock OAuth callback returns valid token; Integration test: verify redirect URL generation","OAuth login redirects to provider; callback returns JWT; supports Google and GitHub","src/auth/oauth/**","Use passport.js strategy pattern || src/auth/index.ts;docs/oauth-flow.md","Run npm test -- --grep oauth before completion","1","1","2","","","","","",""
"3","Add JWT tokens","Implement JWT generation and validation","Unit test: sign/verify round-trip; Edge test: expired token returns 401","generateToken() returns valid JWT; verifyToken() rejects expired/tampered tokens","src/auth/jwt/**","Use jsonwebtoken library; Set default expiry 1h || src/config/auth.ts","Ensure tsc --noEmit passes","1","1","2","","","","","",""
"4","Setup 2FA","Add TOTP-based 2FA with QR code generation","Unit test: TOTP verify with correct code; Test: QR data URL is valid","QR code generates scannable image; TOTP verification succeeds within time window","src/auth/2fa/**","Use speakeasy + qrcode libraries || src/auth/oauth/strategy.ts;src/auth/jwt/token.ts","Run full test suite: npm test","2;3","1;2;3","3","","","","","",""
```

**Columns**:
@@ -86,13 +86,20 @@ id,title,description,deps,context_from,wave,status,findings,files_modified,error
| Column | Role | Description |
|--------|-------|-------------|
| `id` | Input | Unique task identifier (string) |
| `title` | Input | Short task title |
| `description` | Input | Detailed task description — what to implement |
| `test` | Input | Test cases: what tests to write and how to verify (unit/integration/edge) |
| `acceptance_criteria` | Input | Acceptance criteria: measurable conditions that define "done" |
| `scope` | Input | Target file/directory glob — constrains agent work area, prevents cross-task file conflicts |
| `hints` | Input | Implementation tips + reference files. Format: `tips text \|\| file1;file2`. Before `\|\|` = how to implement; after `\|\|` = existing files to read before starting. Either part is optional |
| `execution_directives` | Input | Execution constraints: commands to run for verification, tool restrictions, environment requirements |
| `deps` | Input | Semicolon-separated dependency task IDs (empty = no deps) |
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
| `wave` | Computed | Wave number (computed by topological sort, 1-based) |
| `status` | Output | `pending` / `completed` / `failed` / `skipped` |
| `findings` | Output | Key discoveries or implementation notes (max 500 chars) |
| `files_modified` | Output | Semicolon-separated file paths |
| `tests_passed` | Output | Whether all defined test cases passed (true/false) |
| `acceptance_met` | Output | Summary of which acceptance criteria were met/unmet |
| `error` | Output | Error message if failed (empty if success) |
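A minimal sketch of splitting the `hints` field into its two optional parts (hypothetical helper; the agent prompt simply embeds the raw string):

```javascript
// '<tips> || <file1>;<file2>' → { tips, referenceFiles }
function parseHints(hints) {
  const [tips = '', refs = ''] = String(hints || '').split('||').map(s => s.trim())
  return { tips, referenceFiles: refs ? refs.split(';').map(f => f.trim()).filter(Boolean) : [] }
}
```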
### Per-Wave CSV (Temporary)
@@ -100,9 +107,9 @@ id,title,description,deps,context_from,wave,status,findings,files_modified,error
Each wave generates a temporary `wave-{N}.csv` with an extra `prev_context` column:

```csv
id,title,description,test,acceptance_criteria,scope,hints,execution_directives,deps,context_from,wave,prev_context
"2","Implement OAuth","Add OAuth integration","Unit test: mock OAuth callback returns valid token","OAuth login redirects to provider; callback returns JWT","src/auth/oauth/**","Use passport.js strategy pattern || src/auth/index.ts;docs/oauth-flow.md","Run npm test -- --grep oauth","1","1","2","[Task 1] Created auth/ with index.ts and types.ts"
"3","Add JWT tokens","Implement JWT","Unit test: sign/verify round-trip; Edge test: expired token returns 401","generateToken() returns valid JWT; verifyToken() rejects expired/tampered tokens","src/auth/jwt/**","Use jsonwebtoken library; Set default expiry 1h || src/config/auth.ts","Ensure tsc --noEmit passes","1","1","2","[Task 1] Created auth/ with index.ts and types.ts"
```

The `prev_context` column is built from `context_from` by looking up completed tasks' `findings` in the master CSV.
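A sketch of that lookup (assumed shape; the row names and join separator are illustrative):

```javascript
// Build prev_context for one task from completed predecessors in the master CSV.
function buildPrevContext(task, masterRows) {
  return (task.context_from || '').split(';').filter(Boolean)
    .map(id => masterRows.find(r => r.id === id))
    .filter(r => r && r.status === 'completed' && r.findings)
    .map(r => `[Task ${r.id}] ${r.findings}`)
    .join(' | ')
}
```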
@@ -187,16 +194,26 @@ Bash(`mkdir -p ${sessionFolder}`)
```javascript
// Use ccw cli to decompose requirement into subtasks
Bash({
  command: `ccw cli -p "PURPOSE: Decompose requirement into 3-10 atomic tasks for batch agent execution. Each task must include implementation description, test cases, and acceptance criteria.
TASK:
• Parse requirement into independent subtasks
• Identify dependencies between tasks (which must complete before others)
• Identify context flow (which tasks need previous tasks' findings)
• For each task, define concrete test cases (unit/integration/edge)
• For each task, define measurable acceptance criteria (what defines 'done')
• Each task must be executable by a single agent with file read/write access
MODE: analysis
CONTEXT: @**/*
EXPECTED: JSON object with tasks array. Each task: {id: string, title: string, description: string, test: string, acceptance_criteria: string, scope: string, hints: string, execution_directives: string, deps: string[], context_from: string[]}.
- description: what to implement (specific enough for an agent to execute independently)
- test: what tests to write and how to verify (e.g. 'Unit test: X returns Y; Edge test: handles Z')
- acceptance_criteria: measurable conditions that define done (e.g. 'API returns 200; token expires after 1h')
- scope: target file/directory glob (e.g. 'src/auth/**') — tasks in same wave MUST have non-overlapping scopes
- hints: implementation tips + reference files, format '<tips> || <ref_file1>;<ref_file2>' (e.g. 'Use strategy pattern || src/base/Strategy.ts;docs/design.md')
- execution_directives: commands to run for verification or tool constraints (e.g. 'Run npm test --bail; Ensure tsc passes')
- deps: task IDs that must complete first
- context_from: task IDs whose findings are needed
CONSTRAINTS: 3-10 tasks | Each task is atomic | No circular deps | test and acceptance_criteria must be concrete and verifiable | Same-wave tasks must have non-overlapping scopes
REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-breakdown-task-steps`,
  run_in_background: true
@@ -266,19 +283,26 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
3. **Generate tasks.csv**

```javascript
const header = 'id,title,description,test,acceptance_criteria,scope,hints,execution_directives,deps,context_from,wave,status,findings,files_modified,tests_passed,acceptance_met,error'
const rows = decomposedTasks.map(task => {
  const wave = waveAssignment.get(task.id)
  return [
    task.id,
    csvEscape(task.title),
    csvEscape(task.description),
    csvEscape(task.test),
    csvEscape(task.acceptance_criteria),
    csvEscape(task.scope),
    csvEscape(task.hints),
    csvEscape(task.execution_directives),
    task.deps.join(';'),
    task.context_from.join(';'),
    wave,
    'pending', // status
    '', // findings
    '', // files_modified
    '', // tests_passed
    '', // acceptance_met
    '' // error
  ].map(cell => `"${String(cell).replace(/"/g, '""')}"`).join(',')
})
@@ -387,9 +411,9 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
}

// 5. Write wave CSV
const waveHeader = 'id,title,description,test,acceptance_criteria,scope,hints,execution_directives,deps,context_from,wave,prev_context'
const waveRows = executableTasks.map(t =>
  [t.id, t.title, t.description, t.test, t.acceptance_criteria, t.scope, t.hints, t.execution_directives, t.deps, t.context_from, t.wave, t.prev_context]
    .map(cell => `"${String(cell).replace(/"/g, '""')}"`)
    .join(',')
)
@@ -412,9 +436,11 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
status: { type: "string", enum: ["completed", "failed"] },
findings: { type: "string" },
files_modified: { type: "array", items: { type: "string" } },
tests_passed: { type: "boolean" },
acceptance_met: { type: "string" },
error: { type: "string" }
},
required: ["id", "status", "findings", "tests_passed"]
}
})
// ↑ Blocks until all agents in this wave complete
@@ -426,6 +452,8 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
status: result.status,
findings: result.findings || '',
files_modified: (result.files_modified || []).join(';'),
tests_passed: String(result.tests_passed ?? ''),
acceptance_met: result.acceptance_met || '',
error: result.error || ''
})
@@ -462,6 +490,23 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
**Task ID**: {id}
**Title**: {title}
**Description**: {description}
**Scope**: {scope}
### Implementation Hints & Reference Files
{hints}
> Format: \`<tips> || <ref_file1>;<ref_file2>\`. Read ALL reference files (after ||) before starting implementation. Apply tips (before ||) as implementation guidance.
### Execution Directives
{execution_directives}
> Commands to run for verification, tool restrictions, or environment requirements. Follow these constraints during and after implementation.
### Test Cases
{test}
### Acceptance Criteria
{acceptance_criteria}
### Previous Tasks' Findings (Context)
{prev_context}
@@ -470,14 +515,20 @@ REQUIREMENT: ${requirement}" --tool gemini --mode analysis --rule planning-break
## Execution Protocol

1. **Read references**: Parse {hints} — read all files listed after \`||\` to understand existing patterns
2. **Read discoveries**: Load ${sessionFolder}/discoveries.ndjson for shared exploration findings
3. **Use context**: Apply previous tasks' findings from prev_context above
4. **Stay in scope**: ONLY create/modify files within {scope} — do NOT touch files outside this boundary
5. **Apply hints**: Follow implementation tips from {hints} (before \`||\`)
6. **Execute**: Implement the task as described
7. **Write tests**: Implement the test cases defined above
8. **Run directives**: Execute commands from {execution_directives} to verify your work
9. **Verify acceptance**: Ensure all acceptance criteria are met before reporting completion
10. **Share discoveries**: Append exploration findings to shared board:
\`\`\`bash
echo '{"ts":"<ISO8601>","worker":"{id}","type":"<type>","data":{...}}' >> ${sessionFolder}/discoveries.ndjson
\`\`\`
11. **Report result**: Return JSON via report_agent_job_result
### Discovery Types to Share
- \`code_pattern\`: {name, file, description} — reusable patterns found
@@ -495,8 +546,15 @@ Return JSON:
"status": "completed" | "failed", "status": "completed" | "failed",
"findings": "Key discoveries and implementation notes (max 500 chars)", "findings": "Key discoveries and implementation notes (max 500 chars)",
"files_modified": ["path1", "path2"], "files_modified": ["path1", "path2"],
"tests_passed": true | false,
"acceptance_met": "Summary of which acceptance criteria were met/unmet",
"error": "" "error": ""
} }
**IMPORTANT**: Set status to "completed" ONLY if:
- All test cases pass
- All acceptance criteria are met
Otherwise set status to "failed" with details in error field.
`
}
```
@@ -576,6 +634,7 @@ Return JSON:
| Completed | ${completed.length} |
| Failed | ${failed.length} |
| Skipped | ${skipped.length} |
| Waves | ${maxWave} |
---
@@ -584,7 +643,7 @@ Return JSON:
${Array.from({ length: maxWave }, (_, i) => i + 1).map(w => {
const waveTasks = tasks.filter(t => parseInt(t.wave) === w)
return `### Wave ${w}
${waveTasks.map(t => `- **[${t.id}] ${t.title}**: ${t.status}${t.tests_passed ? ' ✓tests' : ''}${t.error ? ' — ' + t.error : ''}
${t.findings ? 'Findings: ' + t.findings : ''}`).join('\n')}`
}).join('\n\n')}
@@ -598,10 +657,23 @@ ${tasks.map(t => `### ${t.id}: ${t.title}
| Field | Value |
|-------|-------|
| Status | ${t.status} |
| Wave | ${t.wave} |
| Scope | ${t.scope || 'none'} |
| Dependencies | ${t.deps || 'none'} |
| Context From | ${t.context_from || 'none'} |
| Tests Passed | ${t.tests_passed || 'N/A'} |
| Acceptance Met | ${t.acceptance_met || 'N/A'} |
| Error | ${t.error || 'none'} |
**Description**: ${t.description}
**Test Cases**: ${t.test || 'N/A'}
**Acceptance Criteria**: ${t.acceptance_criteria || 'N/A'}
**Hints**: ${t.hints || 'N/A'}
**Execution Directives**: ${t.execution_directives || 'N/A'}
**Findings**: ${t.findings || 'N/A'}
**Files Modified**: ${t.files_modified || 'none'}

File diff suppressed because it is too large (third changed file not shown).