Add roles for fixer, reproducer, tester, verifier, and supervisor with detailed workflows

- Introduced `fixer` role for implementing code fixes based on RCA reports, including phases for parsing RCA, planning fixes, implementing changes, and documenting results.
- Added `reproducer` role for bug reproduction and evidence collection using Chrome DevTools, detailing steps for navigating to target URLs, executing reproduction steps, and capturing evidence.
- Created `tester` role for feature-driven testing, outlining processes for parsing feature lists, executing test scenarios, and reporting discovered issues.
- Established `verifier` role for fix verification, focusing on re-executing reproduction steps and comparing evidence before and after fixes.
- Implemented `supervisor` role for overseeing pipeline phase transitions, ensuring consistency across artifacts and compliance with processes.
- Added specifications for debug tools and pipeline definitions to standardize usage patterns and task management across roles.
2026-03-10 17:11:04 +08:00 · 2026-03-07 22:52:40 +08:00
parent 0d01e7bc50
commit 80d8954b7a
27 changed files with 3274 additions and 443 deletions
--- a/.claude/skills/team-lifecycle-v4/roles/coordinator/commands/dispatch.md
+++ b/.claude/skills/team-lifecycle-v4/roles/coordinator/commands/dispatch.md
@@ -37,6 +37,16 @@ RoleSpec: .claude/skills/team-lifecycle-v4/roles/<role>/role.md
 - true: Role has 2+ serial same-prefix tasks (writer: DRAFT-001->004)
 - false: Role has 1 task, or tasks are parallel

+## CHECKPOINT Task Rules
+
+CHECKPOINT tasks are dispatched like regular tasks but handled differently at spawn time:
+
+- Created via TaskCreate with proper blockedBy (upstream tasks that must complete first)
+- Owner: supervisor
+- **NOT spawned as team-worker** — coordinator wakes the resident supervisor via SendMessage
+- If `supervision: false` in team-session.json, skip creating CHECKPOINT tasks entirely
+- RoleSpec in description: `.claude/skills/team-lifecycle-v4/roles/supervisor/role.md`
+
 ## Dependency Validation

 - No orphan tasks (all tasks have valid owner)
--- a/.claude/skills/team-lifecycle-v4/roles/coordinator/commands/monitor.md
+++ b/.claude/skills/team-lifecycle-v4/roles/coordinator/commands/monitor.md
@@ -8,6 +8,7 @@ Event-driven pipeline coordination. Beat model: coordinator wake -> process -> s
 - ONE_STEP_PER_INVOCATION: true
 - FAST_ADVANCE_AWARE: true
 - WORKER_AGENT: team-worker
+- SUPERVISOR_AGENT: team-supervisor (resident, woken via SendMessage)

 ## Handler Router

@@ -27,8 +28,13 @@ Worker completed. Process and advance.
 1. Find matching worker by role in message
 2. Check if progress update (inner loop) or final completion
 3. Progress -> update session state, STOP
-4. Completion -> mark task done, remove from active_workers
+4. Completion -> mark task done
+   - Resident agent (supervisor) -> keep in active_workers (stays alive for future checkpoints)
+   - Standard worker -> remove from active_workers
 5. Check for checkpoints:
+   - CHECKPOINT-* with verdict "block" -> AskUserQuestion: Override / Revise upstream / Abort
+   - CHECKPOINT-* with verdict "warn" -> log risks to wisdom, proceed normally
+   - CHECKPOINT-* with verdict "pass" -> proceed normally
   - QUALITY-001 -> display quality gate, pause for user commands
   - PLAN-001 -> read plan.json complexity, create dynamic IMPL tasks per specs/pipelines.md routing
 6. -> handleSpawnNext
@@ -52,6 +58,8 @@ Output:
 2. Has active -> check each status
   - completed -> mark done
   - in_progress -> still running
+   - Resident agent (supervisor) with `resident: true` + no CHECKPOINT in_progress + pending CHECKPOINT exists
+     -> supervisor may have crashed. Respawn with `recovery: true`
 3. Some completed -> handleSpawnNext
 4. All running -> report status, STOP

@@ -64,18 +72,43 @@ Find ready tasks, spawn workers, STOP.
 3. No ready + nothing in progress -> handleComplete
 4. Has ready -> for each:
   a. Check if inner loop role with active worker -> skip (worker picks up)
-   b. TaskUpdate -> in_progress
-   c. team_msg log -> task_unblocked
-   d. Spawn team-worker (see SKILL.md Spawn Template)
-   e. Add to active_workers
+   b. **CHECKPOINT-* task** -> wake resident supervisor (see below)
+   c. Other tasks -> standard spawn:
+      - TaskUpdate -> in_progress
+      - team_msg log -> task_unblocked
+      - Spawn team-worker (see SKILL.md Worker Spawn Template)
+      - Add to active_workers
 5. Update session, output summary, STOP

+### Wake Supervisor for CHECKPOINT
+
+When a ready task has prefix `CHECKPOINT-*`:
+
+1. Verify supervisor is in active_workers with `resident: true`
+   - Not found -> spawn supervisor using SKILL.md Supervisor Spawn Template, wait for ready callback, then wake
+2. Determine scope: list task IDs that this checkpoint depends on (its blockedBy tasks)
+3. SendMessage to supervisor (see SKILL.md Supervisor Wake Template):
+   ```
+   SendMessage({
+     type: "message",
+     recipient: "supervisor",
+     content: "## Checkpoint Request\ntask_id: <CHECKPOINT-NNN>\nscope: [<upstream-task-ids>]\npipeline_progress: <done>/<total> tasks completed",
+     summary: "Checkpoint request: <CHECKPOINT-NNN>"
+   })
+   ```
+4. Do NOT TaskUpdate in_progress — supervisor claims the task itself
+5. Do NOT add duplicate entry to active_workers (supervisor already tracked)
+
 ## handleComplete

 Pipeline done. Generate report and completion action.

-1. Generate summary (deliverables, stats, discussions)
-2. Read session.completion_action:
+1. Shut down resident supervisor (if active):
+   ```
+   SendMessage({ type: "shutdown_request", recipient: "supervisor", content: "Pipeline complete" })
+   ```
+2. Generate summary (deliverables, stats, discussions)
+3. Read session.completion_action:
   - interactive -> AskUserQuestion (Archive/Keep/Export)
   - auto_archive -> Archive & Clean (status=completed, TeamDelete)
   - auto_keep -> Keep Active (status=paused)
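
Review note: the CHECKPOINT verdict handling added to handleCallback step 5 in monitor.md can be read as a small routing table. The sketch below is only an illustration of that table, not code from this commit; the function name and the returned action labels (`ask_user`, `log_and_proceed`, `proceed`) are hypothetical.

```python
def route_checkpoint_verdict(task_id: str, verdict: str) -> str:
    """Map a completed CHECKPOINT task's verdict to the coordinator's next action.

    The action labels returned here are hypothetical names for illustration.
    """
    if not task_id.startswith("CHECKPOINT-"):
        raise ValueError(f"not a checkpoint task: {task_id}")
    if verdict == "block":
        # monitor.md: AskUserQuestion with Override / Revise upstream / Abort
        return "ask_user"
    if verdict == "warn":
        # monitor.md: log risks to wisdom, then proceed normally
        return "log_and_proceed"
    if verdict == "pass":
        # monitor.md: proceed normally
        return "proceed"
    raise ValueError(f"unknown verdict: {verdict}")
```

Raising on an unknown verdict is a design choice of this sketch; the spec text does not say what the coordinator should do with a verdict outside pass/warn/block.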
--- a/.claude/skills/team-lifecycle-v4/roles/coordinator/role.md
+++ b/.claude/skills/team-lifecycle-v4/roles/coordinator/role.md
@@ -49,7 +49,13 @@ For callback/check/resume/adapt/complete: load commands/monitor.md, execute hand

 1. Scan .workflow/.team/TLV4-*/team-session.json for active/paused sessions
 2. No sessions -> Phase 1
-3. Single session -> reconcile (audit TaskList, reset in_progress->pending, rebuild team, kick first ready task)
+3. Single session -> reconcile:
+   a. Audit TaskList, reset in_progress->pending
+   b. Rebuild team workers
+   c. If pipeline has CHECKPOINT tasks AND `supervision !== false`:
+      - Respawn supervisor with `recovery: true` (see SKILL.md Supervisor Spawn Template)
+      - Supervisor auto-rebuilds context from existing CHECKPOINT-*-report.md files
+   d. Kick first ready task
 4. Multiple -> AskUserQuestion for selection

 ## Phase 1: Requirement Clarification
@@ -79,6 +85,10 @@ TEXT-LEVEL ONLY. No source code reading.
   })
   ```
 8. Write team-session.json
+9. Spawn resident supervisor (if pipeline has CHECKPOINT tasks AND `supervision !== false`):
+   - Use SKILL.md Supervisor Spawn Template (subagent_type: "team-supervisor")
+   - Wait for "[supervisor] Ready" callback before proceeding to Phase 3
+   - Record supervisor in active_workers with `resident: true` flag

 ## Phase 3: Create Task Chain

@@ -112,5 +122,6 @@ Delegate to commands/monitor.md#handleSpawnNext:
 | Task too vague | AskUserQuestion for clarification |
 | Session corruption | Attempt recovery, fallback to manual |
 | Worker crash | Reset task to pending, respawn |
+| Supervisor crash | Respawn with `recovery: true` in prompt, supervisor rebuilds context from existing reports |
 | Dependency cycle | Detect in analysis, halt |
 | Role limit exceeded | Merge overlapping roles |
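
Review note: the "Supervisor crash" row relies on the detection rule added to handleCheck in monitor.md (resident supervisor tracked, no CHECKPOINT in_progress, at least one CHECKPOINT pending). A minimal sketch of that predicate follows. The shape of the `active_workers` entries (`role` and `resident` keys) and the `Task` type are assumptions made for illustration; the commit only states that the supervisor is recorded with a `resident: true` flag.

```python
from dataclasses import dataclass


@dataclass
class Task:
    id: str
    status: str  # "pending" | "in_progress" | "completed"


def supervisor_may_have_crashed(active_workers: list[dict], tasks: list[Task]) -> bool:
    """Apply the handleCheck rule literally: resident supervisor tracked,
    no CHECKPOINT in progress, and at least one CHECKPOINT still pending."""
    has_resident_supervisor = any(
        w.get("role") == "supervisor" and w.get("resident") is True
        for w in active_workers
    )
    checkpoints = [t for t in tasks if t.id.startswith("CHECKPOINT-")]
    any_in_progress = any(t.status == "in_progress" for t in checkpoints)
    any_pending = any(t.status == "pending" for t in checkpoints)
    return has_resident_supervisor and not any_in_progress and any_pending
```

Taken literally, this rule also fires while a pending checkpoint is still blocked by unfinished upstream tasks, so an actual implementation would probably restrict it to checkpoints for which a wake message has already been sent.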
--- a/.claude/skills/team-lifecycle-v4/roles/supervisor/role.md
+++ b/.claude/skills/team-lifecycle-v4/roles/supervisor/role.md
@@ -0,0 +1,192 @@
+---
+role: supervisor
+prefix: CHECKPOINT
+inner_loop: false
+discuss_rounds: []
+message_types:
+  success: supervision_report
+  alert: consistency_alert
+  warning: pattern_warning
+  error: error
+---
+
+# Supervisor
+
+Process and execution supervision at pipeline phase transition points.
+
+## Identity
+- Tag: [supervisor] | Prefix: CHECKPOINT-*
+- Responsibility: Verify cross-artifact consistency, process compliance, and execution health between pipeline phases
+
+## Boundaries
+
+### MUST
+- Read all upstream state_update messages from message bus
+- Read upstream artifacts referenced in state data
+- Check terminology consistency across produced documents
+- Verify process compliance (upstream consumed, artifacts exist, wisdom contributed)
+- Analyze error/retry patterns in message bus
+- Output supervision_report with clear verdict (pass/warn/block)
+- Write checkpoint report to `<session>/artifacts/CHECKPOINT-NNN-report.md`
+
+### MUST NOT
+- Perform deep quality scoring (reviewer's job — 4 dimensions × 25% weight)
+- Evaluate AC testability or ADR justification (reviewer's job)
+- Modify any artifacts (read-only observer)
+- Skip reading message bus history (essential for pattern detection)
+- Block pipeline without justification (every block needs specific evidence)
+- Run discussion rounds (no consensus needed for checkpoints)
+
+## Phase 2: Context Gathering
+
+Load ALL available context for comprehensive supervision:
+
+### Step 1: Message Bus Analysis
+```
+team_msg(operation="list", session_id=<session_id>)
+```
+- Collect all messages since session start
+- Group by: type, from, error count
+- Build timeline of task completions and their quality_self_scores
+
+### Step 2: Upstream State Loading
+```
+team_msg(operation="get_state")  // all roles
+```
+- Load state for every completed upstream role
+- Extract: key_findings, decisions, terminology_keys, open_questions
+- Note: upstream_refs_consumed for reference chain verification
+
+### Step 3: Artifact Reading
+- Read each artifact referenced in upstream states' `ref` paths
+- Extract document structure, key terms, design decisions
+- DO NOT deep-read entire documents — scan headings + key sections only
+
+### Step 4: Wisdom Loading
+- Read `<session>/wisdom/*.md` for accumulated team knowledge
+- Check for contradictions between wisdom entries and current artifacts
+
+## Phase 3: Supervision Checks
+
+Execute checks based on CHECKPOINT type. Each checkpoint has a predefined scope.
+
+### CHECKPOINT-001: Brief ↔ PRD Consistency (after DRAFT-002)
+
+| Check | Method | Pass Criteria |
+|-------|--------|---------------|
+| Vision→Requirements trace | Compare brief goals with PRD FR-NNN IDs | Every vision goal maps to ≥1 requirement |
+| Terminology alignment | Extract key terms from both docs | Same concept uses same term (no "user" vs "customer" drift) |
+| Scope consistency | Compare brief scope with PRD scope | No requirements outside brief scope |
+| Decision continuity | Compare decisions in analyst state vs writer state | No contradictions |
+| Artifact existence | Check file paths | product-brief.md and requirements/ exist |
+
+### CHECKPOINT-002: Full Spec Consistency (after DRAFT-004)
+
+| Check | Method | Pass Criteria |
+|-------|--------|---------------|
+| 4-doc term consistency | Extract terms from brief, PRD, arch, epics | Unified terminology across all 4 |
+| Decision chain | Trace decisions from RESEARCH → DRAFT-001 → ... → DRAFT-004 | No contradictions, decisions build progressively |
+| Architecture↔Epics alignment | Compare arch components with epic stories | Every component has implementation coverage |
+| Quality self-score trend | Compare quality_self_score across DRAFT-001..004 states | Not degrading (score[N] >= score[N-1] - 10) |
+| Open questions resolved | Check open_questions across all states | No critical open questions remaining |
+| Wisdom consistency | Cross-check wisdom entries against artifacts | No contradictory entries |
+
+### CHECKPOINT-003: Plan ↔ Input Alignment (after PLAN-001)
+
+| Check | Method | Pass Criteria |
+|-------|--------|---------------|
+| Plan covers requirements | Compare plan.json tasks with PRD/input requirements | All must-have requirements have implementation tasks |
+| Complexity assessment sanity | Read plan.json complexity vs actual scope | Low ≠ 5+ modules, High ≠ 1 module |
+| Dependency chain valid | Verify plan task dependencies | No cycles, no orphans |
+| Execution method appropriate | Check recommended_execution vs complexity | Agent mode for low, CLI for medium+ |
+| Upstream context consumed | Verify plan references spec artifacts | Plan explicitly references architecture decisions |
+
+### Execution Health Checks (all checkpoints)
+
+| Check | Method | Pass Criteria |
+|-------|--------|---------------|
+| Retry patterns | Count error-type messages per role | No role has ≥3 errors |
+| Message bus anomalies | Check for orphaned messages (from dead workers) | All in_progress tasks have recent activity |
+| Fast-advance conflicts | Check fast_advance messages | No duplicate spawns detected |
+
+## Phase 4: Verdict Generation
+
+### Scoring
+
+Each check produces: pass (1.0) | warn (0.5) | fail (0.0)
+
+```
+checkpoint_score = sum(check_scores) / num_checks
+```
+
+| Verdict | Score | Action |
+|---------|-------|--------|
+| `pass` | ≥ 0.8 | Auto-proceed, log report |
+| `warn` | ≥ 0.5 and < 0.8 | Proceed with recorded risks in wisdom |
+| `block` | < 0.5 | Halt pipeline, report to coordinator |
+
+### Report Generation
+
+Write to `<session>/artifacts/CHECKPOINT-NNN-report.md`:
+
+```markdown
+# Checkpoint Report: CHECKPOINT-NNN
+
+## Scope
+Tasks checked: [DRAFT-001, DRAFT-002]
+
+## Results
+
+### Consistency
+| Check | Result | Details |
+|-------|--------|---------|
+| Terminology | pass | Unified across 2 docs |
+| Decision chain | warn | Minor: "auth" term undefined in PRD |
+
+### Process Compliance
+| Check | Result | Details |
+|-------|--------|---------|
+| Upstream consumed | pass | All refs loaded |
+| Artifacts exist | pass | 2/2 files present |
+
+### Execution Health
+| Check | Result | Details |
+|-------|--------|---------|
+| Error patterns | pass | 0 errors |
+| Retries | pass | No retries |
+
+## Verdict: PASS (score: 0.90)
+
+## Recommendations
+- Define "auth" explicitly in PRD glossary section
+
+## Risks Logged
+- None
+```
+
+### State Update
+
+```json
+{
+  "status": "task_complete",
+  "task_id": "CHECKPOINT-001",
+  "ref": "<session>/artifacts/CHECKPOINT-001-report.md",
+  "key_findings": ["Terminology aligned", "Decision chain consistent"],
+  "decisions": ["Proceed to architecture phase"],
+  "verification": "self-validated",
+  "supervision_verdict": "pass",
+  "supervision_score": 0.90,
+  "risks_logged": 0,
+  "blocks_detected": 0
+}
+```
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Artifact file not found | Score as warn (not fail), log missing path |
+| Message bus empty | Score as warn, note "no messages to analyze" |
+| State missing for upstream role | Use artifact reading as fallback |
+| All checks pass trivially | Still generate report for audit trail |
+| Checkpoint blocked but user overrides | Log override in wisdom, proceed |
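
Review note: the Phase 4 scoring rule (pass = 1.0, warn = 0.5, fail = 0.0, averaged over all checks) and the CHECKPOINT-002 self-score trend criterion (`score[N] >= score[N-1] - 10`) can be sketched as follows. The function names are hypothetical, and the sketch treats the `warn` band as every score from 0.5 up to, but not including, 0.8, which is one reading of the "0.5-0.79" row.

```python
CHECK_SCORES = {"pass": 1.0, "warn": 0.5, "fail": 0.0}


def checkpoint_verdict(results: list[str]) -> tuple[float, str]:
    """Average per-check scores and map the result to pass / warn / block."""
    if not results:
        raise ValueError("at least one check result is required")
    score = sum(CHECK_SCORES[r] for r in results) / len(results)
    if score >= 0.8:
        verdict = "pass"
    elif score >= 0.5:
        verdict = "warn"
    else:
        verdict = "block"
    return score, verdict


def self_score_trend_ok(scores: list[float], tolerance: float = 10) -> bool:
    """True when no score drops more than `tolerance` below its predecessor."""
    return all(cur >= prev - tolerance for prev, cur in zip(scores, scores[1:]))
```

For example, five `pass` results and one `warn` average to 5.5 / 6, which is above 0.8 and therefore yields a `pass` verdict.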