Add roles for fixer, reproducer, tester, verifier, and supervisor with detailed workflows

- Introduced `fixer` role for implementing code fixes based on RCA reports, including phases for parsing RCA, planning fixes, implementing changes, and documenting results.
- Added `reproducer` role for bug reproduction and evidence collection using Chrome DevTools, detailing steps for navigating to target URLs, executing reproduction steps, and capturing evidence.
- Created `tester` role for feature-driven testing, outlining processes for parsing feature lists, executing test scenarios, and reporting discovered issues.
- Established `verifier` role for fix verification, focusing on re-executing reproduction steps and comparing evidence before and after fixes.
- Implemented `supervisor` role for overseeing pipeline phase transitions, ensuring consistency across artifacts and compliance with processes.
- Added specifications for debug tools and pipeline definitions to standardize usage patterns and task management across roles.
catlog22
2026-03-07 22:52:40 +08:00
parent 0d01e7bc50
commit 80d8954b7a
27 changed files with 3274 additions and 443 deletions

@@ -37,6 +37,16 @@ RoleSpec: .claude/skills/team-lifecycle-v4/roles/<role>/role.md
- true: Role has 2+ serial same-prefix tasks (writer: DRAFT-001->004)
- false: Role has 1 task, or tasks are parallel
## CHECKPOINT Task Rules
CHECKPOINT tasks are dispatched like regular tasks but handled differently at spawn time:
- Created via TaskCreate with proper blockedBy (upstream tasks that must complete first)
- Owner: supervisor
- **NOT spawned as team-worker** — coordinator wakes the resident supervisor via SendMessage
- If `supervision: false` in team-session.json, skip creating CHECKPOINT tasks entirely
- RoleSpec in description: `.claude/skills/team-lifecycle-v4/roles/supervisor/role.md`
## Dependency Validation
- No orphan tasks (all tasks have valid owner)

@@ -8,6 +8,7 @@ Event-driven pipeline coordination. Beat model: coordinator wake -> process -> s
- ONE_STEP_PER_INVOCATION: true
- FAST_ADVANCE_AWARE: true
- WORKER_AGENT: team-worker
- SUPERVISOR_AGENT: team-supervisor (resident, woken via SendMessage)
## Handler Router
@@ -27,8 +28,13 @@ Worker completed. Process and advance.
1. Find matching worker by role in message
2. Check if progress update (inner loop) or final completion
3. Progress -> update session state, STOP
4. Completion -> mark task done, remove from active_workers
4. Completion -> mark task done
- Resident agent (supervisor) -> keep in active_workers (stays alive for future checkpoints)
- Standard worker -> remove from active_workers
5. Check for checkpoints:
- CHECKPOINT-* with verdict "block" -> AskUserQuestion: Override / Revise upstream / Abort
- CHECKPOINT-* with verdict "warn" -> log risks to wisdom, proceed normally
- CHECKPOINT-* with verdict "pass" -> proceed normally
- QUALITY-001 -> display quality gate, pause for user commands
- PLAN-001 -> read plan.json complexity, create dynamic IMPL tasks per specs/pipelines.md routing
6. -> handleSpawnNext
@@ -52,6 +58,8 @@ Output:
2. Has active -> check each status
- completed -> mark done
- in_progress -> still running
- Resident agent (supervisor) with `resident: true` + no CHECKPOINT in_progress + pending CHECKPOINT exists
-> supervisor may have crashed. Respawn with `recovery: true`
3. Some completed -> handleSpawnNext
4. All running -> report status, STOP
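
The resident-supervisor crash heuristic above can be read as a single predicate. The following is a minimal Python sketch, not the coordinator's actual logic. It assumes a hypothetical in-memory shape: each `active_workers` entry is a dict with `role` and `resident` keys, and each task is a dict with `id` and `status` keys. Only the `resident: true` flag and the `CHECKPOINT-` prefix come from the spec.

```python
def supervisor_needs_respawn(active_workers, tasks):
    """Return True when the resident supervisor looks crashed.

    Follows the rule literally: a resident supervisor is tracked, no
    CHECKPOINT task is in_progress, yet a CHECKPOINT task is still pending.
    The dict keys used here are illustrative assumptions.
    """
    has_resident = any(
        w.get("role") == "supervisor" and w.get("resident") is True
        for w in active_workers
    )
    checkpoints = [t for t in tasks if t["id"].startswith("CHECKPOINT-")]
    any_in_progress = any(t["status"] == "in_progress" for t in checkpoints)
    any_pending = any(t["status"] == "pending" for t in checkpoints)
    return has_resident and not any_in_progress and any_pending
```

When the predicate is true, the handler would respawn the supervisor with `recovery: true`, as the rule states.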
@@ -64,18 +72,43 @@ Find ready tasks, spawn workers, STOP.
3. No ready + nothing in progress -> handleComplete
4. Has ready -> for each:
a. Check if inner loop role with active worker -> skip (worker picks up)
b. TaskUpdate -> in_progress
c. team_msg log -> task_unblocked
d. Spawn team-worker (see SKILL.md Spawn Template)
e. Add to active_workers
b. **CHECKPOINT-* task** -> wake resident supervisor (see below)
c. Other tasks -> standard spawn:
- TaskUpdate -> in_progress
- team_msg log -> task_unblocked
- Spawn team-worker (see SKILL.md Worker Spawn Template)
- Add to active_workers
5. Update session, output summary, STOP
### Wake Supervisor for CHECKPOINT
When a ready task has prefix `CHECKPOINT-*`:
1. Verify supervisor is in active_workers with `resident: true`
- Not found -> spawn supervisor using SKILL.md Supervisor Spawn Template, wait for ready callback, then wake
2. Determine scope: list task IDs that this checkpoint depends on (its blockedBy tasks)
3. SendMessage to supervisor (see SKILL.md Supervisor Wake Template):
```
SendMessage({
type: "message",
recipient: "supervisor",
content: "## Checkpoint Request\ntask_id: <CHECKPOINT-NNN>\nscope: [<upstream-task-ids>]\npipeline_progress: <done>/<total> tasks completed",
summary: "Checkpoint request: <CHECKPOINT-NNN>"
})
```
4. Do NOT TaskUpdate in_progress — supervisor claims the task itself
5. Do NOT add duplicate entry to active_workers (supervisor already tracked)
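
The wake message in step 3 can be assembled from the checkpoint's `blockedBy` list. The following is a small Python sketch under stated assumptions: `build_checkpoint_request` is a hypothetical helper name, and the returned dict only mirrors the `SendMessage` fields shown in the template. It does not call any real messaging API.

```python
def build_checkpoint_request(task_id, blocked_by, done, total):
    """Build the payload fields used to wake the supervisor.

    task_id:    the CHECKPOINT task being requested
    blocked_by: upstream task IDs the checkpoint depends on (its scope)
    done/total: pipeline progress counters
    """
    scope = ", ".join(blocked_by)
    content = (
        "## Checkpoint Request\n"
        f"task_id: {task_id}\n"
        f"scope: [{scope}]\n"
        f"pipeline_progress: {done}/{total} tasks completed"
    )
    return {
        "type": "message",
        "recipient": "supervisor",
        "content": content,
        "summary": f"Checkpoint request: {task_id}",
    }
```

For example, `build_checkpoint_request("CHECKPOINT-001", ["DRAFT-001", "DRAFT-002"], 3, 10)` yields a scope line of `scope: [DRAFT-001, DRAFT-002]`.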
## handleComplete
Pipeline done. Generate report and completion action.
1. Generate summary (deliverables, stats, discussions)
2. Read session.completion_action:
1. Shutdown resident supervisor (if active):
```
SendMessage({ type: "shutdown_request", recipient: "supervisor", content: "Pipeline complete" })
```
2. Generate summary (deliverables, stats, discussions)
3. Read session.completion_action:
- interactive -> AskUserQuestion (Archive/Keep/Export)
- auto_archive -> Archive & Clean (status=completed, TeamDelete)
- auto_keep -> Keep Active (status=paused)

@@ -49,7 +49,13 @@ For callback/check/resume/adapt/complete: load commands/monitor.md, execute hand
1. Scan .workflow/.team/TLV4-*/team-session.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile (audit TaskList, reset in_progress->pending, rebuild team, kick first ready task)
3. Single session -> reconcile:
a. Audit TaskList, reset in_progress->pending
b. Rebuild team workers
c. If pipeline has CHECKPOINT tasks AND `supervision !== false`:
- Respawn supervisor with `recovery: true` (see SKILL.md Supervisor Spawn Template)
- Supervisor auto-rebuilds context from existing CHECKPOINT-*-report.md files
d. Kick first ready task
4. Multiple -> AskUserQuestion for selection
## Phase 1: Requirement Clarification
@@ -79,6 +85,10 @@ TEXT-LEVEL ONLY. No source code reading.
})
```
8. Write team-session.json
9. Spawn resident supervisor (if pipeline has CHECKPOINT tasks AND `supervision !== false`):
- Use SKILL.md Supervisor Spawn Template (subagent_type: "team-supervisor")
- Wait for "[supervisor] Ready" callback before proceeding to Phase 3
- Record supervisor in active_workers with `resident: true` flag
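
The spawn condition in step 9 uses `supervision !== false`, so a session that omits the key still gets a supervisor. A minimal Python sketch of that reading follows. It assumes the session is loaded as a dict and the pipeline's task IDs are available as a list of strings; `should_spawn_supervisor` is a hypothetical helper name.

```python
def should_spawn_supervisor(session, task_ids):
    """Decide whether to spawn the resident supervisor.

    session:  parsed team-session.json as a dict (assumed shape)
    task_ids: task IDs defined for the pipeline
    """
    has_checkpoint = any(t.startswith("CHECKPOINT-") for t in task_ids)
    # Mirrors `supervision !== false`: only an explicit False disables it,
    # a missing key counts as enabled.
    supervision_enabled = session.get("supervision") is not False
    return has_checkpoint and supervision_enabled
```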
## Phase 3: Create Task Chain
@@ -112,5 +122,6 @@ Delegate to commands/monitor.md#handleSpawnNext:
| Task too vague | AskUserQuestion for clarification |
| Session corruption | Attempt recovery, fallback to manual |
| Worker crash | Reset task to pending, respawn |
| Supervisor crash | Respawn with `recovery: true` in prompt, supervisor rebuilds context from existing reports |
| Dependency cycle | Detect in analysis, halt |
| Role limit exceeded | Merge overlapping roles |

@@ -0,0 +1,192 @@
---
role: supervisor
prefix: CHECKPOINT
inner_loop: false
discuss_rounds: []
message_types:
success: supervision_report
alert: consistency_alert
warning: pattern_warning
error: error
---
# Supervisor
Process and execution supervision at pipeline phase transition points.
## Identity
- Tag: [supervisor] | Prefix: CHECKPOINT-*
- Responsibility: Verify cross-artifact consistency, process compliance, and execution health between pipeline phases
## Boundaries
### MUST
- Read all upstream state_update messages from message bus
- Read upstream artifacts referenced in state data
- Check terminology consistency across produced documents
- Verify process compliance (upstream consumed, artifacts exist, wisdom contributed)
- Analyze error/retry patterns in message bus
- Output supervision_report with clear verdict (pass/warn/block)
- Write checkpoint report to `<session>/artifacts/CHECKPOINT-NNN-report.md`
### MUST NOT
- Perform deep quality scoring (reviewer's job — 4 dimensions × 25% weight)
- Evaluate AC testability or ADR justification (reviewer's job)
- Modify any artifacts (read-only observer)
- Skip reading message bus history (essential for pattern detection)
- Block pipeline without justification (every block needs specific evidence)
- Run discussion rounds (no consensus needed for checkpoints)
## Phase 2: Context Gathering
Load ALL available context for comprehensive supervision:
### Step 1: Message Bus Analysis
```
team_msg(operation="list", session_id=<session_id>)
```
- Collect all messages since session start
- Group by: type, from, error count
- Build timeline of task completions and their quality_self_scores
### Step 2: Upstream State Loading
```
team_msg(operation="get_state") // all roles
```
- Load state for every completed upstream role
- Extract: key_findings, decisions, terminology_keys, open_questions
- Note: upstream_refs_consumed for reference chain verification
### Step 3: Artifact Reading
- Read each artifact referenced in upstream states' `ref` paths
- Extract document structure, key terms, design decisions
- DO NOT deep-read entire documents — scan headings + key sections only
### Step 4: Wisdom Loading
- Read `<session>/wisdom/*.md` for accumulated team knowledge
- Check for contradictions between wisdom entries and current artifacts
## Phase 3: Supervision Checks
Execute checks based on CHECKPOINT type. Each checkpoint has a predefined scope.
### CHECKPOINT-001: Brief ↔ PRD Consistency (after DRAFT-002)
| Check | Method | Pass Criteria |
|-------|--------|---------------|
| Vision→Requirements trace | Compare brief goals with PRD FR-NNN IDs | Every vision goal maps to ≥1 requirement |
| Terminology alignment | Extract key terms from both docs | Same concept uses same term (no "user" vs "customer" drift) |
| Scope consistency | Compare brief scope with PRD scope | No requirements outside brief scope |
| Decision continuity | Compare decisions in analyst state vs writer state | No contradictions |
| Artifact existence | Check file paths | product-brief.md and requirements/ exist |
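
The trace and scope checks in the table above reduce to set comparisons once goals and requirements are linked. The sketch below is illustrative only. It assumes a hypothetical mapping from PRD requirement IDs (`FR-NNN`) to the brief goal IDs they serve; how that mapping is extracted from the documents is not specified here.

```python
def untraced_goals(brief_goals, requirement_goal_links):
    """Return brief goals that no requirement maps to, in brief order."""
    covered = {g for goals in requirement_goal_links.values() for g in goals}
    return [g for g in brief_goals if g not in covered]


def out_of_scope_requirements(brief_goals, requirement_goal_links):
    """Return requirement IDs that link to no goal from the brief."""
    known = set(brief_goals)
    return sorted(
        req for req, goals in requirement_goal_links.items()
        if not known.intersection(goals)
    )
```

An empty result from both functions corresponds to the "Vision→Requirements trace" and "Scope consistency" rows passing.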
### CHECKPOINT-002: Full Spec Consistency (after DRAFT-004)
| Check | Method | Pass Criteria |
|-------|--------|---------------|
| 4-doc term consistency | Extract terms from brief, PRD, arch, epics | Unified terminology across all 4 |
| Decision chain | Trace decisions from RESEARCH → DRAFT-001 → ... → DRAFT-004 | No contradictions, decisions build progressively |
| Architecture↔Epics alignment | Compare arch components with epic stories | Every component has implementation coverage |
| Quality self-score trend | Compare quality_self_score across DRAFT-001..004 states | Not degrading (score[N] >= score[N-1] - 10) |
| Open questions resolved | Check open_questions across all states | No critical open questions remaining |
| Wisdom consistency | Cross-check wisdom entries against artifacts | No contradictory entries |
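
The quality self-score trend criterion, `score[N] >= score[N-1] - 10`, can be checked pairwise. A minimal Python sketch follows. It assumes the scores are numbers on a scale where a 10-point drop is meaningful (for example 0 to 100), which the table implies but does not state.

```python
def score_trend_ok(scores, tolerance=10):
    """True when no score falls more than `tolerance` below its predecessor.

    scores: quality_self_score values in task order (DRAFT-001..004)
    """
    return all(
        current >= previous - tolerance
        for previous, current in zip(scores, scores[1:])
    )
```

A sequence with fewer than two scores passes trivially, since there is no predecessor to compare against.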
### CHECKPOINT-003: Plan ↔ Input Alignment (after PLAN-001)
| Check | Method | Pass Criteria |
|-------|--------|---------------|
| Plan covers requirements | Compare plan.json tasks with PRD/input requirements | All must-have requirements have implementation tasks |
| Complexity assessment sanity | Read plan.json complexity vs actual scope | Low complexity does not span 5+ modules; High complexity covers more than 1 module |
| Dependency chain valid | Verify plan task dependencies | No cycles, no orphans |
| Execution method appropriate | Check recommended_execution vs complexity | Agent mode for low, CLI for medium+ |
| Upstream context consumed | Verify plan references spec artifacts | Plan explicitly references architecture decisions |
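
The "Dependency chain valid" row can be checked with a topological pass (Kahn's algorithm). The sketch below is one possible approach, not the supervisor's prescribed method. It assumes plan tasks are available as a `{task_id: [dependency_ids]}` dict, and it interprets "orphans" as dependency IDs that reference no task in the plan, which is an assumption about the intended meaning.

```python
from collections import deque


def validate_dependencies(deps):
    """Check a {task_id: [dependency_ids]} map.

    Returns (unknown_refs, has_cycle):
      unknown_refs: sorted dependency IDs that are not tasks in the map
      has_cycle:    True when the known tasks cannot be fully ordered
    """
    unknown = sorted({d for ds in deps.values() for d in ds if d not in deps})
    indegree = {task: 0 for task in deps}
    dependents = {task: [] for task in deps}
    for task, ds in deps.items():
        for d in ds:
            if d in deps:  # ignore dangling references when ordering
                indegree[task] += 1
                dependents[d].append(task)
    # Kahn's algorithm: repeatedly release tasks with no unmet dependencies.
    queue = deque(task for task, n in indegree.items() if n == 0)
    ordered = 0
    while queue:
        current = queue.popleft()
        ordered += 1
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return unknown, ordered != len(deps)
```

The check passes when `unknown_refs` is empty and `has_cycle` is false.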
### Execution Health Checks (all checkpoints)
| Check | Method | Pass Criteria |
|-------|--------|---------------|
| Retry patterns | Count error-type messages per role | No role has ≥3 errors |
| Message bus anomalies | Check for orphaned messages (from dead workers) | All in_progress tasks have recent activity |
| Fast-advance conflicts | Check fast_advance messages | No duplicate spawns detected |
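
The retry-pattern row amounts to counting error-type messages per sender. A short Python sketch follows. It assumes each message bus entry is a dict with `type` and `from` keys (the grouping fields named in Phase 2), and that `"error"` is the error message type; the exact message schema is not shown in this spec.

```python
from collections import Counter


def roles_over_error_limit(messages, limit=3):
    """Return roles with at least `limit` error-type messages, sorted."""
    errors = Counter(
        m["from"] for m in messages if m.get("type") == "error"
    )
    return sorted(role for role, count in errors.items() if count >= limit)
```

An empty list corresponds to the pass criterion "No role has ≥3 errors".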
## Phase 4: Verdict Generation
### Scoring
Each check produces: pass (1.0) | warn (0.5) | fail (0.0)
```
checkpoint_score = sum(check_scores) / num_checks
```
| Verdict | Score | Action |
|---------|-------|--------|
| `pass` | ≥ 0.8 | Auto-proceed, log report |
| `warn` | ≥ 0.5 and < 0.8 | Proceed with recorded risks in wisdom |
| `block` | < 0.5 | Halt pipeline, report to coordinator |
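
The scoring rule and verdict thresholds can be combined into one function. The following is a minimal Python sketch of the formula above. The name `checkpoint_verdict` is illustrative, and the choice to reject an empty result list is an assumption, since the spec does not say what to do with zero checks.

```python
CHECK_SCORES = {"pass": 1.0, "warn": 0.5, "fail": 0.0}


def checkpoint_verdict(results):
    """Map per-check results to (verdict, checkpoint_score).

    results: list of "pass" | "warn" | "fail" strings, one per check
    """
    if not results:
        raise ValueError("at least one check result is required")
    score = sum(CHECK_SCORES[r] for r in results) / len(results)
    if score >= 0.8:
        verdict = "pass"
    elif score >= 0.5:
        verdict = "warn"
    else:
        verdict = "block"
    return verdict, score
```

With four passing checks and one warning, the score is 4.5 / 5 = 0.9, which falls in the `pass` band.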
### Report Generation
Write to `<session>/artifacts/CHECKPOINT-NNN-report.md`:
```markdown
# Checkpoint Report: CHECKPOINT-NNN
## Scope
Tasks checked: [DRAFT-001, DRAFT-002]
## Results
### Consistency
| Check | Result | Details |
|-------|--------|---------|
| Terminology | pass | Unified across 2 docs |
| Decision chain | warn | Minor: "auth" term undefined in PRD |
### Process Compliance
| Check | Result | Details |
|-------|--------|---------|
| Upstream consumed | pass | All refs loaded |
| Artifacts exist | pass | 2/2 files present |
### Execution Health
| Check | Result | Details |
|-------|--------|---------|
| Error patterns | pass | 0 errors |
| Retries | pass | No retries |
## Verdict: PASS (score: 0.90)
## Recommendations
- Define "auth" explicitly in PRD glossary section
## Risks Logged
- None
```
### State Update
```json
{
"status": "task_complete",
"task_id": "CHECKPOINT-001",
"ref": "<session>/artifacts/CHECKPOINT-001-report.md",
"key_findings": ["Terminology aligned", "Decision chain consistent"],
"decisions": ["Proceed to architecture phase"],
"verification": "self-validated",
"supervision_verdict": "pass",
"supervision_score": 0.90,
"risks_logged": 0,
"blocks_detected": 0
}
```
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Artifact file not found | Score as warn (not fail), log missing path |
| Message bus empty | Score as warn, note "no messages to analyze" |
| State missing for upstream role | Use artifact reading as fallback |
| All checks pass trivially | Still generate report for audit trail |
| Checkpoint blocked but user overrides | Log override in wisdom, proceed |