feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture

- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate)
  to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
catlog22
2026-03-24 16:54:48 +08:00
parent 54283e5dbb
commit 1e560ab8e8
334 changed files with 28996 additions and 35516 deletions

@@ -0,0 +1,80 @@
---
role: analyst
prefix: QAANA
inner_loop: false
message_types:
success: analysis_ready
report: quality_report
error: error
---
# Quality Analyst
Analyze defect patterns, coverage gaps, test effectiveness, and generate comprehensive quality reports. Maintain defect pattern database and provide quality scoring.
## Phase 2: Context Loading
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Discovered issues | meta.json -> discovered_issues | No |
| Test strategy | meta.json -> test_strategy | No |
| Generated tests | meta.json -> generated_tests | No |
| Execution results | meta.json -> execution_results | No |
| Historical patterns | meta.json -> defect_patterns | No |
1. Extract session path from task description
2. Read .msg/meta.json for all accumulated QA data
3. Read coverage data from `coverage/coverage-summary.json` if available
4. Read layer execution results from `<session>/results/run-*.json`
5. Select analysis mode:
| Data Points | Mode |
|-------------|------|
| <= 5 issues + results | Direct inline analysis |
| > 5 issues + results | CLI-assisted deep analysis via gemini |
## Phase 3: Multi-Dimensional Analysis
**Five analysis dimensions**:
1. **Defect Pattern Analysis**: Group issues by type/perspective, identify patterns with >= 2 occurrences, record type/count/files/description
2. **Coverage Gap Analysis**: Compare actual coverage vs layer targets, identify per-file gaps (< 50% coverage), severity: critical (< 20%) / high (< 50%)
3. **Test Effectiveness**: Per layer -- files generated, pass rate, iterations needed, coverage achieved. Effective = pass_rate >= 95% AND iterations <= 2
4. **Quality Trend**: Compare against coverage_history. Trend: improving (delta > 5%), declining (delta < -5%), stable
5. **Quality Score** (0-100 starting from 100):
| Factor | Impact |
|--------|--------|
| Security issues | -10 per issue |
| Bug issues | -5 per issue |
| Coverage gap | -0.5 per gap percentage |
| Test failures | -(100 - pass_rate) * 0.3 per layer |
| Effective test layers | +5 per layer |
| Improving trend | +3 |
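The scoring table above can be sketched as a small function. This is only an illustration under stated assumptions: the names `LayerResult`, `is_effective`, and `quality_score` are hypothetical, "coverage gap" is taken to mean total percentage points below target, and the result is clamped to the 0-100 range implied by the heading.

```python
from dataclasses import dataclass


@dataclass
class LayerResult:
    pass_rate: float  # percentage, 0-100
    iterations: int   # fix iterations the layer needed


def is_effective(layer: LayerResult) -> bool:
    # From the text: effective = pass_rate >= 95% AND iterations <= 2
    return layer.pass_rate >= 95 and layer.iterations <= 2


def quality_score(security_issues, bug_issues, coverage_gap_pct, layers, trend_improving):
    score = 100.0
    score -= 10 * security_issues           # -10 per security issue
    score -= 5 * bug_issues                 # -5 per bug issue
    score -= 0.5 * coverage_gap_pct         # -0.5 per gap percentage point (assumed unit)
    for layer in layers:
        score -= (100 - layer.pass_rate) * 0.3  # test failure penalty per layer
        if is_effective(layer):
            score += 5                      # +5 per effective layer
    if trend_improving:
        score += 3                          # +3 for an improving trend
    # Clamping is an assumption based on the "0-100" range in the heading.
    return max(0.0, min(100.0, score))
```

Without the clamp, the bonuses could push a clean run above 100, so the range in the heading is read here as a hard bound.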
For CLI-assisted mode:
```
PURPOSE: Deep quality analysis on QA results to identify defect patterns and improvement opportunities
TASK: Classify defects by root cause, identify high-density files, analyze coverage gaps vs risk, generate recommendations
MODE: analysis
```
## Phase 4: Report Generation & Output
1. Generate quality report markdown with: score, defect patterns, coverage analysis, test effectiveness, quality trend, recommendations
2. Write report to `<session>/analysis/quality-report.md`
3. Update `<session>/wisdom/.msg/meta.json`:
- `defect_patterns`: identified patterns array
- `quality_score`: calculated score
- `coverage_history`: append new data point (date, coverage, quality_score, issues)
**Score-based recommendations**:
| Score | Recommendation |
|-------|----------------|
| >= 80 | Quality is GOOD. Maintain current testing practices. |
| 60-79 | Quality needs IMPROVEMENT. Focus on coverage gaps and recurring patterns. |
| < 60 | Quality is CONCERNING. Recommend comprehensive review and testing effort. |
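A minimal sketch of the score-to-recommendation mapping in the table above. The function name is hypothetical, and fractional scores between 79 and 80 (which the table does not mention) are assumed to fall in the middle band.

```python
def recommendation(score: float) -> str:
    # Thresholds come from the table; handling of fractional scores is an assumption.
    if score >= 80:
        return "Quality is GOOD. Maintain current testing practices."
    if score >= 60:
        return "Quality needs IMPROVEMENT. Focus on coverage gaps and recurring patterns."
    return "Quality is CONCERNING. Recommend comprehensive review and testing effort."
```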

@@ -0,0 +1,72 @@
# Analyze Task
Parse user task -> detect QA capabilities -> build dependency graph -> design roles.
**CONSTRAINT**: Text-level analysis only. NO source code reading, NO codebase exploration.
## Signal Detection
| Keywords | Capability | Prefix |
|----------|------------|--------|
| scan, discover, find issues, audit | scout | SCOUT |
| strategy, plan, test layers, coverage | strategist | QASTRAT |
| generate tests, write tests, create tests | generator | QAGEN |
| run tests, execute, fix tests | executor | QARUN |
| analyze, report, quality score | analyst | QAANA |
## QA Mode Detection
| Condition | Mode |
|-----------|------|
| Keywords: discovery, scan, issues, bug-finding | discovery |
| Keywords: test, coverage, TDD, unit, integration | testing |
| Both keyword types OR no clear match | full |
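One way to read the mode table is as a pair of keyword checks. The sketch below assumes case-insensitive substring matching, which the spec does not state; `detect_mode` and the two constants are hypothetical names.

```python
DISCOVERY_KEYWORDS = ("discovery", "scan", "issues", "bug-finding")
TESTING_KEYWORDS = ("test", "coverage", "tdd", "unit", "integration")


def detect_mode(task_description: str) -> str:
    text = task_description.lower()
    # Substring matching is an assumption; it will also match e.g. "unit" in "community".
    has_discovery = any(k in text for k in DISCOVERY_KEYWORDS)
    has_testing = any(k in text for k in TESTING_KEYWORDS)
    if has_discovery and not has_testing:
        return "discovery"
    if has_testing and not has_discovery:
        return "testing"
    # Both keyword types, or no clear match.
    return "full"
```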
## Dependency Graph
Natural ordering tiers for QA pipeline:
- Tier 0: scout (issue discovery)
- Tier 1: strategist (strategy requires scout discoveries)
- Tier 2: generator (generation requires strategy)
- Tier 3: executor (execution requires generated tests)
- Tier 4: analyst (analysis requires execution results)
## Pipeline Definitions
```
Discovery Mode: SCOUT -> QASTRAT -> QAGEN(L1) -> QARUN(L1) -> QAANA
Testing Mode: QASTRAT -> QAGEN(L1) -> QARUN(L1) -> QAGEN(L2) -> QARUN(L2) -> QAANA
Full Mode: SCOUT -> QASTRAT -> [QAGEN(L1) || QAGEN(L2)] -> [QARUN(L1) || QARUN(L2)] -> QAANA -> SCOUT(regression)
```
## Complexity Scoring
| Factor | Points |
|--------|--------|
| Per capability | +1 |
| Cross-domain (test + discovery) | +2 |
| Parallel tracks | +1 per track |
| Serial depth > 3 | +1 |
Results: 1-3 Low, 4-6 Medium, 7+ High
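The scoring rules above amount to a simple sum. A hedged sketch follows, with a hypothetical function name and parameters; a score of 0, which the bands do not cover, is treated as Low here.

```python
def complexity(capabilities: int, cross_domain: bool, parallel_tracks: int, serial_depth: int) -> dict:
    score = capabilities                      # +1 per capability
    score += 2 if cross_domain else 0         # +2 for cross-domain (test + discovery)
    score += parallel_tracks                  # +1 per parallel track
    score += 1 if serial_depth > 3 else 0     # +1 when serial depth exceeds 3
    if score <= 3:
        level = "Low"
    elif score <= 6:
        level = "Medium"
    else:
        level = "High"
    return {"score": score, "level": level}
```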
## Role Minimization
- Cap at 6 roles (coordinator + 5 workers)
- Merge overlapping capabilities
- Absorb trivial single-step roles
## Output
Write <session>/task-analysis.json:
```json
{
  "task_description": "<original>",
  "pipeline_mode": "<discovery|testing|full>",
  "capabilities": [{ "name": "<cap>", "prefix": "<PREFIX>", "keywords": ["..."] }],
  "dependency_graph": { "<TASK-ID>": { "role": "<role>", "addBlockedBy": ["..."], "priority": "P0|P1|P2" } },
  "roles": [{ "name": "<role>", "prefix": "<PREFIX>", "inner_loop": false }],
  "complexity": { "score": 0, "level": "Low|Medium|High" },
  "gc_loop_enabled": true
}
```

@@ -0,0 +1,108 @@
# Dispatch Tasks
Create task chains from dependency graph with proper blockedBy relationships.
## Workflow
1. Read task-analysis.json -> extract pipeline_mode and dependency_graph
2. Read specs/pipelines.md -> get task registry for selected pipeline
3. Topological sort tasks (respect blockedBy)
4. Validate all owners exist in role registry (SKILL.md)
5. For each task (in order):
- Build JSON entry with structured description (see template below)
- Set blockedBy and owner fields in the entry
6. Write all entries to `<session>/tasks.json`
7. Update session.json with pipeline.tasks_total
8. Validate chain (no orphans, no cycles, all refs valid)
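Steps 3 and 8 of the workflow (topological sort, cycle and reference checks) could be covered by Kahn's algorithm. The sketch below is an assumption about how a coordinator might do this, not the skill's actual implementation; it only relies on the `id` and `blockedBy` fields from the task entries.

```python
from collections import deque


def topological_order(tasks):
    """Return task IDs in dependency order; raise ValueError on bad refs or cycles."""
    by_id = {t["id"]: t for t in tasks}
    for t in tasks:
        for dep in t["blockedBy"]:
            if dep not in by_id:
                raise ValueError(f"{t['id']} is blocked by unknown task {dep}")

    # Count unresolved dependencies and record who depends on whom.
    remaining = {t["id"]: len(t["blockedBy"]) for t in tasks}
    dependents = {tid: [] for tid in by_id}
    for t in tasks:
        for dep in t["blockedBy"]:
            dependents[dep].append(t["id"])

    queue = deque(tid for tid, n in remaining.items() if n == 0)
    order = []
    while queue:
        tid = queue.popleft()
        order.append(tid)
        for child in dependents[tid]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)

    # Any task never reaching zero unresolved deps is part of a cycle.
    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order
```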
## Task Description Template
Each task is a JSON entry in the tasks array:
```json
{
  "id": "<TASK-ID>",
  "subject": "<TASK-ID>",
  "description": "PURPOSE: <goal> | Success: <criteria>\nTASK:\n - <step 1>\n - <step 2>\nCONTEXT:\n - Session: <session-folder>\n - Layer: <L1-unit|L2-integration|L3-e2e> (if applicable)\n - Upstream artifacts: <list>\n - Shared memory: <session>/wisdom/.msg/meta.json\nEXPECTED: <artifact path> + <quality criteria>\nCONSTRAINTS: <scope limits>\n---\nInnerLoop: <true|false>\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/<role>/role.md",
  "status": "pending",
  "owner": "<role>",
  "blockedBy": ["<dependency-list>"]
}
```
## Pipeline Task Registry
### Discovery Mode
```
SCOUT-001 (scout): Multi-perspective issue scanning
  blockedBy: []
QASTRAT-001 (strategist): Test strategy formulation
  blockedBy: [SCOUT-001]
QAGEN-001 (generator): L1 unit test generation
  blockedBy: [QASTRAT-001], meta: layer=L1
QARUN-001 (executor): L1 test execution + fix cycles
  blockedBy: [QAGEN-001], inner_loop: true, meta: layer=L1
QAANA-001 (analyst): Quality analysis report
  blockedBy: [QARUN-001]
```
### Testing Mode
```
QASTRAT-001 (strategist): Test strategy formulation
  blockedBy: []
QAGEN-L1-001 (generator): L1 unit test generation
  blockedBy: [QASTRAT-001], meta: layer=L1
QARUN-L1-001 (executor): L1 test execution + fix cycles
  blockedBy: [QAGEN-L1-001], inner_loop: true, meta: layer=L1
QAGEN-L2-001 (generator): L2 integration test generation
  blockedBy: [QARUN-L1-001], meta: layer=L2
QARUN-L2-001 (executor): L2 test execution + fix cycles
  blockedBy: [QAGEN-L2-001], inner_loop: true, meta: layer=L2
QAANA-001 (analyst): Quality analysis report
  blockedBy: [QARUN-L2-001]
```
### Full Mode
```
SCOUT-001 (scout): Multi-perspective issue scanning
  blockedBy: []
QASTRAT-001 (strategist): Test strategy formulation
  blockedBy: [SCOUT-001]
QAGEN-L1-001 (generator-1): L1 unit test generation
  blockedBy: [QASTRAT-001], meta: layer=L1
QAGEN-L2-001 (generator-2): L2 integration test generation
  blockedBy: [QASTRAT-001], meta: layer=L2
QARUN-L1-001 (executor-1): L1 test execution + fix cycles
  blockedBy: [QAGEN-L1-001], inner_loop: true, meta: layer=L1
QARUN-L2-001 (executor-2): L2 test execution + fix cycles
  blockedBy: [QAGEN-L2-001], inner_loop: true, meta: layer=L2
QAANA-001 (analyst): Quality analysis report
  blockedBy: [QARUN-L1-001, QARUN-L2-001]
SCOUT-002 (scout): Regression scan after fixes
  blockedBy: [QAANA-001]
```
## InnerLoop Flag Rules
- true: executor roles (run-fix cycles)
- false: scout, strategist, generator, analyst roles
## Dependency Validation
- No orphan tasks (all tasks have valid owner)
- No circular dependencies
- All blockedBy references exist
- Session reference in every task description
- RoleSpec reference in every task description
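The non-cycle rules in this checklist can be checked with a single pass over the task entries. The following is a sketch only: `validate_tasks` and `known_roles` are hypothetical names, and the `Session:` / `RoleSpec:` markers are taken from the description template. Cycle detection is left out here; a topological sort would catch it.

```python
def validate_tasks(tasks, known_roles):
    """Return a list of human-readable validation errors (empty if all rules hold)."""
    ids = {t["id"] for t in tasks}
    errors = []
    for t in tasks:
        tid = t["id"]
        # No orphan tasks: every task needs a valid owner.
        if t.get("owner") not in known_roles:
            errors.append(f"{tid}: unknown owner {t.get('owner')!r}")
        # All blockedBy references must exist.
        for dep in t.get("blockedBy", []):
            if dep not in ids:
                errors.append(f"{tid}: blockedBy references missing task {dep}")
        # Session and RoleSpec references must appear in the description.
        desc = t.get("description", "")
        if "Session:" not in desc:
            errors.append(f"{tid}: description has no Session reference")
        if "RoleSpec:" not in desc:
            errors.append(f"{tid}: description has no RoleSpec reference")
    return errors
```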
## Log After Creation
```
mcp__ccw-tools__team_msg({
  operation: "log",
  session_id: <session-id>,
  from: "coordinator",
  type: "pipeline_selected",
  data: { pipeline: "<mode>", task_count: <N> }
})
```

@@ -0,0 +1,209 @@
# Monitor Pipeline
Event-driven pipeline coordination. Beat model: coordinator wake -> process -> spawn -> STOP.
## Constants
- SPAWN_MODE: background
- ONE_STEP_PER_INVOCATION: true
- FAST_ADVANCE_AWARE: true
- WORKER_AGENT: team-worker
- MAX_GC_ROUNDS: 3
## Handler Router
| Source | Handler |
|--------|---------|
| Message contains [scout], [strategist], [generator], [executor], [analyst] | handleCallback |
| "capability_gap" | handleAdapt |
| "check" or "status" | handleCheck |
| "resume" or "continue" | handleResume |
| All tasks completed | handleComplete |
| Default | handleSpawnNext |
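The router table can be read as a first-match dispatch. The sketch below assumes rows are checked top to bottom, that role tags are matched as substrings, and that the command words are matched as whole words; none of this is spelled out in the spec, and `route` is a hypothetical name.

```python
ROLE_TAGS = ("[scout]", "[strategist]", "[generator]", "[executor]", "[analyst]")


def route(message: str, all_tasks_completed: bool) -> str:
    text = message.lower()
    words = text.split()
    if any(tag in text for tag in ROLE_TAGS):
        return "handleCallback"
    if "capability_gap" in text:
        return "handleAdapt"
    if "check" in words or "status" in words:
        return "handleCheck"
    if "resume" in words or "continue" in words:
        return "handleResume"
    if all_tasks_completed:
        return "handleComplete"
    return "handleSpawnNext"  # default row
```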
## handleCallback
Worker completed. Process and advance.
1. Parse message to identify role and task ID:
| Message Pattern | Role Detection |
|----------------|---------------|
| `[scout]` or task ID `SCOUT-*` | scout |
| `[strategist]` or task ID `QASTRAT-*` | strategist |
| `[generator]` or task ID `QAGEN-*` | generator |
| `[executor]` or task ID `QARUN-*` | executor |
| `[analyst]` or task ID `QAANA-*` | analyst |
2. Check if progress update (inner loop) or final completion
3. Progress -> update session state, STOP
4. Completion -> mark task done (read `<session>/tasks.json`, set status to "completed", write back), remove from active_workers
5. Check for checkpoints:
- QARUN-* completes -> read meta.json for coverage:
- coverage >= target OR gc_rounds >= MAX_GC_ROUNDS -> proceed to handleSpawnNext
- coverage < target AND gc_rounds < MAX_GC_ROUNDS -> create GC fix tasks, increment gc_rounds
**GC Fix Task Creation** (when coverage below target) -- add new entries to `<session>/tasks.json`:
```json
{
  "id": "QAGEN-fix-<round>",
  "subject": "QAGEN-fix-<round>: Fix tests for <layer> (GC #<round>)",
  "description": "PURPOSE: Fix failing tests and improve coverage | Success: Coverage meets target\nTASK:\n - Load execution results and failing test details\n - Fix broken tests and add missing coverage\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Previous results: <session>/results/run-<layer>.json\nEXPECTED: Fixed test files | Improved coverage\nCONSTRAINTS: Only modify test files | No source changes\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/generator/role.md",
  "status": "pending",
  "owner": "generator",
  "blockedBy": []
}
```
```json
{
  "id": "QARUN-gc-<round>",
  "subject": "QARUN-gc-<round>: Re-execute <layer> (GC #<round>)",
  "description": "PURPOSE: Re-execute tests after fixes | Success: Coverage >= target\nTASK: Execute test suite, measure coverage, report results\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\nEXPECTED: <session>/results/run-<layer>-gc-<round>.json\nCONSTRAINTS: Read-only execution\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/executor/role.md",
  "status": "pending",
  "owner": "executor",
  "blockedBy": ["QAGEN-fix-<round>"]
}
```
6. -> handleSpawnNext
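The coverage checkpoint in step 5 reduces to one comparison. A minimal sketch, assuming the new GC round number is the incremented `gc_rounds` value (the spec does not say which number goes into the task IDs); `gc_checkpoint` is a hypothetical helper.

```python
MAX_GC_ROUNDS = 3  # from the Constants section


def gc_checkpoint(coverage: float, target: float, gc_rounds: int) -> dict:
    # Proceed when coverage meets the target or the GC budget is used up.
    if coverage >= target or gc_rounds >= MAX_GC_ROUNDS:
        return {"action": "proceed", "gc_rounds": gc_rounds, "task_ids": []}
    # Otherwise start another generator-executor round.
    next_round = gc_rounds + 1  # assumption: IDs use the incremented round number
    return {
        "action": "create_gc_tasks",
        "gc_rounds": next_round,
        "task_ids": [f"QAGEN-fix-{next_round}", f"QARUN-gc-{next_round}"],
    }
```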
## handleCheck
Read-only status report, then STOP.
Output:
```
[coordinator] QA Pipeline Status
[coordinator] Mode: <pipeline_mode>
[coordinator] Progress: <done>/<total> (<pct>%)
[coordinator] GC Rounds: <gc_rounds>/3
[coordinator] Pipeline Graph:
SCOUT-001: <done|run|wait> <summary>
QASTRAT-001: <done|run|wait> <summary>
QAGEN-001: <done|run|wait> <summary>
QARUN-001: <done|run|wait> <summary>
QAANA-001: <done|run|wait> <summary>
[coordinator] Active Workers: <list with elapsed time>
[coordinator] Ready: <pending tasks with resolved deps>
[coordinator] Commands: 'resume' to advance | 'check' to refresh
```
Then STOP.
## handleResume
1. No active workers -> handleSpawnNext
2. Has active -> check each status
- completed -> mark done (update tasks.json)
- in_progress -> still running
3. Some completed -> handleSpawnNext
4. All running -> report status, STOP
## handleSpawnNext
Find ready tasks, spawn workers, STOP.
1. Collect from `<session>/tasks.json`:
- completedSubjects: status = completed
- inProgressSubjects: status = in_progress
- readySubjects: status = pending AND all blockedBy in completedSubjects
2. No ready + work in progress -> report waiting, STOP
3. No ready + nothing in progress -> handleComplete
4. Has ready -> for each:
a. Determine role from task prefix:
| Prefix | Role | inner_loop |
|--------|------|------------|
| SCOUT-* | scout | false |
| QASTRAT-* | strategist | false |
| QAGEN-* | generator | false |
| QARUN-* | executor | true |
| QAANA-* | analyst | false |
b. Check if inner loop role with active worker -> skip (worker picks up next task)
c. Update task status to "in_progress" in tasks.json
d. team_msg log -> task_unblocked
e. Spawn team-worker:
```
spawn_agent({
  agent_type: "team_worker",
  items: [{
    description: "Spawn <role> worker for <subject>",
    team_name: "quality-assurance",
    name: "<role>",
    prompt: `## Role Assignment
role: <role>
role_spec: ~ or <project>/.codex/skills/team-quality-assurance/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
team_name: quality-assurance
requirement: <task-description>
inner_loop: <true|false>
## Current Task
- Task ID: <task-id>
- Task: <subject>
Read role_spec file to load Phase 2-4 domain instructions.
Execute built-in Phase 1 (task discovery) -> role Phase 2-4 -> built-in Phase 5 (report).`
  }]
})
```
f. Add to active_workers
5. Update session, output summary, STOP
6. Use `wait_agent({ ids: [<spawned-agent-ids>] })` to wait for callbacks. Workers use `report_agent_job_result()` to send results back.
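Steps 1 to 4 of handleSpawnNext boil down to classifying the entries in `tasks.json`. The sketch below assumes `blockedBy` holds task IDs (the task template sets `subject` equal to the ID) and uses a hypothetical `next_action` helper; actual spawning is left out.

```python
def next_action(tasks):
    """Decide what the coordinator should do next, based on task statuses."""
    completed = {t["id"] for t in tasks if t["status"] == "completed"}
    in_progress = [t["id"] for t in tasks if t["status"] == "in_progress"]
    ready = [
        t["id"]
        for t in tasks
        if t["status"] == "pending" and all(dep in completed for dep in t["blockedBy"])
    ]
    if ready:
        return ("spawn", ready)      # has ready tasks: spawn a worker for each
    if in_progress:
        return ("wait", [])          # no ready + work in progress: report waiting
    return ("complete", [])          # no ready + nothing in progress: handleComplete
```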
## handleComplete
Pipeline done. Generate report and completion action.
1. Verify all tasks (including GC fix/recheck tasks) have status "completed" or "deleted" in tasks.json
2. If any tasks incomplete -> return to handleSpawnNext
3. If all complete:
- Read final state from meta.json (quality_score, coverage, gc_rounds)
- Generate summary (deliverables, stats, discussions)
4. Read session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean (status=completed, remove/archive session folder)
- auto_keep -> Keep Active (status=paused)
## handleAdapt
Capability gap reported mid-pipeline.
1. Parse gap description
2. Check if existing role covers it -> redirect
3. Role count < 6 -> generate dynamic role-spec in <session>/role-specs/
4. Add new task entry to tasks.json, spawn worker
5. Role count >= 6 -> merge or pause
## Fast-Advance Reconciliation
On every coordinator wake:
1. Read team_msg entries with type="fast_advance"
2. Sync active_workers with spawned successors
3. No duplicate spawns
## Phase 4: State Persistence
After every handler execution:
1. Reconcile active_workers with actual tasks.json states
2. Remove entries for completed/deleted tasks
3. Write updated meta.json
4. STOP (wait for next callback)
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Session file not found | Error, suggest re-initialization |
| Worker callback from unknown role | Log info, scan for other completions |
| Pipeline stall (no ready, no running, has pending) | Check blockedBy chains, report to user |
| GC loop exceeded | Accept current coverage with warning, proceed |
| Scout finds 0 issues | Skip to testing mode, proceed to QASTRAT |

@@ -0,0 +1,143 @@
# Coordinator Role
Orchestrate team-quality-assurance: analyze -> dispatch -> spawn -> monitor -> report.
## Identity
- Name: coordinator | Tag: [coordinator]
- Responsibility: Parse requirements -> Mode selection -> Create team -> Dispatch tasks -> Monitor progress -> Report results
## Boundaries
### MUST
- Parse task description and detect QA mode
- Create team and spawn team-worker agents in background
- Dispatch tasks with proper dependency chains
- Monitor progress via callbacks and route messages
- Maintain session state
- Handle GC loop (generator-executor coverage cycles)
- Execute completion action when pipeline finishes
### MUST NOT
- Read source code or explore codebase (delegate to workers)
- Execute scan, test, or analysis work directly
- Modify test files or source code
- Spawn workers with general-purpose agent (MUST use team-worker)
- Generate more than 6 worker roles
## Command Execution Protocol
When coordinator needs to execute a specific phase:
1. Read `commands/<command>.md`
2. Follow the workflow defined in the command
3. Commands are inline execution guides, NOT separate agents
4. Execute synchronously, complete before proceeding
## Entry Router
| Detection | Condition | Handler |
|-----------|-----------|---------|
| Worker callback | Message contains [scout], [strategist], [generator], [executor], [analyst] | -> handleCallback (monitor.md) |
| Status check | Args contain "check" or "status" | -> handleCheck (monitor.md) |
| Manual resume | Args contain "resume" or "continue" | -> handleResume (monitor.md) |
| Capability gap | Message contains "capability_gap" | -> handleAdapt (monitor.md) |
| Pipeline complete | All tasks completed | -> handleComplete (monitor.md) |
| Interrupted session | Active session in .workflow/.team/QA-* | -> Phase 0 |
| New session | None of above | -> Phase 1 |
For callback/check/resume/adapt/complete: load @commands/monitor.md, execute handler, STOP.
## Phase 0: Session Resume Check
1. Scan .workflow/.team/QA-*/session.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile (audit tasks.json, reset in_progress->pending, rebuild team, kick first ready task)
4. Multiple -> request_user_input for selection
## Phase 1: Requirement Clarification
TEXT-LEVEL ONLY. No source code reading.
1. Parse task description and extract flags
2. **QA Mode Selection**:
| Condition | Mode |
|-----------|------|
| Explicit `--mode=discovery` flag | discovery |
| Explicit `--mode=testing` flag | testing |
| Explicit `--mode=full` flag | full |
| Task description contains: discovery/scan/issue keywords | discovery |
| Task description contains: test/coverage/TDD keywords | testing |
| No explicit flag and no keyword match | full (default) |
3. Clarify if ambiguous (request_user_input: scope, deliverables, constraints)
4. Delegate to @commands/analyze.md
5. Output: task-analysis.json
6. CRITICAL: Always proceed to Phase 2, never skip team workflow
## Phase 2: Create Team + Initialize Session
1. Resolve workspace paths (MUST do first):
- `project_root` = result of `Bash({ command: "pwd" })`
- `skill_root` = `<project_root>/.codex/skills/team-quality-assurance`
2. Generate session ID: QA-<slug>-<date>
3. Create session folder structure
4. Initialize session folder structure (replaces TeamCreate)
5. Read specs/pipelines.md -> select pipeline based on mode
6. Register roles in session.json
7. Initialize shared infrastructure (wisdom/*.md)
8. Initialize pipeline via team_msg state_update:
```
mcp__ccw-tools__team_msg({
  operation: "log", session_id: "<id>", from: "coordinator",
  type: "state_update", summary: "Session initialized",
  data: {
    pipeline_mode: "<discovery|testing|full>",
    pipeline_stages: [...],
    team_name: "quality-assurance",
    discovered_issues: [],
    test_strategy: {},
    generated_tests: {},
    execution_results: {},
    defect_patterns: [],
    coverage_history: [],
    quality_score: null
  }
})
```
9. Write session.json
## Phase 3: Create Task Chain
Delegate to @commands/dispatch.md:
1. Read dependency graph from task-analysis.json
2. Read specs/pipelines.md for selected pipeline's task registry
3. Topological sort tasks
4. Build tasks array as JSON entries in `<session>/tasks.json`; set deps via `blockedBy` field in each entry
5. Update session.json
## Phase 4: Spawn-and-Stop
Delegate to @commands/monitor.md#handleSpawnNext:
1. Find ready tasks (pending + all `blockedBy` dependencies resolved)
2. Spawn team-worker agents (see SKILL.md Spawn Template)
3. Output status summary
4. STOP
## Phase 5: Report + Completion Action
1. Generate summary (deliverables, pipeline stats, quality score, GC rounds)
2. Execute completion action per session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean
- auto_keep -> Keep Active
## Error Handling
| Error | Resolution |
|-------|------------|
| Task too vague | request_user_input for clarification |
| Session corruption | Attempt recovery, fallback to manual |
| Worker crash | Reset task to pending, respawn |
| Dependency cycle | Detect in analysis, halt |
| Scout finds nothing | Skip to testing mode |
| GC loop stuck > 3 | Accept current coverage with warning |
| quality_score < 60 | Report with WARNING, suggest re-run |

@@ -0,0 +1,66 @@
---
role: executor
prefix: QARUN
inner_loop: true
additional_prefixes: [QARUN-gc]
message_types:
success: tests_passed
failure: tests_failed
coverage: coverage_report
error: error
---
# Test Executor
Run test suites, collect coverage data, and perform automatic fix cycles when tests fail. Implements the execution side of the Generator-Executor (GC) loop.
## Phase 2: Environment Detection
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Test strategy | meta.json -> test_strategy | Yes |
| Generated tests | meta.json -> generated_tests | Yes |
| Target layer | task description `layer: L1/L2/L3` | Yes |
1. Extract session path and target layer from task description
2. Load validation specs: Run `ccw spec load --category validation` for verification rules and acceptance criteria
3. Read .msg/meta.json for strategy and generated test file list
4. Detect test command by framework:
| Framework | Command |
|-----------|---------|
| vitest | `npx vitest run --coverage --reporter=json --outputFile=test-results.json` |
| jest | `npx jest --coverage --json --outputFile=test-results.json` |
| pytest | `python -m pytest --cov --cov-report=json -v` |
| mocha | `npx mocha --reporter json > test-results.json` |
| unknown | `npm test -- --coverage` |
5. Get test files from `generated_tests[targetLayer].files`
## Phase 3: Iterative Test-Fix Cycle
**Max iterations**: 5. **Pass threshold**: 95% or all tests pass.
Per iteration:
1. Run test command, capture output
2. Parse results: extract passed/failed counts, parse coverage from output or `coverage/coverage-summary.json`
3. If all pass (0 failures) -> exit loop (success)
4. If pass rate >= 95% and iteration >= 2 -> exit loop (good enough)
5. If iteration >= MAX -> exit loop (report current state)
6. Extract failure details (error lines, assertion failures)
7. Delegate fix via CLI tool with constraints:
- ONLY modify test files, NEVER modify source code
- Fix: incorrect assertions, missing imports, wrong mocks, setup issues
- Do NOT: skip tests, add `@ts-ignore`, use `as any`
8. Increment iteration, repeat
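The three exit conditions of the loop (steps 3 to 5) can be expressed as one check evaluated after each run. This is a sketch under the assumption that iterations are counted from 1; `exit_reason` is a hypothetical name and the fix delegation itself is not shown.

```python
MAX_ITERATIONS = 5
PASS_THRESHOLD = 95.0  # percent


def exit_reason(passed: int, failed: int, iteration: int):
    """Return why the loop should stop, or None to run another fix iteration."""
    total = passed + failed
    pass_rate = 100.0 * passed / total if total else 0.0
    if failed == 0:
        return "success"            # step 3: all tests pass
    if pass_rate >= PASS_THRESHOLD and iteration >= 2:
        return "good_enough"        # step 4: threshold met after at least 2 iterations
    if iteration >= MAX_ITERATIONS:
        return "max_iterations"     # step 5: report current state
    return None                     # continue with steps 6-8
```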
## Phase 4: Result Analysis & Output
1. Build result data: layer, framework, iterations, pass_rate, coverage, tests_passed, tests_failed, all_passed
2. Save results to `<session>/results/run-<layer>.json`
3. Save last test output to `<session>/results/output-<layer>.txt`
4. Update `<session>/wisdom/.msg/meta.json` under `execution_results[layer]` and top-level `execution_results.pass_rate`, `execution_results.coverage`
5. Message type: `tests_passed` if all_passed, else `tests_failed`

@@ -0,0 +1,68 @@
---
role: generator
prefix: QAGEN
inner_loop: false
additional_prefixes: [QAGEN-fix]
message_types:
success: tests_generated
revised: tests_revised
error: error
---
# Test Generator
Generate test code according to strategist's strategy and layers. Support L1 unit tests, L2 integration tests, L3 E2E tests. Follow project's existing test patterns and framework conventions.
## Phase 2: Strategy & Pattern Loading
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Test strategy | meta.json -> test_strategy | Yes |
| Target layer | task description `layer: L1/L2/L3` | Yes |
1. Extract session path and target layer from task description
2. Read .msg/meta.json for test strategy (layers, coverage targets)
3. Determine if this is a GC fix task (subject contains "fix")
4. Load layer config from strategy: level, name, target_coverage, focus_files
5. Learn existing test patterns -- find 3 similar test files via Glob(`**/*.{test,spec}.{ts,tsx,js,jsx}`)
6. Detect test conventions: file location (colocated vs __tests__), import style, describe/it nesting, framework (vitest/jest/pytest)
## Phase 3: Test Code Generation
**Mode selection**:
| Condition | Mode |
|-----------|------|
| GC fix task | Read failure info from `<session>/results/run-<layer>.json`, fix failing tests only |
| <= 3 focus files | Direct: inline Read source -> Write test file |
| > 3 focus files | Batch by module, delegate via CLI tool |
**Direct generation flow** (per source file):
1. Read source file content, extract exports
2. Determine test file path following project conventions
3. If test exists -> analyze missing cases -> append new tests via Edit
4. If no test -> generate full test file via Write
5. Include: happy path, edge cases, error cases per export
**GC fix flow**:
1. Read execution results and failure output from results directory
2. Read each failing test file
3. Fix assertions, imports, mocks, or test setup
4. Do NOT modify source code, do NOT skip/ignore tests
**General rules**:
- Follow existing test patterns exactly (imports, naming, structure)
- Target coverage per layer config
- Do NOT use `any` type assertions or `@ts-ignore`
## Phase 4: Self-Validation & Output
1. Collect generated/modified test files
2. Run syntax check (TypeScript: `tsc --noEmit`, or framework-specific)
3. Auto-fix syntax errors (max 3 attempts)
4. Write test metadata to `<session>/wisdom/.msg/meta.json` under `generated_tests[layer]`:
- layer, files list, count, syntax_clean, mode, gc_fix flag
5. Message type: `tests_generated` for new, `tests_revised` for GC fix iterations

@@ -0,0 +1,67 @@
---
role: scout
prefix: SCOUT
inner_loop: false
message_types:
success: scan_ready
error: error
issues: issues_found
---
# Multi-Perspective Scout
Scan codebase from multiple perspectives (bug, security, test-coverage, code-quality, UX) to discover potential issues. Produce structured scan results with severity-ranked findings.
## Phase 2: Context & Scope Assessment
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
1. Extract session path and target scope from task description
2. Determine scan scope: explicit scope from task or `**/*` default
3. Get recent changed files: `git diff --name-only HEAD~5 2>/dev/null || echo ""`
4. Read .msg/meta.json for historical defect patterns (`defect_patterns`)
5. Select scan perspectives based on task description:
- Default: `["bug", "security", "test-coverage", "code-quality"]`
- Add `"ux"` if task mentions UX/UI
6. Assess complexity to determine scan strategy:
| Complexity | Condition | Strategy |
|------------|-----------|----------|
| Low | < 5 changed files, no specific keywords | ACE search + Grep inline |
| Medium | 5-15 files or specific perspective requested | CLI fan-out (3 core perspectives) |
| High | > 15 files or full-project scan | CLI fan-out (all perspectives) |
## Phase 3: Multi-Perspective Scan
**Low complexity**: Use `mcp__ace-tool__search_context` for quick pattern-based scan.
**Medium/High complexity**: CLI fan-out -- one `ccw cli --mode analysis` per perspective:
For each active perspective, build prompt:
```
PURPOSE: Scan code from <perspective> perspective to discover potential issues
TASK: Analyze code patterns for <perspective> problems, identify anti-patterns, check for common issues
MODE: analysis
CONTEXT: @<scan-scope>
EXPECTED: List of findings with severity (critical/high/medium/low), file:line references, description
CONSTRAINTS: Focus on actionable findings only
```
Execute via: `ccw cli -p "<prompt>" --tool gemini --mode analysis`
After all perspectives complete:
- Parse CLI outputs into structured findings
- Deduplicate by file:line (merge perspectives for same location)
- Compare against known defect patterns from .msg/meta.json
- Rank by severity: critical > high > medium > low
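The deduplication and ranking steps above might look like the following. It is a sketch with assumptions: findings are dicts with `file`, `line`, `severity`, `perspective`, and `description` keys (mirroring the fields listed in Phase 4), and when two perspectives report the same location the higher severity wins, which the spec does not specify.

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def merge_findings(findings):
    """Deduplicate by (file, line), merge perspectives, and sort by severity."""
    merged = {}
    for f in findings:
        key = (f["file"], f["line"])
        if key not in merged:
            merged[key] = {
                "file": f["file"],
                "line": f["line"],
                "severity": f["severity"],
                "perspectives": [f["perspective"]],
                "description": f["description"],
            }
            continue
        entry = merged[key]
        if f["perspective"] not in entry["perspectives"]:
            entry["perspectives"].append(f["perspective"])
        # Assumption: keep the most severe rating and its description.
        if SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[entry["severity"]]:
            entry["severity"] = f["severity"]
            entry["description"] = f["description"]
    # sorted() is stable, so equal severities keep discovery order.
    return sorted(merged.values(), key=lambda e: SEVERITY_RANK[e["severity"]])
```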
## Phase 4: Result Aggregation
1. Build `discoveredIssues` array from critical + high findings (with id, severity, perspective, file, line, description)
2. Write scan results to `<session>/scan/scan-results.json`:
- scan_date, perspectives scanned, total findings, by_severity counts, findings detail, issues_created count
3. Update `<session>/wisdom/.msg/meta.json`: merge `discovered_issues` field
4. Contribute to wisdom/issues.md if new patterns found

@@ -0,0 +1,71 @@
---
role: strategist
prefix: QASTRAT
inner_loop: false
message_types:
success: strategy_ready
error: error
---
# Test Strategist
Analyze change scope, determine test layers (L1-L3), define coverage targets, and generate test strategy document. Create targeted test plans based on scout discoveries and code changes.
## Phase 2: Context & Change Analysis
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Discovered issues | meta.json -> discovered_issues | No |
| Defect patterns | meta.json -> defect_patterns | No |
1. Extract session path from task description
2. Read .msg/meta.json for scout discoveries and historical patterns
3. Analyze change scope: `git diff --name-only HEAD~5`
4. Categorize changed files:
| Category | Pattern |
|----------|---------|
| Source | `\.(ts\|tsx\|js\|jsx\|py\|java\|go\|rs)$` |
| Test | `\.(test\|spec)\.(ts\|tsx\|js\|jsx)$` or `test_` |
| Config | `\.(json\|yaml\|yml\|toml\|env)$` |
5. Detect test framework from package.json / project files
6. Check existing coverage baseline from `coverage/coverage-summary.json`
7. Select analysis mode:
| Total Scope | Mode |
|-------------|------|
| <= 5 files + issues | Direct inline analysis |
| 6-15 | Single CLI analysis |
| > 15 | Multi-dimension CLI analysis |
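The file categorization in step 4 maps directly onto three regular expressions. Since a file such as `auth.test.ts` matches both the Source and Test patterns, the sketch below checks Test first; that ordering, the `"other"` fallback, and the `categorize` name are assumptions.

```python
import re

TEST_RE = re.compile(r"\.(test|spec)\.(ts|tsx|js|jsx)$|test_")
SOURCE_RE = re.compile(r"\.(ts|tsx|js|jsx|py|java|go|rs)$")
CONFIG_RE = re.compile(r"\.(json|yaml|yml|toml|env)$")


def categorize(path: str) -> str:
    # Test is checked first because test files also match the source pattern.
    if TEST_RE.search(path):
        return "test"
    if SOURCE_RE.search(path):
        return "source"
    if CONFIG_RE.search(path):
        return "config"
    return "other"
```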
## Phase 3: Strategy Generation
**Layer Selection Logic**:
| Condition | Layer | Target |
|-----------|-------|--------|
| Has source file changes | L1: Unit Tests | 80% |
| >= 3 source files OR critical issues | L2: Integration Tests | 60% |
| >= 3 critical/high severity issues | L3: E2E Tests | 40% |
| No changes but has scout issues | L1 focused on issue files | 80% |
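A hedged reading of the layer selection table as code. The inputs are hypothetical counts; "critical issues" in the L2 row is taken to mean at least one critical issue, and the L3 row is taken to count critical and high issues together.

```python
def select_layers(source_files: int, critical: int, high: int, scout_issues: int) -> list:
    layers = []
    if source_files > 0:
        layers.append({"level": "L1", "name": "Unit Tests", "target_coverage": 80})
    elif scout_issues > 0:
        # No changes but scout issues exist: L1 focused on the issue files.
        layers.append({"level": "L1", "name": "Unit Tests (issue files)", "target_coverage": 80})
    if source_files >= 3 or critical > 0:
        layers.append({"level": "L2", "name": "Integration Tests", "target_coverage": 60})
    if critical + high >= 3:
        layers.append({"level": "L3", "name": "E2E Tests", "target_coverage": 40})
    return layers
```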
For CLI-assisted analysis, use:
```
PURPOSE: Analyze code changes and scout findings to determine optimal test strategy
TASK: Classify changed files by risk, map issues to test requirements, identify integration points, recommend test layers with coverage targets
MODE: analysis
```
Build strategy document with: scope analysis, layer configs (level, name, target_coverage, focus_files, rationale), priority issues list.
**Validation**: Verify strategy has layers, targets > 0, covers discovered issues, and framework detected.
## Phase 4: Output & Persistence
1. Write strategy to `<session>/strategy/test-strategy.md`
2. Update `<session>/wisdom/.msg/meta.json`: merge `test_strategy` field with scope, layers, coverage_targets, test_framework
3. Contribute to wisdom/decisions.md with layer selection rationale