feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture

- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate)
  to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: catlog22
Date: 2026-03-24 16:54:48 +08:00
Parent: 54283e5dbb
Commit: 1e560ab8e8
334 changed files with 28,996 additions and 35,516 deletions


@@ -1,813 +1,148 @@
---
name: team-quality-assurance
description: Unified team skill for quality assurance. Full closed-loop QA combining issue discovery and software testing. Scout -> Strategist -> Generator -> Executor -> Analyst with multi-perspective scanning, progressive test layers, GC loops, and quality scoring. Supports discovery, testing, and full QA modes. Triggers on "team quality-assurance", "team qa".
argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] [--mode=discovery|testing|full] \"task description\""
allowed-tools: spawn_agents_on_csv, spawn_agent(*), wait(*), wait_agent(*), send_input(*), close_agent(*), report_agent_job_result(*), request_user_input(*), Read(*), Write(*), Edit(*), Bash(*), Glob(*), Grep(*)
---
# Team Quality Assurance
Orchestrate multi-agent QA: scout -> strategist -> generator -> executor -> analyst. Supports discovery, testing, and full closed-loop modes with parallel generation and GC loops.
## Auto Mode
When `--yes` or `-y`: auto-confirm task decomposition, skip interactive validation, and use defaults.
## Usage
```bash
$team-quality-assurance "Full QA for the authentication module"
$team-quality-assurance --mode=discovery "Scan codebase for security and bug issues"
$team-quality-assurance --mode=testing "Test recent changes with progressive coverage"
$team-quality-assurance -c 4 --mode=full "Complete QA cycle with regression scanning"
$team-quality-assurance -y "QA all changed files since last commit"
$team-quality-assurance --continue "qa-auth-module-20260308"
```
**Flags**:
- `-y, --yes`: Skip all confirmations (auto mode)
- `-c, --concurrency N`: Max concurrent agents within each wave (default: 3)
- `--continue`: Resume existing session
- `--mode=discovery|testing|full`: Force QA mode (default: auto-detect or full)
**Output Directory**: `.workflow/.csv-wave/{session-id}/`
**Core Output**: `tasks.csv` (master state) + `results.csv` (final) + `discoveries.ndjson` (shared exploration) + `context.md` (human-readable report)
---
## Overview
Orchestrate multi-agent QA pipeline: scout -> strategist -> generator -> executor -> analyst. Supports three modes: **discovery** (issue scanning), **testing** (progressive test coverage), and **full** (closed-loop QA with regression). Multi-perspective scanning from bug, security, test-coverage, code-quality, and UX viewpoints. Progressive layer coverage (L1/L2/L3) with Generator-Critic loops for coverage convergence.
**Execution Model**: Hybrid -- CSV wave pipeline (primary) + individual agent spawn (secondary)
## Architecture
```
+-------------------------------------------------------------------+
| TEAM QUALITY ASSURANCE WORKFLOW |
+-------------------------------------------------------------------+
| |
| Phase 0: Pre-Wave Interactive (Requirement Clarification) |
| +- Parse task description, detect QA mode |
| +- Mode selection (discovery/testing/full) |
| +- Output: refined requirements for decomposition |
| |
| Phase 1: Requirement -> CSV + Classification |
| +- Select pipeline based on QA mode |
| +- Build dependency chain with appropriate roles |
| +- Classify tasks: csv-wave | interactive (exec_mode) |
| +- Compute dependency waves (topological sort) |
| +- Generate tasks.csv with wave + exec_mode columns |
| +- User validates task breakdown (skip if -y) |
| |
| Phase 2: Wave Execution Engine (Extended) |
| +- For each wave (1..N): |
| | +- Execute pre-wave interactive tasks (if any) |
| | +- Build wave CSV (filter csv-wave tasks for this wave) |
| | +- Inject previous findings into prev_context column |
| | +- spawn_agents_on_csv(wave CSV) |
| | +- Execute post-wave interactive tasks (if any) |
| | +- Merge all results into master tasks.csv |
| | +- GC Loop Check: coverage < target? -> spawn fix tasks |
| | +- Check: any failed? -> skip dependents |
| +- discoveries.ndjson shared across all modes (append-only) |
| |
| Phase 3: Post-Wave Interactive (Completion Action) |
| +- Pipeline completion report with quality score |
| +- Interactive completion choice (Archive/Keep/Export) |
| +- Final aggregation / report |
| |
| Phase 4: Results Aggregation |
| +- Export final results.csv |
| +- Generate context.md with all findings |
| +- Display summary: completed/failed/skipped per wave |
| +- Offer: view results | retry failed | done |
| |
+-------------------------------------------------------------------+
Skill(skill="team-quality-assurance", args="task description")
|
SKILL.md (this file) = Router
|
+--------------+--------------+
| |
no --role flag --role <name>
| |
Coordinator Worker
roles/coordinator/role.md roles/<name>/role.md
|
+-- analyze -> dispatch -> spawn workers -> STOP
|
+-------+-------+-------+-------+-------+
v v v v v
[scout] [strat] [gen] [exec] [analyst]
team-worker agents, each loads roles/<role>/role.md
```
---
## Role Registry
## Task Classification Rules
| Role | Path | Prefix | Inner Loop |
|------|------|--------|------------|
| coordinator | [roles/coordinator/role.md](roles/coordinator/role.md) | — | — |
| scout | [roles/scout/role.md](roles/scout/role.md) | SCOUT-* | false |
| strategist | [roles/strategist/role.md](roles/strategist/role.md) | QASTRAT-* | false |
| generator | [roles/generator/role.md](roles/generator/role.md) | QAGEN-* | false |
| executor | [roles/executor/role.md](roles/executor/role.md) | QARUN-* | true |
| analyst | [roles/analyst/role.md](roles/analyst/role.md) | QAANA-* | false |
Each task is classified by `exec_mode`:
## Role Router
| exec_mode | Mechanism | Criteria |
|-----------|-----------|----------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot, structured I/O, no multi-round interaction |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round, needs iterative fix-verify cycles |
Parse `$ARGUMENTS`:
- Has `--role <name>` -> Read `roles/<name>/role.md`, execute Phase 2-4
- No `--role` -> `roles/coordinator/role.md`, execute entry router
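A minimal sketch of this routing, assuming `$ARGUMENTS` arrives as a single raw string (`routeRole` is a hypothetical helper name, not part of the skill API):

```javascript
// Entry router sketch: pick the role spec file from the --role flag,
// falling back to the coordinator role when no flag is present.
function routeRole(args) {
  const match = args.match(/--role\s+(\S+)/)
  return match ? `roles/${match[1]}/role.md` : 'roles/coordinator/role.md'
}
```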
**Classification Decision**:
## Shared Constants
| Task Property | Classification |
|---------------|---------------|
| Multi-perspective code scanning (scout) | `csv-wave` |
| Strategy formulation (single-pass analysis) | `csv-wave` |
| Test generation (single-pass code creation) | `csv-wave` |
| Test execution with auto-fix cycle | `interactive` |
| Quality analysis (single-pass report) | `csv-wave` |
| GC loop fix-verify iteration | `interactive` |
| Regression scanning (post-fix) | `csv-wave` |
- **Session prefix**: `QA`
- **Session path**: `.workflow/.team/QA-<slug>-<date>/`
- **Team name**: `quality-assurance`
- **CLI tools**: `ccw cli --mode analysis` (read-only), `ccw cli --mode write` (modifications)
- **Message bus**: `mcp__ccw-tools__team_msg(session_id=<session-id>, ...)`
---
## Worker Spawn Template
## CSV Schema
### tasks.csv (Master State)
```csv
id,title,description,role,perspective,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,issues_found,pass_rate,coverage_achieved,test_files,quality_score,error
"SCOUT-001","Multi-perspective code scan","Scan codebase from bug, security, test-coverage, code-quality perspectives. Produce severity-ranked findings with file:line references.","scout","bug;security;test-coverage;code-quality","","","","","csv-wave","1","pending","","","","","","",""
"QASTRAT-001","Test strategy formulation","Analyze scout findings and code changes. Determine test layers, define coverage targets, generate test strategy document.","strategist","","","","SCOUT-001","SCOUT-001","csv-wave","2","pending","","","","","","",""
"QAGEN-L1-001","Generate L1 unit tests","Generate L1 unit tests based on strategy. Cover priority files, include happy path, edge cases, error handling.","generator","","L1","80","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
```
**Columns**:
| Column | Phase | Description |
|--------|-------|-------------|
| `id` | Input | Unique task identifier (PREFIX-NNN format) |
| `title` | Input | Short task title |
| `description` | Input | Detailed task description (self-contained) |
| `role` | Input | Worker role: `scout`, `strategist`, `generator`, `executor`, `analyst` |
| `perspective` | Input | Scan perspectives (semicolon-separated, scout only) |
| `layer` | Input | Test layer: `L1`, `L2`, `L3`, or empty for non-layer tasks |
| `coverage_target` | Input | Target coverage percentage for this layer (empty if N/A) |
| `deps` | Input | Semicolon-separated dependency task IDs |
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
| `exec_mode` | Input | `csv-wave` or `interactive` |
| `wave` | Computed | Wave number (computed by topological sort, 1-based) |
| `status` | Output | `pending` -> `completed` / `failed` / `skipped` |
| `findings` | Output | Key discoveries or implementation notes (max 500 chars) |
| `issues_found` | Output | Count of issues discovered (scout/analyst) |
| `pass_rate` | Output | Test pass rate as decimal (executor only) |
| `coverage_achieved` | Output | Actual coverage percentage achieved (executor only) |
| `test_files` | Output | Semicolon-separated paths of test files (generator only) |
| `quality_score` | Output | Quality score 0-100 (analyst only) |
| `error` | Output | Error message if failed (empty if success) |
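The implementation snippets in this file call `parseCsv` and `toCsv` without defining them. A minimal sketch, under the assumption that fields never contain embedded newlines (quoted fields with commas and doubled quotes are handled):

```javascript
// Parse one CSV line, honoring RFC 4180-style quoting ("" escapes a quote).
function parseCsvLine(line) {
  const fields = []
  let cur = '', inQuotes = false
  for (let i = 0; i < line.length; i++) {
    const ch = line[i]
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cur += '"'; i++ }
      else if (ch === '"') inQuotes = false
      else cur += ch
    } else if (ch === '"') inQuotes = true
    else if (ch === ',') { fields.push(cur); cur = '' }
    else cur += ch
  }
  fields.push(cur)
  return fields
}

// Parse a whole CSV string into row objects keyed by the header row.
function parseCsv(text) {
  const lines = text.trim().split('\n')
  const header = parseCsvLine(lines[0])
  return lines.slice(1).map(line => {
    const values = parseCsvLine(line)
    return Object.fromEntries(header.map((h, i) => [h, values[i] ?? '']))
  })
}

// Serialize row objects back to CSV, quoting every field.
function toCsv(rows, header = Object.keys(rows[0] || {})) {
  const esc = v => `"${String(v ?? '').replace(/"/g, '""')}"`
  return [header.join(','), ...rows.map(r => header.map(h => esc(r[h])).join(','))].join('\n')
}
```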
### Per-Wave CSV (Temporary)
Each wave generates a temporary `wave-{N}.csv` with extra `prev_context` column (csv-wave tasks only).
---
## Agent Registry (Interactive Agents)
| Agent | Role File | Pattern | Responsibility | Position |
|-------|-----------|---------|----------------|----------|
| Test Executor | agents/executor.md | 2.3 (send_input cycle) | Execute tests with iterative fix cycle, report pass rate and coverage | per-wave |
| GC Loop Handler | agents/gc-loop-handler.md | 2.3 (send_input cycle) | Manage Generator-Critic loop: evaluate coverage, trigger fix rounds | post-wave |
> **COMPACT PROTECTION**: Agent files are execution documents. When context compression occurs, **you MUST immediately `Read` the corresponding agent.md** to reload.
---
## Output Artifacts
| File | Purpose | Lifecycle |
|------|---------|-----------|
| `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
| `wave-{N}.csv` | Per-wave input (temporary, csv-wave tasks only) | Created before wave, deleted after |
| `results.csv` | Final export of all task results | Created in Phase 4 |
| `discoveries.ndjson` | Shared exploration board (all agents, both modes) | Append-only, carries across waves |
| `context.md` | Human-readable execution report | Created in Phase 4 |
| `scan/scan-results.json` | Scout output: multi-perspective scan results | Created in scout wave |
| `strategy/test-strategy.md` | Strategist output: test strategy document | Created in strategy wave |
| `tests/L1-unit/` | Generator output: L1 unit test files | Created in L1 wave |
| `tests/L2-integration/` | Generator output: L2 integration test files | Created in L2 wave |
| `tests/L3-e2e/` | Generator output: L3 E2E test files | Created in L3 wave |
| `results/run-{layer}.json` | Executor output: per-layer test results | Created per execution |
| `analysis/quality-report.md` | Analyst output: quality analysis report | Created in final wave |
| `interactive/{id}-result.json` | Results from interactive tasks | Created per interactive task |
---
## Session Structure
Coordinator spawns workers using this template:
```
.workflow/.csv-wave/{session-id}/
+-- tasks.csv # Master state (all tasks, both modes)
+-- results.csv # Final results export
+-- discoveries.ndjson # Shared discovery board (all agents)
+-- context.md # Human-readable report
+-- wave-{N}.csv # Temporary per-wave input (csv-wave only)
+-- scan/ # Scout output
| +-- scan-results.json
+-- strategy/ # Strategist output
| +-- test-strategy.md
+-- tests/ # Generator output
| +-- L1-unit/
| +-- L2-integration/
| +-- L3-e2e/
+-- results/ # Executor output
| +-- run-L1.json
| +-- run-L2.json
+-- analysis/ # Analyst output
| +-- quality-report.md
+-- wisdom/ # Cross-task knowledge
| +-- learnings.md
| +-- conventions.md
| +-- decisions.md
| +-- issues.md
+-- interactive/ # Interactive task artifacts
| +-- {id}-result.json
+-- gc-state.json # GC loop tracking state
spawn_agent({
agent_type: "team_worker",
items: [
{ type: "text", text: `## Role Assignment
role: <role>
role_spec: <skill_root>/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
requirement: <task-description>
inner_loop: <true|false>
Read role_spec file (<skill_root>/roles/<role>/role.md) to load Phase 2-4 domain instructions.` },
{ type: "text", text: `## Task Context
task_id: <task-id>
title: <task-title>
description: <task-description>
pipeline_phase: <pipeline-phase>` },
{ type: "text", text: `## Upstream Context
<prev_context>` }
]
})
```
---
After spawning, use `wait_agent({ ids: [...], timeout_ms: 900000 })` to collect results, then `close_agent({ id })` each worker.
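The spawn -> wait -> close lifecycle can be sketched as follows. `runWorkers` is a hypothetical wrapper; the host tools (`spawn_agent`, `wait_agent`, `close_agent`) are passed in so the control flow is testable with stubs:

```javascript
// Spawn one team_worker per task spec, wait for the batch, then close each.
function runWorkers(taskSpecs, { spawn_agent, wait_agent, close_agent }) {
  const ids = taskSpecs.map(spec =>
    spawn_agent({ agent_type: 'team_worker', items: spec.items }))
  const result = wait_agent({ ids, timeout_ms: 900000 })
  for (const id of ids) close_agent({ id })  // always release workers
  return result
}
```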
## User Commands
| Command | Action |
|---------|--------|
| `check` / `status` | View pipeline status graph |
| `resume` / `continue` | Advance to next step |
| `--mode=discovery` | Force discovery mode |
| `--mode=testing` | Force testing mode |
| `--mode=full` | Force full QA mode |
## Implementation
### Session Initialization
```javascript
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString()
const AUTO_YES = $ARGUMENTS.includes('--yes') || $ARGUMENTS.includes('-y')
const continueMode = $ARGUMENTS.includes('--continue')
const concurrencyMatch = $ARGUMENTS.match(/(?:--concurrency|-c)\s+(\d+)/)
const maxConcurrency = concurrencyMatch ? parseInt(concurrencyMatch[1]) : 3
// Parse QA mode flag
const modeMatch = $ARGUMENTS.match(/--mode=(\w+)/)
const explicitMode = modeMatch ? modeMatch[1] : null
const requirement = $ARGUMENTS
.replace(/--yes|-y|--continue|--concurrency\s+\d+|-c\s+\d+|--mode=\w+/g, '')
.trim()
const slug = requirement.toLowerCase()
.replace(/[^a-z0-9\u4e00-\u9fa5]+/g, '-')
.substring(0, 40)
const dateStr = getUtc8ISOString().substring(0, 10).replace(/-/g, '')
const sessionId = `qa-${slug}-${dateStr}`
const sessionFolder = `.workflow/.csv-wave/${sessionId}`
Bash(`mkdir -p ${sessionFolder}/scan ${sessionFolder}/strategy ${sessionFolder}/tests/L1-unit ${sessionFolder}/tests/L2-integration ${sessionFolder}/tests/L3-e2e ${sessionFolder}/results ${sessionFolder}/analysis ${sessionFolder}/wisdom ${sessionFolder}/interactive`)
// Initialize discoveries.ndjson
Write(`${sessionFolder}/discoveries.ndjson`, '')
// Initialize wisdom files
Write(`${sessionFolder}/wisdom/learnings.md`, '# Learnings\n')
Write(`${sessionFolder}/wisdom/conventions.md`, '# Conventions\n')
Write(`${sessionFolder}/wisdom/decisions.md`, '# Decisions\n')
Write(`${sessionFolder}/wisdom/issues.md`, '# Issues\n')
// Initialize GC state
Write(`${sessionFolder}/gc-state.json`, JSON.stringify({
rounds: {}, coverage_history: [], max_rounds_per_layer: 3
}, null, 2))
```
---
### Phase 0: Pre-Wave Interactive (Requirement Clarification)
**Objective**: Parse task description, detect QA mode, prepare for decomposition.
**Workflow**:
1. **Parse user task description** from $ARGUMENTS
2. **Check for existing sessions** (continue mode):
- Scan `.workflow/.csv-wave/qa-*/tasks.csv` for sessions with pending tasks
- If `--continue`: resume the specified or most recent session, skip to Phase 2
- If active session found: ask user whether to resume or start new
3. **QA Mode Selection**:
| Condition | Mode | Description |
|-----------|------|-------------|
| Explicit `--mode=discovery` | discovery | Scout-first: issue discovery then testing |
| Explicit `--mode=testing` | testing | Skip scout, direct test pipeline |
| Explicit `--mode=full` | full | Complete QA closed loop + regression scan |
| Keywords: discovery, scan, issue, audit | discovery | Auto-detected discovery mode |
| Keywords: test, coverage, TDD, verify | testing | Auto-detected testing mode |
| No explicit flag and no keyword match | full | Default to full QA |
4. **Clarify if ambiguous** (skip if AUTO_YES):
```javascript
request_user_input({
questions: [{
question: `Detected QA mode: '${qaMode}'. Approve or override?`,
header: "QA Mode",
id: "qa_mode",
options: [
{ label: "Approve (Recommended)", description: `Use ${qaMode} mode as detected` },
{ label: "Discovery", description: "Scout-first: scan for issues, then test" },
{ label: "Testing/Full", description: "Direct testing or complete QA closed loop" }
]
}]
})
```
5. **Output**: Refined requirement, QA mode, scope
**Success Criteria**:
- QA mode selected
- Refined requirements available for Phase 1 decomposition
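The mode-selection table above can be sketched as a helper (`detectQaMode` is a hypothetical name; the sketch assumes case-insensitive whole-word keyword matching, with discovery keywords checked first):

```javascript
// Resolve the QA mode: explicit --mode flag wins, then keyword detection,
// then the default of 'full'.
function detectQaMode(args) {
  const explicit = args.match(/--mode=(discovery|testing|full)/)
  if (explicit) return explicit[1]
  const text = args.toLowerCase()
  if (/\b(discovery|scan|issue|audit)\b/.test(text)) return 'discovery'
  if (/\b(test|tests|testing|coverage|tdd|verify)\b/.test(text)) return 'testing'
  return 'full'
}
```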
---
### Phase 1: Requirement -> CSV + Classification
**Objective**: Decompose QA task into dependency-ordered CSV tasks based on selected mode.
**Decomposition Rules**:
1. **Select pipeline based on QA mode**:
| Mode | Pipeline |
|------|----------|
| discovery | SCOUT-001 -> QASTRAT-001 -> QAGEN-001 -> QARUN-001 -> QAANA-001 |
| testing | QASTRAT-001 -> QAGEN-L1-001 -> QARUN-L1-001 -> QAGEN-L2-001 -> QARUN-L2-001 -> QAANA-001 |
| full | SCOUT-001 -> QASTRAT-001 -> [QAGEN-L1-001, QAGEN-L2-001] -> [QARUN-L1-001, QARUN-L2-001] -> QAANA-001 -> SCOUT-002 |
2. **Assign roles, layers, perspectives, and coverage targets** per task
3. **Assign exec_mode**:
- Scout, Strategist, Generator, Analyst tasks: `csv-wave` (single-pass)
- Executor tasks: `interactive` (iterative fix cycle)
**Classification Rules**:
| Task Property | exec_mode |
|---------------|-----------|
| Multi-perspective scanning (single-pass) | `csv-wave` |
| Strategy analysis (single-pass read + write) | `csv-wave` |
| Test code generation (single-pass write) | `csv-wave` |
| Test execution with fix loop (multi-round) | `interactive` |
| Quality analysis (single-pass read + write) | `csv-wave` |
| Regression scanning (single-pass) | `csv-wave` |
**Wave Computation**: Kahn's BFS topological sort with depth tracking.
**User Validation**: Display task breakdown with wave + exec_mode + role assignment (skip if AUTO_YES).
**Success Criteria**:
- tasks.csv created with valid schema, wave, and exec_mode assignments
- No circular dependencies
- User approved (or AUTO_YES)
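The wave computation can be sketched as Kahn's BFS with depth tracking, assuming each task row carries an `id` and semicolon-separated `deps` as in the tasks.csv schema:

```javascript
// Assign wave numbers: wave 1 = no dependencies; each dependent task runs
// one wave after its latest dependency. Throws on circular dependencies.
function computeWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]))
  const indegree = new Map(tasks.map(t => [t.id, 0]))
  const dependents = new Map(tasks.map(t => [t.id, []]))
  for (const t of tasks) {
    for (const dep of (t.deps || '').split(';').filter(Boolean)) {
      if (!byId.has(dep)) continue  // ignore references to unknown tasks
      indegree.set(t.id, indegree.get(t.id) + 1)
      dependents.get(dep).push(t.id)
    }
  }
  const queue = tasks.filter(t => indegree.get(t.id) === 0).map(t => t.id)
  const wave = new Map(queue.map(id => [id, 1]))
  let visited = 0
  while (queue.length > 0) {
    const id = queue.shift()
    visited++
    for (const next of dependents.get(id)) {
      wave.set(next, Math.max(wave.get(next) || 1, wave.get(id) + 1))
      indegree.set(next, indegree.get(next) - 1)
      if (indegree.get(next) === 0) queue.push(next)
    }
  }
  if (visited !== tasks.length) throw new Error('Circular dependency detected')
  for (const t of tasks) t.wave = wave.get(t.id)
  return tasks
}
```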
---
### Phase 2: Wave Execution Engine (Extended)
**Objective**: Execute tasks wave-by-wave with hybrid mechanism support, GC loop handling, and cross-wave context propagation.
```javascript
const masterCsv = Read(`${sessionFolder}/tasks.csv`)
let tasks = parseCsv(masterCsv)
const maxWave = Math.max(...tasks.map(t => Number(t.wave)))  // CSV fields are strings
for (let wave = 1; wave <= maxWave; wave++) {
console.log(`\nWave ${wave}/${maxWave}`)
// 1. Separate tasks by exec_mode
  // Coerce wave before comparing: parsed CSV values are strings.
  const waveTasks = tasks.filter(t => Number(t.wave) === wave && t.status === 'pending')
const csvTasks = waveTasks.filter(t => t.exec_mode === 'csv-wave')
const interactiveTasks = waveTasks.filter(t => t.exec_mode === 'interactive')
// 2. Check dependencies -- skip tasks whose deps failed
for (const task of waveTasks) {
const depIds = (task.deps || '').split(';').filter(Boolean)
const depStatuses = depIds.map(id => tasks.find(t => t.id === id)?.status)
if (depStatuses.some(s => s === 'failed' || s === 'skipped')) {
task.status = 'skipped'
task.error = `Dependency failed: ${depIds.filter((id, i) =>
['failed','skipped'].includes(depStatuses[i])).join(', ')}`
}
}
// 3. Execute csv-wave tasks
const pendingCsvTasks = csvTasks.filter(t => t.status === 'pending')
if (pendingCsvTasks.length > 0) {
for (const task of pendingCsvTasks) {
task.prev_context = buildPrevContext(task, tasks)
}
Write(`${sessionFolder}/wave-${wave}.csv`, toCsv(pendingCsvTasks))
// Read instruction template
Read(`instructions/agent-instruction.md`)
// Build instruction with session folder baked in
const instruction = buildQAInstruction(sessionFolder, wave)
spawn_agents_on_csv({
csv_path: `${sessionFolder}/wave-${wave}.csv`,
id_column: "id",
instruction: instruction,
max_concurrency: maxConcurrency,
max_runtime_seconds: 900,
output_csv_path: `${sessionFolder}/wave-${wave}-results.csv`,
output_schema: {
type: "object",
properties: {
id: { type: "string" },
status: { type: "string", enum: ["completed", "failed"] },
findings: { type: "string" },
issues_found: { type: "string" },
pass_rate: { type: "string" },
coverage_achieved: { type: "string" },
test_files: { type: "string" },
quality_score: { type: "string" },
error: { type: "string" }
}
}
})
// Merge results
const results = parseCsv(Read(`${sessionFolder}/wave-${wave}-results.csv`))
for (const r of results) {
const t = tasks.find(t => t.id === r.id)
if (t) Object.assign(t, r)
}
}
// 4. Execute interactive tasks (executor with fix cycle)
const pendingInteractive = interactiveTasks.filter(t => t.status === 'pending')
for (const task of pendingInteractive) {
Read(`agents/executor.md`)
const prevContext = buildPrevContext(task, tasks)
const agent = spawn_agent({
message: `## TASK ASSIGNMENT\n\n### MANDATORY FIRST STEPS\n1. Read: agents/executor.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n3. Read: .workflow/project-tech.json (if exists)\n\n---\n\nGoal: ${task.description}\nLayer: ${task.layer}\nCoverage Target: ${task.coverage_target}%\nSession: ${sessionFolder}\n\n### Previous Context\n${prevContext}`
})
const result = wait({ ids: [agent], timeout_ms: 900000 })
if (result.timed_out) {
send_input({ id: agent, message: "Please finalize current test results and report." })
wait({ ids: [agent], timeout_ms: 120000 })
}
Write(`${sessionFolder}/interactive/${task.id}-result.json`, JSON.stringify({
task_id: task.id, status: "completed", findings: parseFindings(result),
timestamp: getUtc8ISOString()
}))
close_agent({ id: agent })
task.status = result.success ? 'completed' : 'failed'
task.findings = parseFindings(result)
}
// 5. GC Loop Check (after executor completes)
for (const task of pendingInteractive.filter(t => t.role === 'executor')) {
const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
const layer = task.layer
const rounds = gcState.rounds[layer] || 0
const coverageAchieved = parseFloat(task.coverage_achieved || '0')
const coverageTarget = parseFloat(task.coverage_target || '80')
const passRate = parseFloat(task.pass_rate || '0')
if (coverageAchieved < coverageTarget && passRate < 0.95 && rounds < 3) {
gcState.rounds[layer] = rounds + 1
Write(`${sessionFolder}/gc-state.json`, JSON.stringify(gcState, null, 2))
Read(`agents/gc-loop-handler.md`)
const gcAgent = spawn_agent({
message: `## GC LOOP ROUND ${rounds + 1}\n\n### MANDATORY FIRST STEPS\n1. Read: agents/gc-loop-handler.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n\nLayer: ${layer}\nRound: ${rounds + 1}/3\nCurrent Coverage: ${coverageAchieved}%\nTarget: ${coverageTarget}%\nPass Rate: ${passRate}\nSession: ${sessionFolder}\nPrevious Results: ${sessionFolder}/results/run-${layer}.json\nTest Directory: ${sessionFolder}/tests/${layer === 'L1' ? 'L1-unit' : layer === 'L2' ? 'L2-integration' : 'L3-e2e'}/`
})
const gcResult = wait({ ids: [gcAgent], timeout_ms: 900000 })
close_agent({ id: gcAgent })
}
}
// 6. Update master CSV
Write(`${sessionFolder}/tasks.csv`, toCsv(tasks))
// 7. Cleanup temp files
Bash(`rm -f ${sessionFolder}/wave-${wave}.csv ${sessionFolder}/wave-${wave}-results.csv`)
// 8. Display wave summary
const completed = waveTasks.filter(t => t.status === 'completed').length
const failed = waveTasks.filter(t => t.status === 'failed').length
const skipped = waveTasks.filter(t => t.status === 'skipped').length
console.log(`Wave ${wave} Complete: ${completed} completed, ${failed} failed, ${skipped} skipped`)
}
```
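The loop above relies on a `buildPrevContext` helper that is never defined. A hypothetical sketch, assuming it concatenates the `findings` of each task listed in `context_from`:

```javascript
// Collect upstream findings for a task from its context_from column.
function buildPrevContext(task, tasks) {
  const ids = (task.context_from || '').split(';').filter(Boolean)
  if (ids.length === 0) return '(no upstream context)'
  return ids.map(id => {
    const src = tasks.find(t => t.id === id)
    if (!src || src.status !== 'completed') return `${id}: no results available`
    return `${id} (${src.role}): ${src.findings || ''}`
  }).join('\n')
}
```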
**Success Criteria**:
- All waves executed in order
- Both csv-wave and interactive tasks handled per wave
- Each wave's results merged into master CSV before next wave starts
- GC loops triggered when coverage below target (max 3 rounds per layer)
- Dependent tasks skipped when predecessor failed
- discoveries.ndjson accumulated across all waves and mechanisms
---
### Phase 3: Post-Wave Interactive (Completion Action)
**Objective**: Pipeline completion report with quality score and interactive completion choice.
```javascript
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const completed = tasks.filter(t => t.status === 'completed')
const failed = tasks.filter(t => t.status === 'failed')
// Quality score from analyst
const analystTask = tasks.find(t => t.role === 'analyst' && t.status === 'completed')
const qualityScore = analystTask?.quality_score || 'N/A'
// Scout issues count
const scoutTasks = tasks.filter(t => t.role === 'scout' && t.status === 'completed')
const totalIssues = scoutTasks.reduce((sum, t) => sum + parseInt(t.issues_found || '0'), 0)
// Coverage summary per layer
const layerSummary = ['L1', 'L2', 'L3'].map(layer => {
const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer && t.status === 'completed')
return execTask ? ` ${layer}: ${execTask.coverage_achieved}% coverage, ${execTask.pass_rate} pass rate` : null
}).filter(Boolean).join('\n')
console.log(`
============================================
QA PIPELINE COMPLETE
Quality Score: ${qualityScore}/100
Issues Discovered: ${totalIssues}
Deliverables:
${completed.map(t => ` - ${t.id}: ${t.title} (${t.role})`).join('\n')}
Coverage:
${layerSummary}
Pipeline: ${completed.length}/${tasks.length} tasks
Session: ${sessionFolder}
============================================
`)
if (!AUTO_YES) {
request_user_input({
questions: [{
question: "Quality Assurance pipeline complete. Choose next action.",
header: "Done",
id: "completion",
options: [
{ label: "Archive (Recommended)", description: "Archive session, output final summary" },
{ label: "Keep Active", description: "Keep session for follow-up work" },
{ label: "Export Results", description: "Export deliverables to target directory" }
]
}]
})
}
```
**Success Criteria**:
- Post-wave interactive processing complete
- Quality score and coverage metrics displayed
- User informed of results
---
### Phase 4: Results Aggregation
**Objective**: Generate final results and human-readable report.
```javascript
// 1. Export results.csv
Bash(`cp ${sessionFolder}/tasks.csv ${sessionFolder}/results.csv`)
// 2. Generate context.md
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
const analystTask = tasks.find(t => t.role === 'analyst' && t.status === 'completed')
let contextMd = `# Team Quality Assurance Report\n\n`
contextMd += `**Session**: ${sessionId}\n`
contextMd += `**Date**: ${getUtc8ISOString().substring(0, 10)}\n`
contextMd += `**QA Mode**: ${explicitMode || 'full'}\n`
contextMd += `**Quality Score**: ${analystTask?.quality_score || 'N/A'}/100\n\n`
contextMd += `## Summary\n`
contextMd += `| Status | Count |\n|--------|-------|\n`
contextMd += `| Completed | ${tasks.filter(t => t.status === 'completed').length} |\n`
contextMd += `| Failed | ${tasks.filter(t => t.status === 'failed').length} |\n`
contextMd += `| Skipped | ${tasks.filter(t => t.status === 'skipped').length} |\n\n`
// Scout findings
const scoutTasks = tasks.filter(t => t.role === 'scout' && t.status === 'completed')
if (scoutTasks.length > 0) {
contextMd += `## Scout Findings\n\n`
for (const t of scoutTasks) {
contextMd += `**${t.title}**: ${t.issues_found || 0} issues found\n${t.findings || ''}\n\n`
}
}
// Coverage results
contextMd += `## Coverage Results\n\n`
contextMd += `| Layer | Coverage | Target | Pass Rate | GC Rounds |\n`
contextMd += `|-------|----------|--------|-----------|----------|\n`
for (const layer of ['L1', 'L2', 'L3']) {
const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer)
if (execTask) {
contextMd += `| ${layer} | ${execTask.coverage_achieved || 'N/A'}% | ${execTask.coverage_target}% | ${execTask.pass_rate || 'N/A'} | ${gcState.rounds[layer] || 0} |\n`
}
}
contextMd += '\n'
// Wave execution details
const maxWave = Math.max(...tasks.map(t => Number(t.wave)))  // CSV fields are strings
contextMd += `## Wave Execution\n\n`
for (let w = 1; w <= maxWave; w++) {
const waveTasks = tasks.filter(t => Number(t.wave) === w)  // coerce before comparing
contextMd += `### Wave ${w}\n\n`
for (const t of waveTasks) {
const icon = t.status === 'completed' ? '[DONE]' : t.status === 'failed' ? '[FAIL]' : '[SKIP]'
contextMd += `${icon} **${t.title}** [${t.role}/${t.layer || '-'}] ${t.findings || ''}\n\n`
}
}
Write(`${sessionFolder}/context.md`, contextMd)
console.log(`Results exported to: ${sessionFolder}/results.csv`)
console.log(`Report generated at: ${sessionFolder}/context.md`)
```
**Success Criteria**:
- results.csv exported (all tasks, both modes)
- context.md generated with quality score, scout findings, and coverage breakdown
- Summary displayed to user
---
## Shared Discovery Board Protocol
All agents (csv-wave and interactive) share a single `discoveries.ndjson` file for cross-task knowledge exchange.
**Format**: One JSON object per line (NDJSON):
```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/auth.ts","line":42,"severity":"high","perspective":"security","description":"Hardcoded secret key in auth module"}}
{"ts":"2026-03-08T10:05:00Z","worker":"QASTRAT-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
{"ts":"2026-03-08T10:10:00Z","worker":"QAGEN-L1-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
{"ts":"2026-03-08T10:15:00Z","worker":"QARUN-L1-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
```
**Discovery Types**:
| Type | Data Schema | Description |
|------|-------------|-------------|
| `issue_found` | `{file, line, severity, perspective, description}` | Issue discovered by scout |
| `framework_detected` | `{framework, config_file, test_pattern}` | Test framework identified |
| `test_generated` | `{file, source_file, test_count}` | Test file created |
| `defect_found` | `{file, line, pattern, description}` | Defect pattern discovered during testing |
| `coverage_gap` | `{file, current, target, gap}` | Coverage gap identified |
| `convention_found` | `{pattern, example_file, description}` | Test convention detected |
| `fix_applied` | `{test_file, fix_type, description}` | Test fix during GC loop |
| `quality_metric` | `{dimension, score, details}` | Quality dimension score |
**Protocol**:
1. Agents MUST read discoveries.ndjson at start of execution
2. Agents MUST append relevant discoveries during execution
3. Agents MUST NOT modify or delete existing entries
4. Deduplication by `{type, data.file, data.line}` key (where applicable)
---
## Completion Action
When the pipeline completes, the coordinator presents:
```
request_user_input({
  questions: [{
    question: "Quality Assurance pipeline complete. What would you like to do?",
    header: "Completion",
    multiSelect: false,
    options: [
      { label: "Archive & Clean (Recommended)", description: "Archive session, clean up" },
      { label: "Keep Active", description: "Keep session for follow-up work" },
      { label: "Export Results", description: "Export deliverables to target directory" }
    ]
  }]
})
```
## Pipeline Definitions
### Discovery Mode (5 tasks, serial)
```
SCOUT-001 -> QASTRAT-001 -> QAGEN-001 -> QARUN-001 -> QAANA-001
```
| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| SCOUT-001 | scout | - | 1 | csv-wave |
| QASTRAT-001 | strategist | - | 2 | csv-wave |
| QAGEN-001 | generator | L1 | 3 | csv-wave |
| QARUN-001 | executor | L1 | 4 | interactive |
| QAANA-001 | analyst | - | 5 | csv-wave |
## Session Directory
```
.workflow/.team/QA-<slug>-<date>/
├── .msg/messages.jsonl   # Team message bus
├── .msg/meta.json        # Session state + shared memory
├── wisdom/               # Cross-task knowledge
├── scan/                 # Scout output
├── strategy/             # Strategist output
├── tests/                # Generator output (L1/, L2/, L3/)
├── results/              # Executor output
└── analysis/             # Analyst output
```
### Testing Mode (6 tasks, progressive layers)
```
QASTRAT-001 -> QAGEN-L1-001 -> QARUN-L1-001 -> QAGEN-L2-001 -> QARUN-L2-001 -> QAANA-001
```
| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| QASTRAT-001 | strategist | - | 1 | csv-wave |
| QAGEN-L1-001 | generator | L1 | 2 | csv-wave |
| QARUN-L1-001 | executor | L1 | 3 | interactive |
| QAGEN-L2-001 | generator | L2 | 4 | csv-wave |
| QARUN-L2-001 | executor | L2 | 5 | interactive |
| QAANA-001 | analyst | - | 6 | csv-wave |
### Full Mode (8 tasks, parallel windows + regression)
```
SCOUT-001 -> QASTRAT-001 -> [QAGEN-L1-001 // QAGEN-L2-001] -> [QARUN-L1-001 // QARUN-L2-001] -> QAANA-001 -> SCOUT-002
```
| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| SCOUT-001 | scout | - | 1 | csv-wave |
| QASTRAT-001 | strategist | - | 2 | csv-wave |
| QAGEN-L1-001 | generator | L1 | 3 | csv-wave |
| QAGEN-L2-001 | generator | L2 | 3 | csv-wave |
| QARUN-L1-001 | executor | L1 | 4 | interactive |
| QARUN-L2-001 | executor | L2 | 4 | interactive |
| QAANA-001 | analyst | - | 5 | csv-wave |
| SCOUT-002 | scout | - | 6 | csv-wave |
---
## GC Loop (Generator-Critic)
Generator and executor iterate per test layer until coverage converges:
```
QAGEN -> QARUN -> (if coverage < target) -> GC Loop Handler
(if coverage >= target) -> next wave
```
- Max iterations: 3 per layer
- After 3 iterations: accept current coverage with warning
- GC loop runs as interactive agent (gc-loop-handler.md) which internally generates fixes and re-runs tests
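
The round control can be sketched from the coordinator's side. This is a sketch: `get_coverage` is a hypothetical helper that would parse `results/run-<layer>.json`, and in the real flow the gc-loop-handler agent runs between rounds.

```shell
# Up to 3 rounds per layer; after that, accept current coverage with a warning.
run_gc_loop() {
  layer=$1 target=$2 round=1 coverage=0
  while [ "$round" -le 3 ]; do
    coverage=$(get_coverage "$layer")   # integer percentage, e.g. 72
    if [ "$coverage" -ge "$target" ]; then
      echo "converged round=$round coverage=$coverage"
      return 0
    fi
    round=$((round + 1))                # gc-loop-handler would run here
  done
  echo "max-rounds coverage=$coverage"  # accept with warning
  return 1
}
```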
---
## Scan Perspectives (Scout)
| Perspective | Focus |
|-------------|-------|
| bug | Logic errors, crash paths, null references |
| security | Vulnerabilities, auth bypass, data exposure |
| test-coverage | Untested code paths, missing assertions |
| code-quality | Anti-patterns, complexity, maintainability |
| ux | User-facing issues, accessibility (optional, when task mentions UX/UI) |
---
## Specs Reference
- [specs/pipelines.md](specs/pipelines.md) — Pipeline definitions and task registry
- [specs/team-config.json](specs/team-config.json) — Team configuration and shared memory schema
## Error Handling
| Error | Resolution |
|-------|------------|
| Circular dependency | Detect in wave computation, abort with error message |
| CSV agent timeout | Mark as failed in results, continue with wave |
| CSV agent failed | Mark as failed, skip dependent tasks in later waves |
| Interactive agent timeout | Urge convergence via send_input, then close if still timed out |
| Interactive agent failed | Mark as failed, skip dependents |
| All agents in wave failed | Log error, offer retry or abort |
| CSV parse error | Validate CSV format before execution, show line number |
| discoveries.ndjson corrupt | Ignore malformed lines, continue with valid entries |
| Scout finds no issues | Report clean scan, proceed to testing (skip discovery-specific tasks) |
| GC loop exceeded (3 rounds) | Accept current coverage with warning, proceed to next layer |
| Test framework not detected | Default to Jest patterns |
| Coverage tool unavailable | Degrade to pass rate judgment |
| quality_score < 60 | Report with WARNING, suggest re-run with deeper coverage |
| Continue mode: no session found | List available sessions, prompt user to select |
---
## Core Rules
1. **Start Immediately**: First action is session initialization, then Phase 0/1
2. **Wave Order is Sacred**: Never execute wave N before wave N-1 completes and results are merged
3. **CSV is Source of Truth**: Master tasks.csv holds all state (both csv-wave and interactive)
4. **CSV First**: Default to csv-wave for tasks; only use interactive when multi-round interaction is required
5. **Context Propagation**: prev_context built from master CSV, not from memory
6. **Discovery Board is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
7. **Skip on Failure**: If a dependency failed, skip the dependent task
8. **GC Loop Discipline**: Max 3 rounds per layer; never infinite-loop on coverage
9. **Scout Feeds Strategy**: Scout findings flow into strategist via prev_context and discoveries.ndjson
10. **Cleanup Temp Files**: Remove wave-{N}.csv after results are merged
11. **DO NOT STOP**: Continuous execution until all waves complete or all remaining tasks are skipped
---
## Coordinator Role Constraints (Main Agent)
**CRITICAL**: The coordinator (main agent executing this skill) is responsible for **orchestration only**, NOT implementation.
1. **Coordinator Does NOT Execute Code**: The main agent MUST NOT write, modify, or implement any code directly. All implementation work is delegated to spawned team agents. The coordinator only:
   - Spawns agents with task assignments
   - Waits for agent callbacks
   - Merges results and coordinates workflow
   - Manages workflow transitions between phases
2. **Patient Waiting is Mandatory**: Agent execution takes significant time (typically 10-30 minutes per phase, sometimes longer). The coordinator MUST:
   - Wait patiently for `wait()` calls to complete
   - NOT skip workflow steps due to perceived delays
   - NOT assume agents have failed just because they are taking time
   - Trust the timeout mechanisms defined in the skill
3. **Use send_input for Clarification**: When agents need guidance or appear stuck, the coordinator MUST:
   - Use `send_input()` to ask questions or provide clarification
   - NOT skip the agent or move to the next phase prematurely
   - Give agents the opportunity to respond before escalating
   - Example: `send_input({ id: agent_id, message: "Please provide status update or clarify blockers" })`
4. **No Workflow Shortcuts**: The coordinator MUST NOT:
   - Skip phases or stages defined in the workflow
   - Bypass required approval or review steps
   - Execute dependent tasks before prerequisites complete
   - Assume task completion without explicit agent callback
   - Make up or fabricate agent results
5. **Respect Long-Running Processes**: This is a complex multi-agent workflow that requires patience:
   - Total execution time may range from 30-90 minutes or longer
   - Each phase may take 10-30 minutes depending on complexity
   - The coordinator must remain active and attentive throughout the entire process
   - Do not terminate or skip steps due to time concerns
### Coordinator Error Scenarios

| Scenario | Resolution |
|----------|------------|
| Unknown --role value | Error with available role list |
| Role not found | Error with expected path (roles/<name>/role.md) |
| CLI tool fails | Worker fallback to direct implementation |
| Scout finds no issues | Report clean scan, skip to testing mode |
| GC loop exceeded | Accept current coverage with warning |
| Fast-advance conflict | Coordinator reconciles on next callback |
| Completion action fails | Default to Keep Active |


@@ -1,192 +0,0 @@
# Test Executor Agent
Interactive agent that executes test suites, collects coverage, and performs iterative auto-fix cycles. Acts as the Critic in the Generator-Critic loop within the QA pipeline.
## Identity
- **Type**: `interactive`
- **Responsibility**: Validation (test execution with fix cycles)
## Boundaries
### MUST
- Load role definition via MANDATORY FIRST STEPS pattern
- Run test suites using the correct framework command
- Collect coverage data from test output or coverage reports
- Attempt auto-fix for failing tests (max 5 iterations per invocation)
- Only modify test files, NEVER modify source code
- Save results to session results directory
- Share defect discoveries to discoveries.ndjson
- Report pass rate and coverage in structured output
### MUST NOT
- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files may be changed)
- Use `@ts-ignore`, `as any`, or skip/ignore test annotations
- Exceed 5 fix iterations without reporting current state
- Delete or disable existing passing tests
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test files, source files, strategy, results |
| `Write` | file-write | Save test results, update test files |
| `Edit` | file-edit | Fix test assertions, imports, mocks |
| `Bash` | shell | Run test commands, collect coverage |
| `Glob` | search | Find test files in session directory |
| `Grep` | search | Find patterns in test output |
---
## Execution
### Phase 1: Context Loading
**Objective**: Detect test framework and locate test files.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Coverage target | Yes | Minimum coverage percentage |
| Previous context | No | Findings from generator and scout |
**Steps**:
1. Read discoveries.ndjson for framework detection info
2. Determine layer directory:
- L1 -> tests/L1-unit/
- L2 -> tests/L2-integration/
- L3 -> tests/L3-e2e/
3. Find test files in the layer directory
4. Determine test framework command:
| Framework | Command Template |
|-----------|-----------------|
| vitest | `npx vitest run --coverage --reporter=json <test-dir>` |
| jest | `npx jest --coverage --json --outputFile=<results-path> <test-dir>` |
| pytest | `python -m pytest --cov --cov-report=json -v <test-dir>` |
| mocha | `npx mocha --reporter json <test-dir> > test-results.json` |
| default | `npm test -- --coverage` |
**Output**: Framework, test command, test file list
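
Command selection from the table above can be sketched as follows (the mocha redirect to `test-results.json` is left to the caller; `results_path` is only used by jest):

```shell
# Choose the test command for the detected framework.
test_command() {
  framework=$1 test_dir=$2 results_path=$3
  case "$framework" in
    vitest) echo "npx vitest run --coverage --reporter=json $test_dir" ;;
    jest)   echo "npx jest --coverage --json --outputFile=$results_path $test_dir" ;;
    pytest) echo "python -m pytest --cov --cov-report=json -v $test_dir" ;;
    mocha)  echo "npx mocha --reporter json $test_dir" ;;
    *)      echo "npm test -- --coverage" ;;   # default fallback
  esac
}
```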
---
### Phase 2: Iterative Test-Fix Cycle
**Objective**: Run tests and fix failures up to 5 iterations.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| Test command | Yes | From Phase 1 |
| Test files | Yes | From Phase 1 |
| Coverage target | Yes | From spawn message |
**Steps**:
For each iteration (1..5):
1. Run test command, capture stdout/stderr
2. Parse results: extract passed/failed counts, parse coverage
3. Evaluate exit condition:
| Condition | Action |
|-----------|--------|
| All tests pass (0 failures) | Exit loop: SUCCESS |
| pass_rate >= 0.95 AND iteration >= 2 | Exit loop: GOOD ENOUGH |
| iteration >= 5 | Exit loop: MAX ITERATIONS |
4. If not exiting, extract failure details:
- Error messages and stack traces
- Failing test file:line references
- Assertion mismatches
5. Apply targeted fixes:
- Fix incorrect assertions (expected vs actual)
- Fix missing imports or broken module paths
- Fix mock setup issues
- Fix async/await handling
- Do NOT skip tests, do NOT add type suppressions
6. Share defect discoveries:
```bash
echo '{"ts":"<ISO>","worker":"<task-id>","type":"defect_found","data":{"file":"<src>","line":<N>,"pattern":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
```
**Output**: Final pass rate, coverage achieved, iteration count
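
The exit-condition table in step 3 reduces to a small decision function (pass rate given in percent to keep integer arithmetic):

```shell
# Returns the loop action for one iteration of the test-fix cycle.
exit_condition() {
  failures=$1 pass_rate_pct=$2 iteration=$3
  if [ "$failures" -eq 0 ]; then echo SUCCESS
  elif [ "$pass_rate_pct" -ge 95 ] && [ "$iteration" -ge 2 ]; then echo GOOD_ENOUGH
  elif [ "$iteration" -ge 5 ]; then echo MAX_ITERATIONS
  else echo CONTINUE
  fi
}
```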
---
### Phase 3: Result Recording
**Objective**: Save execution results and update state.
**Steps**:
1. Build result data:
```json
{
"layer": "<L1|L2|L3>",
"framework": "<detected>",
"iterations": <N>,
"pass_rate": <decimal>,
"coverage": <percentage>,
"tests_passed": <N>,
"tests_failed": <N>,
"all_passed": <boolean>,
"defect_patterns": [...]
}
```
2. Save results to `<session>/results/run-<layer>.json`
3. Save last test output to `<session>/results/output-<layer>.txt`
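
A minimal sketch of steps 1-2 (the real executor records the full schema above, including defect patterns and per-test counts):

```shell
# Write an abbreviated run-<layer>.json under <session>/results/.
write_run_results() {
  session=$1 layer=$2 pass_rate=$3 coverage=$4
  mkdir -p "$session/results"
  cat > "$session/results/run-$layer.json" <<EOF
{"layer":"$layer","pass_rate":$pass_rate,"coverage":$coverage}
EOF
}
```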
---
## Structured Output Template
```
## Summary
- Test execution for <layer>: <pass_rate> pass rate, <coverage>% coverage after <N> iterations
## Findings
- Finding 1: specific test result with file:line reference
- Finding 2: defect pattern discovered
## Defect Patterns
- Pattern: type, frequency, severity
- Pattern: type, frequency, severity
## Coverage
- Overall: <N>%
- Target: <N>%
- Gap files: file1 (<N>%), file2 (<N>%)
## Open Questions
1. Any unresolvable test failures (if any)
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Test command not found | Try alternative commands (npx, npm test), report if all fail |
| No test files found | Report in findings, status = failed |
| Coverage tool unavailable | Degrade to pass rate only, report in findings |
| All tests timeout | Report with partial results, status = failed |
| Import resolution fails after fix | Report remaining failures, continue with other tests |
| Timeout approaching | Output current findings with "PARTIAL" status |


@@ -1,163 +0,0 @@
# GC Loop Handler Agent
Interactive agent that manages Generator-Critic loop iterations within the QA pipeline. When coverage is below target after executor completes, this agent generates test fixes and re-runs tests.
## Identity
- **Type**: `interactive`
- **Responsibility**: Orchestration (fix-verify cycle within GC loop)
## Boundaries
### MUST
- Read previous execution results to understand failures
- Generate targeted test fixes based on failure details
- Re-run tests after fixes to verify improvement
- Track coverage improvement across iterations
- Only modify test files, NEVER modify source code
- Report final coverage and pass rate
- Share fix discoveries to discoveries.ndjson
- Consider scout findings when generating fixes (available in discoveries.ndjson)
### MUST NOT
- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files)
- Use `@ts-ignore`, `as any`, or test skip annotations
- Run more than 1 fix-verify cycle per invocation (coordinator manages round count)
- Delete or disable passing tests
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test results, test files, source files, scan results |
| `Write` | file-write | Write fixed test files |
| `Edit` | file-edit | Apply targeted test fixes |
| `Bash` | shell | Run test commands |
| `Glob` | search | Find test files |
| `Grep` | search | Search test output for patterns |
---
## Execution
### Phase 1: Failure Analysis
**Objective**: Understand why tests failed or coverage was insufficient.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Round number | Yes | Current GC round (1-3) |
| Previous results | Yes | Path to run-{layer}.json |
**Steps**:
1. Read previous execution results from results/run-{layer}.json
2. Read test output from results/output-{layer}.txt
3. Read discoveries.ndjson for scout-found issues (may inform additional test cases)
4. Categorize failures:
| Failure Type | Detection | Fix Strategy |
|--------------|-----------|--------------|
| Assertion mismatch | "expected X, received Y" | Correct expected values |
| Missing import | "Cannot find module" | Fix import paths |
| Null reference | "Cannot read property of null" | Add null guards in tests |
| Async issue | "timeout", "not resolved" | Fix async/await patterns |
| Mock issue | "mock not called" | Fix mock setup/teardown |
| Type error | "Type X is not assignable" | Fix type annotations |
5. Identify uncovered files from coverage report
6. Cross-reference with scout findings for targeted coverage improvement
**Output**: Failure categories, fix targets, uncovered areas
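
The detection column of the table maps to a substring match over the raw error message. A sketch (real matchers would be framework-aware):

```shell
# Map an error message to a failure category.
categorize_failure() {
  case "$1" in
    *expected*received*)        echo assertion_mismatch ;;
    *"Cannot find module"*)     echo missing_import ;;
    *"Cannot read propert"*)    echo null_reference ;;
    *timeout*|*"not resolved"*) echo async_issue ;;
    *"mock not called"*)        echo mock_issue ;;
    *"is not assignable"*)      echo type_error ;;
    *)                          echo unknown ;;
  esac
}
```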
---
### Phase 2: Fix Generation + Re-execution
**Objective**: Apply fixes and verify improvement.
**Steps**:
1. For each failing test file:
- Read the test file content
- Apply targeted fixes based on failure category
- Verify fix does not break other tests conceptually
2. For coverage gaps:
- Read uncovered source files
- Cross-reference with scout-discovered issues for high-value test targets
- Generate additional test cases targeting uncovered paths
- Append to existing test files or create new ones
3. Re-run test suite with coverage:
```bash
<test-command> 2>&1 || true
```
4. Parse new results: pass rate, coverage
5. Calculate improvement delta
6. Share discoveries:
```bash
echo '{"ts":"<ISO>","worker":"gc-loop-<layer>-R<N>","type":"fix_applied","data":{"test_file":"<path>","fix_type":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
```
**Output**: Updated pass rate, coverage, improvement delta
---
### Phase 3: Result Update
**Objective**: Save updated results for coordinator evaluation.
**Steps**:
1. Overwrite results/run-{layer}.json with new data
2. Save test output to results/output-{layer}.txt
3. Report improvement delta in findings
---
## Structured Output Template
```
## Summary
- GC Loop Round <N> for <layer>: coverage <before>% -> <after>% (delta: +<N>%)
## Fixes Applied
- Fix 1: <test-file> - <fix-type> - <description>
- Fix 2: <test-file> - <fix-type> - <description>
## Coverage Update
- Before: <N>%, After: <N>%, Target: <N>%
- Pass Rate: <before> -> <after>
## Scout-Informed Additions
- Added test for scout issue #<N>: <description> (if applicable)
## Remaining Issues
- Issue 1: <description> (if any)
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| No previous results found | Report error, cannot proceed without baseline |
| All fixes cause new failures | Revert fixes, report inability to improve |
| Coverage tool unavailable | Use pass rate as proxy metric |
| Scout findings not available | Proceed without scout context |
| Timeout approaching | Output partial results with current state |


@@ -1,185 +0,0 @@
# Agent Instruction Template -- Team Quality Assurance
Base instruction template for CSV wave agents in the QA pipeline. Used by scout, strategist, generator, and analyst roles (csv-wave tasks).
## Purpose
| Phase | Usage |
|-------|-------|
| Phase 1 | Coordinator builds instruction from this template with session folder baked in |
| Phase 2 | Injected as `instruction` parameter to `spawn_agents_on_csv` |
---
## Base Instruction Template
```markdown
## TASK ASSIGNMENT -- Team Quality Assurance
### MANDATORY FIRST STEPS
1. Read shared discoveries: <session-folder>/discoveries.ndjson (if exists, skip if not)
2. Read project context: .workflow/project-tech.json (if exists)
3. Read scan results: <session-folder>/scan/scan-results.json (if exists, for non-scout roles)
4. Read test strategy: <session-folder>/strategy/test-strategy.md (if exists, for generator/analyst)
---
## Your Task
**Task ID**: {id}
**Title**: {title}
**Role**: {role}
**Perspectives**: {perspective}
**Layer**: {layer}
**Coverage Target**: {coverage_target}%
### Task Description
{description}
### Previous Tasks' Findings (Context)
{prev_context}
---
## Execution Protocol
### If Role = scout
1. **Determine scan scope**: Use git diff and task description to identify target files
```bash
git diff --name-only HEAD~5 2>/dev/null || echo ""
```
2. **Load historical patterns**: Read discoveries.ndjson for known defect patterns
3. **Execute multi-perspective scan**: For each perspective in {perspective} (semicolon-separated):
- **bug**: Scan for logic errors, crash paths, null references, unhandled exceptions
- **security**: Scan for vulnerabilities, hardcoded secrets, auth bypass, data exposure
- **test-coverage**: Identify untested code paths, missing assertions, uncovered branches
- **code-quality**: Detect anti-patterns, high complexity, duplicated logic, maintainability issues
- **ux** (if present): Check for user-facing issues, accessibility problems
4. **Aggregate and rank**: Deduplicate by file:line, rank by severity (critical > high > medium > low)
5. **Write scan results**: Save to <session-folder>/scan/scan-results.json:
```json
{
"scan_date": "<ISO8601>",
"perspectives": ["bug", "security", ...],
"total_findings": <N>,
"by_severity": { "critical": <N>, "high": <N>, "medium": <N>, "low": <N> },
"findings": [{ "id": "<N>", "severity": "<level>", "perspective": "<name>", "file": "<path>", "line": <N>, "description": "<text>" }]
}
```
6. **Share discoveries**: For each critical/high finding:
```bash
echo '{"ts":"<ISO8601>","worker":"{id}","type":"issue_found","data":{"file":"<path>","line":<N>,"severity":"<level>","perspective":"<name>","description":"<text>"}}' >> <session-folder>/discoveries.ndjson
```
### If Role = strategist
1. **Read scout results**: Load <session-folder>/scan/scan-results.json (if discovery or full mode)
2. **Analyze change scope**: Run `git diff --name-only HEAD~5` to identify changed files
3. **Detect test framework**: Check for vitest.config.ts, jest.config.js, pytest.ini, pyproject.toml
4. **Categorize files**: Source, Test, Config patterns
5. **Select test layers**:
| Condition | Layer | Target |
|-----------|-------|--------|
| Has source file changes | L1: Unit Tests | 80% |
| >= 3 source files OR critical issues | L2: Integration Tests | 60% |
| >= 3 critical/high severity issues | L3: E2E Tests | 40% |
6. **Generate strategy**: Write to <session-folder>/strategy/test-strategy.md with scope analysis, layer configs, priority issues, risk assessment
7. **Share discoveries**: Append framework detection to board:
```bash
echo '{"ts":"<ISO8601>","worker":"{id}","type":"framework_detected","data":{"framework":"<name>","config_file":"<path>","test_pattern":"<pattern>"}}' >> <session-folder>/discoveries.ndjson
```
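
The layer-selection table in step 5 can be sketched as a function of the changed source-file count and the critical/high issue count (interpreting "critical issues" for L2 as at least one):

```shell
# Emit the test layers to enable, one per line.
select_layers() {
  src_changed=$1 critical_issues=$2
  if [ "$src_changed" -ge 1 ]; then echo L1; fi
  if [ "$src_changed" -ge 3 ] || [ "$critical_issues" -ge 1 ]; then echo L2; fi
  if [ "$critical_issues" -ge 3 ]; then echo L3; fi
}
```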
### If Role = generator
1. **Read strategy**: Load <session-folder>/strategy/test-strategy.md for layer config and priority files
2. **Read source files**: Load files listed in strategy for the target layer (limit 20 files)
3. **Learn test patterns**: Find 3 existing test files to understand conventions (imports, structure, naming)
4. **Detect if GC fix mode**: If task description contains "fix" -> read failure info from results/run-{layer}.json, fix failing tests only
5. **Generate tests**: For each priority source file:
- Determine test file path following project conventions
- Generate test cases: happy path, edge cases, error handling
- Use proper test framework API
- Include proper imports and mocks
6. **Write test files**: Save to <session-folder>/tests/<layer-dir>/
- L1 -> tests/L1-unit/
- L2 -> tests/L2-integration/
- L3 -> tests/L3-e2e/
7. **Syntax check**: Run `tsc --noEmit` or equivalent to verify syntax
8. **Share discoveries**: Append test generation info to discoveries board
### If Role = analyst
1. **Read all results**: Load <session-folder>/results/run-*.json for execution data
2. **Read scan results**: Load <session-folder>/scan/scan-results.json (if exists)
3. **Read strategy**: Load <session-folder>/strategy/test-strategy.md
4. **Read discoveries**: Parse <session-folder>/discoveries.ndjson for all findings
5. **Analyze five dimensions**:
- **Defect patterns**: Group issues by type, identify patterns with >= 2 occurrences
- **Coverage gaps**: Compare achieved vs target per layer, identify per-file gaps
- **Test effectiveness**: Per layer -- pass rate, iterations, coverage achieved
- **Quality trend**: Compare against coverage_history if available
- **Quality score** (0-100): Start from 100, deduct for issues, gaps, failures; bonus for effective layers
6. **Score-based recommendations**:
| Score | Recommendation |
|-------|----------------|
| >= 80 | Quality is GOOD. Maintain current practices. |
| 60-79 | Quality needs IMPROVEMENT. Focus on gaps and patterns. |
| < 60 | Quality is CONCERNING. Recommend comprehensive review. |
7. **Generate report**: Write to <session-folder>/analysis/quality-report.md
8. **Share discoveries**: Append quality metrics to board
---
## Output (report_agent_job_result)
Return JSON:
{
"id": "{id}",
"status": "completed" | "failed",
"findings": "Key discoveries and implementation notes (max 500 chars)",
"issues_found": "count of issues discovered (scout/analyst, empty for others)",
"pass_rate": "test pass rate as decimal (empty for non-executor tasks)",
"coverage_achieved": "actual coverage percentage (empty for non-executor tasks)",
"test_files": "semicolon-separated paths of test files (empty for non-generator tasks)",
"quality_score": "quality score 0-100 (analyst only, empty for others)",
"error": ""
}
```
---
## Quality Requirements
All agents must verify before reporting complete:
| Requirement | Criteria |
|-------------|----------|
| Scan results written | Verify scan-results.json exists (scout) |
| Strategy written | Verify test-strategy.md exists (strategist) |
| Tests generated | Verify test files exist in correct layer dir (generator) |
| Syntax clean | No compilation errors in generated tests (generator) |
| Report written | Verify quality-report.md exists (analyst) |
| Findings accuracy | Findings reflect actual work done |
| Discovery sharing | At least 1 discovery shared to board |
| Error reporting | Non-empty error field if status is failed |
---
## Placeholder Reference
| Placeholder | Resolved By | When |
|-------------|------------|------|
| `<session-folder>` | Skill designer (Phase 1) | Literal path baked into instruction |
| `{id}` | spawn_agents_on_csv | Runtime from CSV row |
| `{title}` | spawn_agents_on_csv | Runtime from CSV row |
| `{description}` | spawn_agents_on_csv | Runtime from CSV row |
| `{role}` | spawn_agents_on_csv | Runtime from CSV row |
| `{perspective}` | spawn_agents_on_csv | Runtime from CSV row |
| `{layer}` | spawn_agents_on_csv | Runtime from CSV row |
| `{coverage_target}` | spawn_agents_on_csv | Runtime from CSV row |
| `{prev_context}` | spawn_agents_on_csv | Runtime from CSV row |


@@ -0,0 +1,80 @@
---
role: analyst
prefix: QAANA
inner_loop: false
message_types:
success: analysis_ready
report: quality_report
error: error
---
# Quality Analyst
Analyze defect patterns, coverage gaps, test effectiveness, and generate comprehensive quality reports. Maintain defect pattern database and provide quality scoring.
## Phase 2: Context Loading
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Discovered issues | meta.json -> discovered_issues | No |
| Test strategy | meta.json -> test_strategy | No |
| Generated tests | meta.json -> generated_tests | No |
| Execution results | meta.json -> execution_results | No |
| Historical patterns | meta.json -> defect_patterns | No |
1. Extract session path from task description
2. Read .msg/meta.json for all accumulated QA data
3. Read coverage data from `coverage/coverage-summary.json` if available
4. Read layer execution results from `<session>/results/run-*.json`
5. Select analysis mode:
| Data Points | Mode |
|-------------|------|
| <= 5 issues + results | Direct inline analysis |
| > 5 issues | CLI-assisted deep analysis via gemini |
## Phase 3: Multi-Dimensional Analysis
**Five analysis dimensions**:
1. **Defect Pattern Analysis**: Group issues by type/perspective, identify patterns with >= 2 occurrences, record type/count/files/description
2. **Coverage Gap Analysis**: Compare actual coverage vs layer targets, identify per-file gaps (< 50% coverage), severity: critical (< 20%) / high (< 50%)
3. **Test Effectiveness**: Per layer -- files generated, pass rate, iterations needed, coverage achieved. Effective = pass_rate >= 95% AND iterations <= 2
4. **Quality Trend**: Compare against coverage_history. Trend: improving (delta > 5%), declining (delta < -5%), stable
5. **Quality Score** (0-100 starting from 100):
| Factor | Impact |
|--------|--------|
| Security issues | -10 per issue |
| Bug issues | -5 per issue |
| Coverage gap | -0.5 per gap percentage |
| Test failures | -(100 - pass_rate) * 0.3 per layer |
| Effective test layers | +5 per layer |
| Improving trend | +3 |
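
The factor table can be sketched as arithmetic (the per-layer test-failure deduction is omitted here for brevity; the score is clamped to 0-100):

```shell
# quality_score <security_issues> <bug_issues> <coverage_gap_pct> <effective_layers> <improving 0|1>
quality_score() {
  awk -v s="$1" -v b="$2" -v g="$3" -v e="$4" -v i="$5" '
    BEGIN {
      score = 100 - 10*s - 5*b - 0.5*g + 5*e + (i ? 3 : 0)
      if (score < 0) score = 0
      if (score > 100) score = 100
      printf "%d\n", score
    }'
}
```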
For CLI-assisted mode:
```
PURPOSE: Deep quality analysis on QA results to identify defect patterns and improvement opportunities
TASK: Classify defects by root cause, identify high-density files, analyze coverage gaps vs risk, generate recommendations
MODE: analysis
```
## Phase 4: Report Generation & Output
1. Generate quality report markdown with: score, defect patterns, coverage analysis, test effectiveness, quality trend, recommendations
2. Write report to `<session>/analysis/quality-report.md`
3. Update `<session>/wisdom/.msg/meta.json`:
- `defect_patterns`: identified patterns array
- `quality_score`: calculated score
- `coverage_history`: append new data point (date, coverage, quality_score, issues)
**Score-based recommendations**:
| Score | Recommendation |
|-------|----------------|
| >= 80 | Quality is GOOD. Maintain current testing practices. |
| 60-79 | Quality needs IMPROVEMENT. Focus on coverage gaps and recurring patterns. |
| < 60 | Quality is CONCERNING. Recommend comprehensive review and testing effort. |


@@ -0,0 +1,72 @@
# Analyze Task
Parse user task -> detect QA capabilities -> build dependency graph -> design roles.
**CONSTRAINT**: Text-level analysis only. NO source code reading, NO codebase exploration.
## Signal Detection
| Keywords | Capability | Prefix |
|----------|------------|--------|
| scan, discover, find issues, audit | scout | SCOUT |
| strategy, plan, test layers, coverage | strategist | QASTRAT |
| generate tests, write tests, create tests | generator | QAGEN |
| run tests, execute, fix tests | executor | QARUN |
| analyze, report, quality score | analyst | QAANA |
## QA Mode Detection
| Condition | Mode |
|-----------|------|
| Keywords: discovery, scan, issues, bug-finding | discovery |
| Keywords: test, coverage, TDD, unit, integration | testing |
| Both keyword types OR no clear match | full |
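
Mode detection reduces to checking the two keyword groups against the lowercased task text:

```shell
# Echo discovery, testing, or full for a task description.
detect_mode() {
  text=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  disc=0; tst=0
  case "$text" in *discovery*|*scan*|*issues*|*bug*) disc=1 ;; esac
  case "$text" in *test*|*coverage*|*tdd*|*unit*|*integration*) tst=1 ;; esac
  if [ "$disc" -eq 1 ] && [ "$tst" -eq 0 ]; then echo discovery
  elif [ "$tst" -eq 1 ] && [ "$disc" -eq 0 ]; then echo testing
  else echo full   # both groups, or no clear match
  fi
}
```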
## Dependency Graph
Natural ordering tiers for QA pipeline:
- Tier 0: scout (issue discovery)
- Tier 1: strategist (strategy requires scout discoveries)
- Tier 2: generator (generation requires strategy)
- Tier 3: executor (execution requires generated tests)
- Tier 4: analyst (analysis requires execution results)
## Pipeline Definitions
```
Discovery Mode: SCOUT -> QASTRAT -> QAGEN(L1) -> QARUN(L1) -> QAANA
Testing Mode: QASTRAT -> QAGEN(L1) -> QARUN(L1) -> QAGEN(L2) -> QARUN(L2) -> QAANA
Full Mode: SCOUT -> QASTRAT -> [QAGEN(L1) || QAGEN(L2)] -> [QARUN(L1) || QARUN(L2)] -> QAANA -> SCOUT(regression)
```
## Complexity Scoring
| Factor | Points |
|--------|--------|
| Per capability | +1 |
| Cross-domain (test + discovery) | +2 |
| Parallel tracks | +1 per track |
| Serial depth > 3 | +1 |
Results: 1-3 Low, 4-6 Medium, 7+ High
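
The scoring table and thresholds can be sketched as:

```shell
# complexity <capabilities> <cross_domain 0|1> <parallel_tracks> <serial_depth>
complexity() {
  caps=$1 cross_domain=$2 tracks=$3 depth=$4
  score=$((caps + cross_domain * 2 + tracks + (depth > 3 ? 1 : 0)))
  if [ "$score" -le 3 ]; then level=Low
  elif [ "$score" -le 6 ]; then level=Medium
  else level=High
  fi
  echo "$score $level"
}
```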
## Role Minimization
- Cap at 6 roles (coordinator + 5 workers)
- Merge overlapping capabilities
- Absorb trivial single-step roles
## Output
Write <session>/task-analysis.json:
```json
{
"task_description": "<original>",
"pipeline_mode": "<discovery|testing|full>",
"capabilities": [{ "name": "<cap>", "prefix": "<PREFIX>", "keywords": ["..."] }],
"dependency_graph": { "<TASK-ID>": { "role": "<role>", "addBlockedBy": ["..."], "priority": "P0|P1|P2" } },
"roles": [{ "name": "<role>", "prefix": "<PREFIX>", "inner_loop": false }],
"complexity": { "score": 0, "level": "Low|Medium|High" },
"gc_loop_enabled": true
}
```


@@ -0,0 +1,108 @@
# Dispatch Tasks
Create task chains from dependency graph with proper blockedBy relationships.
## Workflow
1. Read task-analysis.json -> extract pipeline_mode and dependency_graph
2. Read specs/pipelines.md -> get task registry for selected pipeline
3. Topological sort tasks (respect blockedBy)
4. Validate all owners exist in role registry (SKILL.md)
5. For each task (in order):
- Build JSON entry with structured description (see template below)
- Set blockedBy and owner fields in the entry
6. Write all entries to `<session>/tasks.json`
7. Update session.json with pipeline.tasks_total
8. Validate chain (no orphans, no cycles, all refs valid)
## Task Description Template
Each task is a JSON entry in the tasks array:
```json
{
"id": "<TASK-ID>",
"subject": "<TASK-ID>",
"description": "PURPOSE: <goal> | Success: <criteria>\nTASK:\n - <step 1>\n - <step 2>\nCONTEXT:\n - Session: <session-folder>\n - Layer: <L1-unit|L2-integration|L3-e2e> (if applicable)\n - Upstream artifacts: <list>\n - Shared memory: <session>/wisdom/.msg/meta.json\nEXPECTED: <artifact path> + <quality criteria>\nCONSTRAINTS: <scope limits>\n---\nInnerLoop: <true|false>\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/<role>/role.md",
"status": "pending",
"owner": "<role>",
"blockedBy": ["<dependency-list>"]
}
```
## Pipeline Task Registry
### Discovery Mode
```
SCOUT-001 (scout): Multi-perspective issue scanning
blockedBy: []
QASTRAT-001 (strategist): Test strategy formulation
blockedBy: [SCOUT-001]
QAGEN-001 (generator): L1 unit test generation
blockedBy: [QASTRAT-001], meta: layer=L1
QARUN-001 (executor): L1 test execution + fix cycles
blockedBy: [QAGEN-001], inner_loop: true, meta: layer=L1
QAANA-001 (analyst): Quality analysis report
blockedBy: [QARUN-001]
```
### Testing Mode
```
QASTRAT-001 (strategist): Test strategy formulation
blockedBy: []
QAGEN-L1-001 (generator): L1 unit test generation
blockedBy: [QASTRAT-001], meta: layer=L1
QARUN-L1-001 (executor): L1 test execution + fix cycles
blockedBy: [QAGEN-L1-001], inner_loop: true, meta: layer=L1
QAGEN-L2-001 (generator): L2 integration test generation
blockedBy: [QARUN-L1-001], meta: layer=L2
QARUN-L2-001 (executor): L2 test execution + fix cycles
blockedBy: [QAGEN-L2-001], inner_loop: true, meta: layer=L2
QAANA-001 (analyst): Quality analysis report
blockedBy: [QARUN-L2-001]
```
### Full Mode
```
SCOUT-001 (scout): Multi-perspective issue scanning
blockedBy: []
QASTRAT-001 (strategist): Test strategy formulation
blockedBy: [SCOUT-001]
QAGEN-L1-001 (generator-1): L1 unit test generation
blockedBy: [QASTRAT-001], meta: layer=L1
QAGEN-L2-001 (generator-2): L2 integration test generation
blockedBy: [QASTRAT-001], meta: layer=L2
QARUN-L1-001 (executor-1): L1 test execution + fix cycles
blockedBy: [QAGEN-L1-001], inner_loop: true, meta: layer=L1
QARUN-L2-001 (executor-2): L2 test execution + fix cycles
blockedBy: [QAGEN-L2-001], inner_loop: true, meta: layer=L2
QAANA-001 (analyst): Quality analysis report
blockedBy: [QARUN-L1-001, QARUN-L2-001]
SCOUT-002 (scout): Regression scan after fixes
blockedBy: [QAANA-001]
```
## InnerLoop Flag Rules
- true: executor roles (run-fix cycles)
- false: scout, strategist, generator, analyst roles
## Dependency Validation
- No orphan tasks (all tasks have valid owner)
- No circular dependencies
- All blockedBy references exist
- Session reference in every task description
- RoleSpec reference in every task description
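The structural checks above (valid owners, no dangling references, no cycles) can be sketched with Kahn-style dependency elimination; the task shape matches the tasks.json entries defined earlier:

```python
def validate_chain(tasks: list, roles: set) -> list:
    """Return a list of validation errors for a dispatched task chain:
    unknown owners, missing blockedBy references, circular dependencies."""
    errors = []
    ids = {t["id"] for t in tasks}
    for t in tasks:
        if t["owner"] not in roles:
            errors.append(f"{t['id']}: unknown owner {t['owner']}")
        for dep in t["blockedBy"]:
            if dep not in ids:
                errors.append(f"{t['id']}: missing dependency {dep}")
    # Cycle detection: repeatedly remove tasks whose deps are all resolved.
    remaining = {t["id"]: set(t["blockedBy"]) & ids for t in tasks}
    while remaining:
        ready = [tid for tid, deps in remaining.items() if not deps]
        if not ready:
            errors.append(f"circular dependency among {sorted(remaining)}")
            break
        for tid in ready:
            del remaining[tid]
        for deps in remaining.values():
            deps.difference_update(ready)
    return errors
```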
## Log After Creation
```
mcp__ccw-tools__team_msg({
operation: "log",
session_id: <session-id>,
from: "coordinator",
type: "pipeline_selected",
data: { pipeline: "<mode>", task_count: <N> }
})
```


@@ -0,0 +1,209 @@
# Monitor Pipeline
Event-driven pipeline coordination. Beat model: coordinator wake -> process -> spawn -> STOP.
## Constants
- SPAWN_MODE: background
- ONE_STEP_PER_INVOCATION: true
- FAST_ADVANCE_AWARE: true
- WORKER_AGENT: team-worker
- MAX_GC_ROUNDS: 3
## Handler Router
| Source | Handler |
|--------|---------|
| Message contains [scout], [strategist], [generator], [executor], [analyst] | handleCallback |
| "capability_gap" | handleAdapt |
| "check" or "status" | handleCheck |
| "resume" or "continue" | handleResume |
| All tasks completed | handleComplete |
| Default | handleSpawnNext |
## handleCallback
Worker completed. Process and advance.
1. Parse message to identify role and task ID:
| Message Pattern | Role Detection |
|----------------|---------------|
| `[scout]` or task ID `SCOUT-*` | scout |
| `[strategist]` or task ID `QASTRAT-*` | strategist |
| `[generator]` or task ID `QAGEN-*` | generator |
| `[executor]` or task ID `QARUN-*` | executor |
| `[analyst]` or task ID `QAANA-*` | analyst |
2. Check if progress update (inner loop) or final completion
3. Progress -> update session state, STOP
4. Completion -> mark task done (read `<session>/tasks.json`, set status to "completed", write back), remove from active_workers
5. Check for checkpoints:
- QARUN-* completes -> read meta.json for coverage:
- coverage >= target OR gc_rounds >= MAX_GC_ROUNDS -> proceed to handleSpawnNext
- coverage < target AND gc_rounds < MAX_GC_ROUNDS -> create GC fix tasks, increment gc_rounds
**GC Fix Task Creation** (when coverage below target) -- add new entries to `<session>/tasks.json`:
```json
{
"id": "QAGEN-fix-<round>",
"subject": "QAGEN-fix-<round>: Fix tests for <layer> (GC #<round>)",
"description": "PURPOSE: Fix failing tests and improve coverage | Success: Coverage meets target\nTASK:\n - Load execution results and failing test details\n - Fix broken tests and add missing coverage\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Previous results: <session>/results/run-<layer>.json\nEXPECTED: Fixed test files | Improved coverage\nCONSTRAINTS: Only modify test files | No source changes\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/generator/role.md",
"status": "pending",
"owner": "generator",
"blockedBy": []
}
```
```json
{
"id": "QARUN-gc-<round>",
"subject": "QARUN-gc-<round>: Re-execute <layer> (GC #<round>)",
"description": "PURPOSE: Re-execute tests after fixes | Success: Coverage >= target\nTASK: Execute test suite, measure coverage, report results\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\nEXPECTED: <session>/results/run-<layer>-gc-<round>.json\nCONSTRAINTS: Read-only execution\n---\nInnerLoop: false\nRoleSpec: ~ or <project>/.codex/skills/team-quality-assurance/roles/executor/role.md",
"status": "pending",
"owner": "executor",
"blockedBy": ["QAGEN-fix-<round>"]
}
```
6. -> handleSpawnNext
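The coverage checkpoint in step 5 reduces to a small decision function (names here are illustrative of the handler logic, not a prescribed API):

```python
MAX_GC_ROUNDS = 3

def gc_decision(coverage: float, target: float, gc_rounds: int) -> str:
    """Decide how a QARUN completion advances the pipeline."""
    if coverage >= target or gc_rounds >= MAX_GC_ROUNDS:
        return "spawn_next"  # proceed to handleSpawnNext
    # Otherwise create QAGEN-fix-<round> + QARUN-gc-<round> and increment gc_rounds
    return "create_gc_fix_tasks"
```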
## handleCheck
Read-only status report, then STOP.
Output:
```
[coordinator] QA Pipeline Status
[coordinator] Mode: <pipeline_mode>
[coordinator] Progress: <done>/<total> (<pct>%)
[coordinator] GC Rounds: <gc_rounds>/3
[coordinator] Pipeline Graph:
SCOUT-001: <done|run|wait> <summary>
QASTRAT-001: <done|run|wait> <summary>
QAGEN-001: <done|run|wait> <summary>
QARUN-001: <done|run|wait> <summary>
QAANA-001: <done|run|wait> <summary>
[coordinator] Active Workers: <list with elapsed time>
[coordinator] Ready: <pending tasks with resolved deps>
[coordinator] Commands: 'resume' to advance | 'check' to refresh
```
Then STOP.
## handleResume
1. No active workers -> handleSpawnNext
2. Has active -> check each status
- completed -> mark done (update tasks.json)
- in_progress -> still running
3. Some completed -> handleSpawnNext
4. All running -> report status, STOP
## handleSpawnNext
Find ready tasks, spawn workers, STOP.
1. Collect from `<session>/tasks.json`:
- completedSubjects: status = completed
- inProgressSubjects: status = in_progress
- readySubjects: status = pending AND all blockedBy in completedSubjects
2. No ready + work in progress -> report waiting, STOP
3. No ready + nothing in progress -> handleComplete
4. Has ready -> for each:
a. Determine role from task prefix:
| Prefix | Role | inner_loop |
|--------|------|------------|
| SCOUT-* | scout | false |
| QASTRAT-* | strategist | false |
| QAGEN-* | generator | false |
| QARUN-* | executor | true |
| QAANA-* | analyst | false |
b. Check if inner loop role with active worker -> skip (worker picks up next task)
c. Update task status to "in_progress" in tasks.json
d. team_msg log -> task_unblocked
e. Spawn team-worker:
```
spawn_agent({
agent_type: "team_worker",
items: [{
description: "Spawn <role> worker for <subject>",
team_name: "quality-assurance",
name: "<role>",
prompt: `## Role Assignment
role: <role>
role_spec: ~ or <project>/.codex/skills/team-quality-assurance/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
team_name: quality-assurance
requirement: <task-description>
inner_loop: <true|false>
## Current Task
- Task ID: <task-id>
- Task: <subject>
Read role_spec file to load Phase 2-4 domain instructions.
Execute built-in Phase 1 (task discovery) -> role Phase 2-4 -> built-in Phase 5 (report).`
}]
})
```
f. Add to active_workers
5. Update session, output summary, STOP
6. Use `wait_agent({ ids: [<spawned-agent-ids>] })` to wait for callbacks. Workers use `report_agent_job_result()` to send results back.
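The ready-set computation in step 1 can be sketched directly from the tasks.json shape:

```python
def partition_tasks(tasks: list) -> dict:
    """Split tasks.json entries into completed / in-progress / ready buckets.
    A task is ready when pending and every blockedBy id is completed."""
    completed = {t["id"] for t in tasks if t["status"] == "completed"}
    in_progress = [t["id"] for t in tasks if t["status"] == "in_progress"]
    ready = [t["id"] for t in tasks
             if t["status"] == "pending"
             and all(dep in completed for dep in t["blockedBy"])]
    return {"completed": completed, "in_progress": in_progress, "ready": ready}
```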
## handleComplete
Pipeline done. Generate report and completion action.
1. Verify all tasks (including GC fix/recheck tasks) have status "completed" or "deleted" in tasks.json
2. If any tasks incomplete -> return to handleSpawnNext
3. If all complete:
- Read final state from meta.json (quality_score, coverage, gc_rounds)
- Generate summary (deliverables, stats, discussions)
4. Read session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean (status=completed, remove/archive session folder)
- auto_keep -> Keep Active (status=paused)
## handleAdapt
Capability gap reported mid-pipeline.
1. Parse gap description
2. Check if existing role covers it -> redirect
3. Role count < 6 -> generate dynamic role-spec in <session>/role-specs/
4. Add new task entry to tasks.json, spawn worker
5. Role count >= 6 -> merge or pause
## Fast-Advance Reconciliation
On every coordinator wake:
1. Read team_msg entries with type="fast_advance"
2. Sync active_workers with spawned successors
3. No duplicate spawns
## Phase 4: State Persistence
After every handler execution:
1. Reconcile active_workers with actual tasks.json states
2. Remove entries for completed/deleted tasks
3. Write updated meta.json
4. STOP (wait for next callback)
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Session file not found | Error, suggest re-initialization |
| Worker callback from unknown role | Log info, scan for other completions |
| Pipeline stall (no ready, no running, has pending) | Check blockedBy chains, report to user |
| GC loop exceeded | Accept current coverage with warning, proceed |
| Scout finds 0 issues | Skip to testing mode, proceed to QASTRAT |


@@ -0,0 +1,143 @@
# Coordinator Role
Orchestrate team-quality-assurance: analyze -> dispatch -> spawn -> monitor -> report.
## Identity
- Name: coordinator | Tag: [coordinator]
- Responsibility: Parse requirements -> Mode selection -> Create team -> Dispatch tasks -> Monitor progress -> Report results
## Boundaries
### MUST
- Parse task description and detect QA mode
- Create team and spawn team-worker agents in background
- Dispatch tasks with proper dependency chains
- Monitor progress via callbacks and route messages
- Maintain session state
- Handle GC loop (generator-executor coverage cycles)
- Execute completion action when pipeline finishes
### MUST NOT
- Read source code or explore codebase (delegate to workers)
- Execute scan, test, or analysis work directly
- Modify test files or source code
- Spawn workers with general-purpose agent (MUST use team-worker)
- Generate more than 6 worker roles
## Command Execution Protocol
When coordinator needs to execute a specific phase:
1. Read `commands/<command>.md`
2. Follow the workflow defined in the command
3. Commands are inline execution guides, NOT separate agents
4. Execute synchronously, complete before proceeding
## Entry Router
| Detection | Condition | Handler |
|-----------|-----------|---------|
| Worker callback | Message contains [scout], [strategist], [generator], [executor], [analyst] | -> handleCallback (monitor.md) |
| Status check | Args contain "check" or "status" | -> handleCheck (monitor.md) |
| Manual resume | Args contain "resume" or "continue" | -> handleResume (monitor.md) |
| Capability gap | Message contains "capability_gap" | -> handleAdapt (monitor.md) |
| Pipeline complete | All tasks completed | -> handleComplete (monitor.md) |
| Interrupted session | Active session in .workflow/.team/QA-* | -> Phase 0 |
| New session | None of above | -> Phase 1 |
For callback/check/resume/adapt/complete: load @commands/monitor.md, execute handler, STOP.
## Phase 0: Session Resume Check
1. Scan .workflow/.team/QA-*/session.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile (audit tasks.json, reset in_progress->pending, rebuild team, kick first ready task)
4. Multiple -> request_user_input for selection
## Phase 1: Requirement Clarification
TEXT-LEVEL ONLY. No source code reading.
1. Parse task description and extract flags
2. **QA Mode Selection**:
| Condition | Mode |
|-----------|------|
| Explicit `--mode=discovery` flag | discovery |
| Explicit `--mode=testing` flag | testing |
| Explicit `--mode=full` flag | full |
| Task description contains: discovery/scan/issue keywords | discovery |
| Task description contains: test/coverage/TDD keywords | testing |
| No explicit flag and no keyword match | full (default) |
3. Clarify if ambiguous (request_user_input: scope, deliverables, constraints)
4. Delegate to @commands/analyze.md
5. Output: task-analysis.json
6. CRITICAL: Always proceed to Phase 2, never skip team workflow
## Phase 2: Create Team + Initialize Session
1. Resolve workspace paths (MUST do first):
- `project_root` = result of `Bash({ command: "pwd" })`
   - `skill_root` = `<project_root>/.codex/skills/team-quality-assurance`
2. Generate session ID: QA-<slug>-<date>
3. Create session folder structure
4. Initialize session files (replaces TeamCreate)
5. Read specs/pipelines.md -> select pipeline based on mode
6. Register roles in session.json
7. Initialize shared infrastructure (wisdom/*.md)
8. Initialize pipeline via team_msg state_update:
```
mcp__ccw-tools__team_msg({
operation: "log", session_id: "<id>", from: "coordinator",
type: "state_update", summary: "Session initialized",
data: {
pipeline_mode: "<discovery|testing|full>",
pipeline_stages: [...],
team_name: "quality-assurance",
discovered_issues: [],
test_strategy: {},
generated_tests: {},
execution_results: {},
defect_patterns: [],
coverage_history: [],
quality_score: null
}
})
```
9. Write session.json
## Phase 3: Create Task Chain
Delegate to @commands/dispatch.md:
1. Read dependency graph from task-analysis.json
2. Read specs/pipelines.md for selected pipeline's task registry
3. Topological sort tasks
4. Build tasks array as JSON entries in `<session>/tasks.json`; set deps via `blockedBy` field in each entry
5. Update session.json
## Phase 4: Spawn-and-Stop
Delegate to @commands/monitor.md#handleSpawnNext:
1. Find ready tasks (pending + all blockedBy dependencies resolved)
2. Spawn team-worker agents (see SKILL.md Spawn Template)
3. Output status summary
4. STOP
## Phase 5: Report + Completion Action
1. Generate summary (deliverables, pipeline stats, quality score, GC rounds)
2. Execute completion action per session.completion_action:
- interactive -> request_user_input (Archive/Keep/Export)
- auto_archive -> Archive & Clean
- auto_keep -> Keep Active
## Error Handling
| Error | Resolution |
|-------|------------|
| Task too vague | request_user_input for clarification |
| Session corruption | Attempt recovery, fallback to manual |
| Worker crash | Reset task to pending, respawn |
| Dependency cycle | Detect in analysis, halt |
| Scout finds nothing | Skip to testing mode |
| GC loop stuck > 3 | Accept current coverage with warning |
| quality_score < 60 | Report with WARNING, suggest re-run |


@@ -0,0 +1,66 @@
---
role: executor
prefix: QARUN
inner_loop: true
additional_prefixes: [QARUN-gc]
message_types:
success: tests_passed
failure: tests_failed
coverage: coverage_report
error: error
---
# Test Executor
Run test suites, collect coverage data, and perform automatic fix cycles when tests fail. Implements the execution side of the Generator-Executor (GC) loop.
## Phase 2: Environment Detection
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Test strategy | meta.json -> test_strategy | Yes |
| Generated tests | meta.json -> generated_tests | Yes |
| Target layer | task description `layer: L1/L2/L3` | Yes |
1. Extract session path and target layer from task description
2. Load validation specs: Run `ccw spec load --category validation` for verification rules and acceptance criteria
3. Read .msg/meta.json for strategy and generated test file list
4. Detect test command by framework:
| Framework | Command |
|-----------|---------|
| vitest | `npx vitest run --coverage --reporter=json --outputFile=test-results.json` |
| jest | `npx jest --coverage --json --outputFile=test-results.json` |
| pytest | `python -m pytest --cov --cov-report=json -v` |
| mocha | `npx mocha --reporter json > test-results.json` |
| unknown | `npm test -- --coverage` |
5. Get test files from `generated_tests[targetLayer].files`
## Phase 3: Iterative Test-Fix Cycle
**Max iterations**: 5. **Pass threshold**: 95% or all tests pass.
Per iteration:
1. Run test command, capture output
2. Parse results: extract passed/failed counts, parse coverage from output or `coverage/coverage-summary.json`
3. If all pass (0 failures) -> exit loop (success)
4. If pass rate >= 95% and iteration >= 2 -> exit loop (good enough)
5. If iteration >= MAX -> exit loop (report current state)
6. Extract failure details (error lines, assertion failures)
7. Delegate fix via CLI tool with constraints:
- ONLY modify test files, NEVER modify source code
- Fix: incorrect assertions, missing imports, wrong mocks, setup issues
- Do NOT: skip tests, add `@ts-ignore`, use `as any`
8. Increment iteration, repeat
## Phase 4: Result Analysis & Output
1. Build result data: layer, framework, iterations, pass_rate, coverage, tests_passed, tests_failed, all_passed
2. Save results to `<session>/results/run-<layer>.json`
3. Save last test output to `<session>/results/output-<layer>.txt`
4. Update `<session>/wisdom/.msg/meta.json` under `execution_results[layer]` and top-level `execution_results.pass_rate`, `execution_results.coverage`
5. Message type: `tests_passed` if all_passed, else `tests_failed`


@@ -0,0 +1,68 @@
---
role: generator
prefix: QAGEN
inner_loop: false
additional_prefixes: [QAGEN-fix]
message_types:
success: tests_generated
revised: tests_revised
error: error
---
# Test Generator
Generate test code according to the strategist's strategy and layer plan: L1 unit tests, L2 integration tests, and L3 E2E tests. Follow the project's existing test patterns and framework conventions.
## Phase 2: Strategy & Pattern Loading
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Test strategy | meta.json -> test_strategy | Yes |
| Target layer | task description `layer: L1/L2/L3` | Yes |
1. Extract session path and target layer from task description
2. Read .msg/meta.json for test strategy (layers, coverage targets)
3. Determine if this is a GC fix task (subject contains "fix")
4. Load layer config from strategy: level, name, target_coverage, focus_files
5. Learn existing test patterns -- find 3 similar test files via Glob(`**/*.{test,spec}.{ts,tsx,js,jsx}`)
6. Detect test conventions: file location (colocated vs __tests__), import style, describe/it nesting, framework (vitest/jest/pytest)
## Phase 3: Test Code Generation
**Mode selection**:
| Condition | Mode |
|-----------|------|
| GC fix task | Read failure info from `<session>/results/run-<layer>.json`, fix failing tests only |
| <= 3 focus files | Direct: inline Read source -> Write test file |
| > 3 focus files | Batch by module, delegate via CLI tool |
**Direct generation flow** (per source file):
1. Read source file content, extract exports
2. Determine test file path following project conventions
3. If test exists -> analyze missing cases -> append new tests via Edit
4. If no test -> generate full test file via Write
5. Include: happy path, edge cases, error cases per export
**GC fix flow**:
1. Read execution results and failure output from results directory
2. Read each failing test file
3. Fix assertions, imports, mocks, or test setup
4. Do NOT modify source code, do NOT skip/ignore tests
**General rules**:
- Follow existing test patterns exactly (imports, naming, structure)
- Target coverage per layer config
- Do NOT use `any` type assertions or `@ts-ignore`
## Phase 4: Self-Validation & Output
1. Collect generated/modified test files
2. Run syntax check (TypeScript: `tsc --noEmit`, or framework-specific)
3. Auto-fix syntax errors (max 3 attempts)
4. Write test metadata to `<session>/wisdom/.msg/meta.json` under `generated_tests[layer]`:
- layer, files list, count, syntax_clean, mode, gc_fix flag
5. Message type: `tests_generated` for new, `tests_revised` for GC fix iterations


@@ -0,0 +1,67 @@
---
role: scout
prefix: SCOUT
inner_loop: false
message_types:
success: scan_ready
error: error
issues: issues_found
---
# Multi-Perspective Scout
Scan codebase from multiple perspectives (bug, security, test-coverage, code-quality, UX) to discover potential issues. Produce structured scan results with severity-ranked findings.
## Phase 2: Context & Scope Assessment
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
1. Extract session path and target scope from task description
2. Determine scan scope: explicit scope from task or `**/*` default
3. Get recent changed files: `git diff --name-only HEAD~5 2>/dev/null || echo ""`
4. Read .msg/meta.json for historical defect patterns (`defect_patterns`)
5. Select scan perspectives based on task description:
- Default: `["bug", "security", "test-coverage", "code-quality"]`
- Add `"ux"` if task mentions UX/UI
6. Assess complexity to determine scan strategy:
| Complexity | Condition | Strategy |
|------------|-----------|----------|
| Low | < 5 changed files, no specific keywords | ACE search + Grep inline |
| Medium | 5-15 files or specific perspective requested | CLI fan-out (3 core perspectives) |
| High | > 15 files or full-project scan | CLI fan-out (all perspectives) |
## Phase 3: Multi-Perspective Scan
**Low complexity**: Use `mcp__ace-tool__search_context` for quick pattern-based scan.
**Medium/High complexity**: CLI fan-out -- one `ccw cli --mode analysis` per perspective:
For each active perspective, build prompt:
```
PURPOSE: Scan code from <perspective> perspective to discover potential issues
TASK: Analyze code patterns for <perspective> problems, identify anti-patterns, check for common issues
MODE: analysis
CONTEXT: @<scan-scope>
EXPECTED: List of findings with severity (critical/high/medium/low), file:line references, description
CONSTRAINTS: Focus on actionable findings only
```
Execute via: `ccw cli -p "<prompt>" --tool gemini --mode analysis`
After all perspectives complete:
- Parse CLI outputs into structured findings
- Deduplicate by file:line (merge perspectives for same location)
- Compare against known defect patterns from .msg/meta.json
- Rank by severity: critical > high > medium > low
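The deduplicate-and-rank step can be sketched as a merge keyed on file:line, keeping the worst severity and the union of perspectives:

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def dedupe_and_rank(findings: list) -> list:
    """Merge findings at the same file:line and sort critical-first."""
    merged = {}
    for f in findings:
        key = (f["file"], f["line"])
        if key in merged:
            m = merged[key]
            m["perspectives"] = sorted(set(m["perspectives"]) | {f["perspective"]})
            if SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[m["severity"]]:
                m["severity"] = f["severity"]  # keep the worst severity
        else:
            merged[key] = {"file": f["file"], "line": f["line"],
                           "severity": f["severity"],
                           "perspectives": [f["perspective"]]}
    return sorted(merged.values(), key=lambda m: SEVERITY_RANK[m["severity"]])
```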
## Phase 4: Result Aggregation
1. Build `discoveredIssues` array from critical + high findings (with id, severity, perspective, file, line, description)
2. Write scan results to `<session>/scan/scan-results.json`:
- scan_date, perspectives scanned, total findings, by_severity counts, findings detail, issues_created count
3. Update `<session>/wisdom/.msg/meta.json`: merge `discovered_issues` field
4. Contribute to wisdom/issues.md if new patterns found


@@ -0,0 +1,71 @@
---
role: strategist
prefix: QASTRAT
inner_loop: false
message_types:
success: strategy_ready
error: error
---
# Test Strategist
Analyze change scope, determine test layers (L1-L3), define coverage targets, and generate test strategy document. Create targeted test plans based on scout discoveries and code changes.
## Phase 2: Context & Change Analysis
| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
| Discovered issues | meta.json -> discovered_issues | No |
| Defect patterns | meta.json -> defect_patterns | No |
1. Extract session path from task description
2. Read .msg/meta.json for scout discoveries and historical patterns
3. Analyze change scope: `git diff --name-only HEAD~5`
4. Categorize changed files:
| Category | Pattern |
|----------|---------|
| Source | `\.(ts|tsx|js|jsx|py|java|go|rs)$` |
| Test | `\.(test|spec)\.(ts|tsx|js|jsx)$` or `test_` |
| Config | `\.(json|yaml|yml|toml|env)$` |
5. Detect test framework from package.json / project files
6. Check existing coverage baseline from `coverage/coverage-summary.json`
7. Select analysis mode:
| Total Scope | Mode |
|-------------|------|
| <= 5 files + issues | Direct inline analysis |
| 6-15 | Single CLI analysis |
| > 15 | Multi-dimension CLI analysis |
## Phase 3: Strategy Generation
**Layer Selection Logic**:
| Condition | Layer | Target |
|-----------|-------|--------|
| Has source file changes | L1: Unit Tests | 80% |
| >= 3 source files OR critical issues | L2: Integration Tests | 60% |
| >= 3 critical/high severity issues | L3: E2E Tests | 40% |
| No changes but has scout issues | L1 focused on issue files | 80% |
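The layer selection table reduces to a few threshold checks; the parameter names here are illustrative summaries of the change-scope analysis:

```python
def select_layers(source_changes: int, critical_high_issues: int,
                  scout_issues: int) -> list:
    """Map change scope and issue counts to test layers and coverage targets."""
    layers = []
    if source_changes > 0:
        layers.append({"level": "L1", "name": "Unit Tests", "target": 80})
    elif scout_issues > 0:
        # No code changes, but scout found issues: focus L1 on issue files.
        layers.append({"level": "L1", "name": "Unit Tests (issue-focused)", "target": 80})
    if source_changes >= 3 or critical_high_issues > 0:
        layers.append({"level": "L2", "name": "Integration Tests", "target": 60})
    if critical_high_issues >= 3:
        layers.append({"level": "L3", "name": "E2E Tests", "target": 40})
    return layers
```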
For CLI-assisted analysis, use:
```
PURPOSE: Analyze code changes and scout findings to determine optimal test strategy
TASK: Classify changed files by risk, map issues to test requirements, identify integration points, recommend test layers with coverage targets
MODE: analysis
```
Build strategy document with: scope analysis, layer configs (level, name, target_coverage, focus_files, rationale), priority issues list.
**Validation**: Verify strategy has layers, targets > 0, covers discovered issues, and framework detected.
## Phase 4: Output & Persistence
1. Write strategy to `<session>/strategy/test-strategy.md`
2. Update `<session>/wisdom/.msg/meta.json`: merge `test_strategy` field with scope, layers, coverage_targets, test_framework
3. Contribute to wisdom/decisions.md with layer selection rationale


@@ -1,190 +0,0 @@
# Team Quality Assurance -- CSV Schema
## Master CSV: tasks.csv
### Column Definitions
#### Input Columns (Set by Decomposer)
| Column | Type | Required | Description | Example |
|--------|------|----------|-------------|---------|
| `id` | string | Yes | Unique task identifier (PREFIX-NNN) | `"SCOUT-001"` |
| `title` | string | Yes | Short task title | `"Multi-perspective code scan"` |
| `description` | string | Yes | Detailed task description (self-contained) | `"Scan codebase from multiple perspectives..."` |
| `role` | enum | Yes | Worker role: `scout`, `strategist`, `generator`, `executor`, `analyst` | `"scout"` |
| `perspective` | string | No | Scan perspectives (semicolon-separated, scout only) | `"bug;security;test-coverage;code-quality"` |
| `layer` | string | No | Test layer: `L1`, `L2`, `L3`, or empty | `"L1"` |
| `coverage_target` | string | No | Target coverage percentage for this layer | `"80"` |
| `deps` | string | No | Semicolon-separated dependency task IDs | `"SCOUT-001"` |
| `context_from` | string | No | Semicolon-separated task IDs for context | `"SCOUT-001"` |
| `exec_mode` | enum | Yes | Execution mechanism: `csv-wave` or `interactive` | `"csv-wave"` |
#### Computed Columns (Set by Wave Engine)
| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `wave` | integer | Wave number (1-based, from topological sort) | `2` |
| `prev_context` | string | Aggregated findings from context_from tasks (per-wave CSV only) | `"[SCOUT-001] Found 5 security issues..."` |
#### Output Columns (Set by Agent)
| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `status` | enum | `pending` -> `completed` / `failed` / `skipped` | `"completed"` |
| `findings` | string | Key discoveries (max 500 chars) | `"Found 3 critical security issues..."` |
| `issues_found` | string | Count of issues discovered (scout/analyst) | `"5"` |
| `pass_rate` | string | Test pass rate as decimal (executor only) | `"0.95"` |
| `coverage_achieved` | string | Actual coverage percentage (executor only) | `"82"` |
| `test_files` | string | Semicolon-separated test file paths (generator only) | `"tests/L1-unit/auth.test.ts"` |
| `quality_score` | string | Quality score 0-100 (analyst only) | `"78"` |
| `error` | string | Error message if failed | `""` |
---
### exec_mode Values
| Value | Mechanism | Description |
|-------|-----------|-------------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot batch execution within wave |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round individual execution (executor fix cycles) |
Interactive tasks appear in master CSV for dependency tracking but are NOT included in wave-{N}.csv files.
---
### Role Prefixes
| Role | Prefix | Responsibility Type |
|------|--------|---------------------|
| scout | SCOUT | read-only analysis (multi-perspective scan) |
| strategist | QASTRAT | read-only analysis (strategy formulation) |
| generator | QAGEN | code-gen (test file generation) |
| executor | QARUN | validation (test execution + fix cycles) |
| analyst | QAANA | read-only analysis (quality reporting) |
---
### Example Data
```csv
id,title,description,role,perspective,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,issues_found,pass_rate,coverage_achieved,test_files,quality_score,error
"SCOUT-001","Multi-perspective code scan","Scan codebase from bug, security, test-coverage, code-quality perspectives. Identify issues with severity ranking (critical/high/medium/low) and file:line references. Write scan results to <session>/scan/scan-results.json","scout","bug;security;test-coverage;code-quality","","","","","csv-wave","1","pending","","","","","","",""
"QASTRAT-001","Test strategy formulation","Analyze scout findings and code changes. Determine test layers (L1/L2/L3), define coverage targets, detect test framework, identify priority files. Write strategy to <session>/strategy/test-strategy.md","strategist","","","","SCOUT-001","SCOUT-001","csv-wave","2","pending","","","","","","",""
"QAGEN-L1-001","Generate L1 unit tests","Generate L1 unit tests based on strategy. Read source files, identify exports, generate test cases for happy path, edge cases, error handling. Follow project test conventions. Write tests to <session>/tests/L1-unit/","generator","","L1","80","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
"QAGEN-L2-001","Generate L2 integration tests","Generate L2 integration tests based on strategy. Focus on module interaction points and integration boundaries. Write tests to <session>/tests/L2-integration/","generator","","L2","60","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
"QARUN-L1-001","Execute L1 tests and collect coverage","Run L1 test suite with coverage collection. Parse results for pass rate and coverage. If pass_rate < 0.95 or coverage < 80%, attempt auto-fix (max 3 iterations). Save results to <session>/results/run-L1.json","executor","","L1","80","QAGEN-L1-001","QAGEN-L1-001","interactive","4","pending","","","","","","",""
"QARUN-L2-001","Execute L2 tests and collect coverage","Run L2 integration test suite with coverage. Auto-fix up to 3 iterations. Save results to <session>/results/run-L2.json","executor","","L2","60","QAGEN-L2-001","QAGEN-L2-001","interactive","4","pending","","","","","","",""
"QAANA-001","Quality analysis report","Analyze defect patterns, coverage gaps, test effectiveness. Calculate quality score (0-100). Generate comprehensive report with recommendations. Write to <session>/analysis/quality-report.md","analyst","","","","QARUN-L1-001;QARUN-L2-001","QARUN-L1-001;QARUN-L2-001","csv-wave","5","pending","","","","","","",""
"SCOUT-002","Regression scan","Post-fix regression scan. Verify no new issues introduced by test fixes. Focus on areas modified during GC loops.","scout","bug;security;code-quality","","","QAANA-001","QAANA-001","csv-wave","6","pending","","","","","","",""
```
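The `wave` column above drives dispatch order. A minimal sketch of how a coordinator might group still-pending rows by wave — the helper name is illustrative, and rows are assumed to come from `csv.DictReader` over this file:

```python
from collections import defaultdict

def load_waves(rows):
    """Group still-pending tasks by wave number for dispatch.

    rows: an iterable of dicts (e.g. csv.DictReader over the task CSV).
    Only the id/wave/status columns from the schema above are used.
    """
    waves = defaultdict(list)
    for row in rows:
        if row["status"] == "pending":
            waves[int(row["wave"])].append(row["id"])
    # Sorted so wave 1 is dispatched before wave 2, and so on.
    return dict(sorted(waves.items()))
```

Completed or skipped rows are filtered out, so resuming with `--continue` naturally re-derives only the remaining waves.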
---
### Column Lifecycle
```
Decomposer (Phase 1) Wave Engine (Phase 2) Agent (Execution)
--------------------- -------------------- -----------------
id ----------> id ----------> id
title ----------> title ----------> (reads)
description ----------> description ----------> (reads)
role ----------> role ----------> (reads)
perspective ----------> perspective ----------> (reads)
layer ----------> layer ----------> (reads)
coverage_target -------> coverage_target -------> (reads)
deps ----------> deps ----------> (reads)
context_from----------> context_from----------> (reads)
exec_mode ----------> exec_mode ----------> (reads)
wave ----------> (reads)
prev_context ----------> (reads)
status
findings
issues_found
pass_rate
coverage_achieved
test_files
quality_score
error
```
---
## Output Schema (JSON)
Agents on csv-wave tasks report output via `report_agent_job_result`:
```json
{
"id": "SCOUT-001",
"status": "completed",
"findings": "Multi-perspective scan found 5 issues: 2 security (hardcoded keys, missing auth), 1 bug (null reference), 2 code-quality (duplicated logic, high complexity). All issues logged to discoveries.ndjson.",
"issues_found": "5",
"pass_rate": "",
"coverage_achieved": "",
"test_files": "",
"quality_score": "",
"error": ""
}
```
Interactive tasks write their results as structured text or JSON to `interactive/{id}-result.json`.
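When a result payload comes back, only the result columns should be written into the task's CSV row. A hedged sketch of that merge — the field list mirrors the schema above, and the helper name is hypothetical:

```python
# Result columns an agent may write back; identity and planning
# columns (id, deps, wave, ...) are never overwritten.
WRITABLE = ("status", "findings", "issues_found", "pass_rate",
            "coverage_achieved", "test_files", "quality_score", "error")

def merge_result(row, result):
    """Merge an agent's JSON payload into its task row in place."""
    if result.get("id") != row.get("id"):
        raise ValueError("result id does not match task row")
    for key in WRITABLE:
        if key in result:
            row[key] = result[key]
    return row
```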
---
## Discovery Types
| Type | Dedup Key | Data Schema | Description |
|------|-----------|-------------|-------------|
| `issue_found` | `data.file+data.line` | `{file, line, severity, perspective, description}` | Issue discovered by scout |
| `framework_detected` | `data.framework` | `{framework, config_file, test_pattern}` | Test framework identified |
| `test_generated` | `data.file` | `{file, source_file, test_count}` | Test file created |
| `defect_found` | `data.file+data.line` | `{file, line, pattern, description}` | Defect found during testing |
| `coverage_gap` | `data.file` | `{file, current, target, gap}` | Coverage gap identified |
| `convention_found` | `data.pattern` | `{pattern, example_file, description}` | Test convention detected |
| `fix_applied` | `data.test_file+data.fix_type` | `{test_file, fix_type, description}` | Test fix during GC loop |
| `quality_metric` | `data.dimension` | `{dimension, score, details}` | Quality dimension score |
### Discovery NDJSON Format
```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/auth.ts","line":42,"severity":"high","perspective":"security","description":"Hardcoded secret key in auth module"}}
{"ts":"2026-03-08T10:02:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/user.ts","line":15,"severity":"medium","perspective":"bug","description":"Missing null check on user object"}}
{"ts":"2026-03-08T10:05:00Z","worker":"QASTRAT-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
{"ts":"2026-03-08T10:10:00Z","worker":"QAGEN-L1-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
{"ts":"2026-03-08T10:15:00Z","worker":"QARUN-L1-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
{"ts":"2026-03-08T10:20:00Z","worker":"QAANA-001","type":"quality_metric","data":{"dimension":"coverage_achievement","score":85,"details":"L1: 82%, L2: 68%"}}
```
> Both csv-wave and interactive agents read/write the same discoveries.ndjson file.
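The per-type dedup keys from the table can be enforced at append time. A sketch — the `DEDUP_KEYS` map mirrors the table, and the helper operates on in-memory NDJSON lines rather than the file itself, so both are illustrative:

```python
import json

# Dedup key fields per discovery type (mirrors the table above).
DEDUP_KEYS = {
    "issue_found": ("file", "line"),
    "framework_detected": ("framework",),
    "test_generated": ("file",),
    "defect_found": ("file", "line"),
    "coverage_gap": ("file",),
    "convention_found": ("pattern",),
    "fix_applied": ("test_file", "fix_type"),
    "quality_metric": ("dimension",),
}

def dedup_key(record):
    fields = DEDUP_KEYS[record["type"]]
    return (record["type"],) + tuple(str(record["data"][f]) for f in fields)

def append_discovery(lines, record):
    """Append a record to the NDJSON lines unless its dedup key is
    already present; returns True when the record was appended."""
    seen = {dedup_key(json.loads(ln)) for ln in lines if ln.strip()}
    if dedup_key(record) in seen:
        return False
    lines.append(json.dumps(record))
    return True
```

Re-scanning the full file on every append is fine at this scale; a long-running coordinator would cache the `seen` set instead.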
---
## Cross-Mechanism Context Flow
| Source | Target | Mechanism |
|--------|--------|-----------|
| Scout findings | Strategist prev_context | CSV context_from column |
| CSV task findings | Interactive task | Injected via spawn message |
| Interactive task result | CSV task prev_context | Read from interactive/{id}-result.json |
| Any agent discovery | Any agent | Shared via discoveries.ndjson |
| Executor coverage data | GC loop handler | Read from results/run-{layer}.json |
| Analyst quality score | Regression scout | Injected via prev_context |
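The `context_from` hand-off above reduces to a small assembler. In this sketch the `read_result` callable stands in for loading `interactive/{id}-result.json` or a completed CSV row — the format of the injected string is an assumption:

```python
def build_prev_context(context_from, read_result):
    """Assemble prev_context for a task from its context_from column.

    read_result(task_id) -> result dict or None; for interactive
    predecessors this would load interactive/{id}-result.json.
    """
    parts = []
    for dep in filter(None, context_from.split(";")):
        payload = read_result(dep)
        if payload and payload.get("findings"):
            parts.append(f"[{dep}] {payload['findings']}")
    return "\n".join(parts)
```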
---
## Validation Rules
| Rule | Check | Error |
|------|-------|-------|
| Unique IDs | No duplicate `id` values | "Duplicate task ID: {id}" |
| Valid deps | All dep IDs exist in tasks | "Unknown dependency: {dep_id}" |
| No self-deps | Task cannot depend on itself | "Self-dependency: {id}" |
| No circular deps | Topological sort completes | "Circular dependency detected involving: {ids}" |
| context_from valid | All context IDs exist and in earlier waves | "Invalid context_from: {id}" |
| exec_mode valid | Value is `csv-wave` or `interactive` | "Invalid exec_mode: {value}" |
| Description non-empty | Every task has description | "Empty description for task: {id}" |
| Status enum | status in {pending, completed, failed, skipped} | "Invalid status: {status}" |
| Role valid | role in {scout, strategist, generator, executor, analyst} | "Invalid role: {role}" |
| Layer valid | layer in {L1, L2, L3, ""} | "Invalid layer: {layer}" |
| Perspective valid | If scout, perspective contains valid values | "Invalid perspective: {value}" |
| Coverage target valid | If layer present, coverage_target is numeric | "Invalid coverage target: {value}" |
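The graph-shaped rules above (unique IDs, known deps, no self-deps, no cycles) can be checked in one pass plus a topological sort. A sketch using Kahn's algorithm — the error strings follow the table, everything else is illustrative:

```python
def validate_tasks(tasks):
    """Validate the dependency graph; returns a list of error strings."""
    errors, seen = [], set()
    for t in tasks:
        if t["id"] in seen:
            errors.append(f"Duplicate task ID: {t['id']}")
        seen.add(t["id"])
    deps = {t["id"]: [d for d in t.get("deps", "").split(";") if d]
            for t in tasks}
    dependents = {tid: [] for tid in deps}
    indeg = {tid: 0 for tid in deps}
    for tid, dlist in deps.items():
        for d in dlist:
            if d not in seen:
                errors.append(f"Unknown dependency: {d}")
            elif d == tid:
                errors.append(f"Self-dependency: {tid}")
            else:
                dependents[d].append(tid)
                indeg[tid] += 1
    # Kahn's algorithm: any node never reaching indegree 0 sits on a cycle.
    ready = [tid for tid, n in indeg.items() if n == 0]
    done = 0
    while ready:
        for nxt in dependents[ready.pop()]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
        done += 1
    if done < len(deps):
        cyclic = sorted(t for t, n in indeg.items() if n > 0)
        errors.append("Circular dependency detected involving: "
                      + ", ".join(cyclic))
    return errors
```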

---
# QA Pipelines
Pipeline definitions and task registry for team-quality-assurance.
## Pipeline Modes
| Mode | Description | Entry Role |
|------|-------------|------------|
| discovery | Scout-first: issue discovery then testing | scout |
| testing | Skip scout, direct test pipeline | strategist |
| full | Complete QA closed loop + regression scan | scout |
## Pipeline Definitions
### Discovery Mode (5 tasks, serial)
```
SCOUT-001 -> QASTRAT-001 -> QAGEN-001 -> QARUN-001 -> QAANA-001
```
| Task ID | Role | Dependencies | Description |
|---------|------|-------------|-------------|
| SCOUT-001 | scout | (none) | Multi-perspective issue scanning |
| QASTRAT-001 | strategist | SCOUT-001 | Change scope analysis + test strategy |
| QAGEN-001 | generator | QASTRAT-001 | L1 unit test generation |
| QARUN-001 | executor | QAGEN-001 | L1 test execution + fix cycles |
| QAANA-001 | analyst | QARUN-001 | Defect pattern analysis + quality report |
### Testing Mode (6 tasks, progressive layers)
```
QASTRAT-001 -> QAGEN-L1-001 -> QARUN-L1-001 -> QAGEN-L2-001 -> QARUN-L2-001 -> QAANA-001
```
| Task ID | Role | Dependencies | Layer | Description |
|---------|------|-------------|-------|-------------|
| QASTRAT-001 | strategist | (none) | — | Test strategy formulation |
| QAGEN-L1-001 | generator | QASTRAT-001 | L1 | L1 unit test generation |
| QARUN-L1-001 | executor | QAGEN-L1-001 | L1 | L1 test execution + fix cycles |
| QAGEN-L2-001 | generator | QARUN-L1-001 | L2 | L2 integration test generation |
| QARUN-L2-001 | executor | QAGEN-L2-001 | L2 | L2 test execution + fix cycles |
| QAANA-001 | analyst | QARUN-L2-001 | — | Quality analysis report |
### Full Mode (8 tasks, parallel windows + regression)
```
SCOUT-001 -> QASTRAT-001 -> [QAGEN-L1-001 || QAGEN-L2-001] -> [QARUN-L1-001 || QARUN-L2-001] -> QAANA-001 -> SCOUT-002
```
| Task ID | Role | Dependencies | Layer | Description |
|---------|------|-------------|-------|-------------|
| SCOUT-001 | scout | (none) | — | Multi-perspective issue scanning |
| QASTRAT-001 | strategist | SCOUT-001 | — | Test strategy formulation |
| QAGEN-L1-001 | generator-1 | QASTRAT-001 | L1 | L1 unit test generation (parallel) |
| QAGEN-L2-001 | generator-2 | QASTRAT-001 | L2 | L2 integration test generation (parallel) |
| QARUN-L1-001 | executor-1 | QAGEN-L1-001 | L1 | L1 test execution + fix cycles (parallel) |
| QARUN-L2-001 | executor-2 | QAGEN-L2-001 | L2 | L2 test execution + fix cycles (parallel) |
| QAANA-001 | analyst | QARUN-L1-001, QARUN-L2-001 | — | Quality analysis report |
| SCOUT-002 | scout | QAANA-001 | — | Regression scan after fixes |
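The parallel windows above fall out of a plain readiness check: a task runs as soon as all of its dependencies are completed. A sketch — field names follow the CSV schema, with `;` separating multiple deps:

```python
def ready_tasks(tasks, completed):
    """Return IDs of pending tasks whose deps are all completed."""
    done = set(completed)
    out = []
    for t in tasks:
        deps = [d for d in t.get("deps", "").split(";") if d]
        if t["status"] == "pending" and all(d in done for d in deps):
            out.append(t["id"])
    return out
```

After QASTRAT-001 completes, both generators become ready in the same call, which is exactly the `[QAGEN-L1-001 || QAGEN-L2-001]` window.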
## GC Loop
Generator-Executor iterate per test layer until coverage targets are met:
```
QAGEN -> QARUN -> (if coverage < target)  -> QAGEN-fix -> QARUN-gc
                  (if coverage >= target) -> next layer or QAANA
```
- Max iterations: 3 per layer
- After 3 iterations: accept current coverage with warning
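The loop above can be sketched as a small driver. Here `run_layer` and `fix_tests` are placeholders for the executor run and the generator's fix pass — the return shape `(coverage, iterations, met)` is an assumption:

```python
def gc_loop(layer_target, run_layer, fix_tests, max_iterations=3):
    """Iterate generator/executor until coverage meets the layer target
    or max_iterations fix passes have been spent."""
    coverage = run_layer()
    iterations = 0
    while coverage < layer_target and iterations < max_iterations:
        iterations += 1
        fix_tests(coverage)       # QAGEN-fix: regenerate weak spots
        coverage = run_layer()    # QARUN-gc: re-run and re-measure
    # Caller emits a warning when met is False (target not reached).
    return coverage, iterations, coverage >= layer_target
```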
## Coverage Targets
| Layer | Name | Default Target |
|-------|------|----------------|
| L1 | Unit Tests | 80% |
| L2 | Integration Tests | 60% |
| L3 | E2E Tests | 40% |
## Scan Perspectives
| Perspective | Focus |
|-------------|-------|
| bug | Logic errors, crash paths, null references |
| security | Vulnerabilities, auth bypass, data exposure |
| test-coverage | Untested code paths, missing assertions |
| code-quality | Anti-patterns, complexity, maintainability |
| ux | User-facing issues, accessibility (optional) |
## Session Directory
```
.workflow/.team/QA-<slug>-<YYYY-MM-DD>/
├── .msg/messages.jsonl # Message bus log
├── .msg/meta.json # Session state + cross-role state
├── wisdom/ # Cross-task knowledge
│ ├── learnings.md
│ ├── decisions.md
│ ├── conventions.md
│ └── issues.md
├── scan/ # Scout output
│ └── scan-results.json
├── strategy/ # Strategist output
│ └── test-strategy.md
├── tests/ # Generator output
│ ├── L1-unit/
│ ├── L2-integration/
│ └── L3-e2e/
├── results/ # Executor output
│ ├── run-001.json
│ └── coverage-001.json
└── analysis/ # Analyst output
└── quality-report.md
```
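A session opener might materialize this tree in one go. A sketch, assuming a POSIX-style path layout — the subdirectory tuple mirrors the tree above, while slug and date come from the coordinator:

```python
from pathlib import Path

# Subdirectories mirror the session tree above.
SUBDIRS = (".msg", "wisdom", "scan", "strategy",
           "tests/L1-unit", "tests/L2-integration", "tests/L3-e2e",
           "results", "analysis")

def session_paths(root, slug, day):
    """Compute every directory the session needs, without touching disk."""
    base = Path(root) / ".workflow" / ".team" / f"QA-{slug}-{day}"
    return [base / sub for sub in SUBDIRS]

def create_session(root, slug, day):
    """Create the session skeleton; idempotent on re-runs (--continue)."""
    base = Path(root) / ".workflow" / ".team" / f"QA-{slug}-{day}"
    for sub in SUBDIRS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base
```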

---
{
"team_name": "quality-assurance",
"version": "1.0.0",
"description": "Quality assurance team. Merges the \"software testing\" and \"issue discovery\" capability domains into a discover → verify → fix → regress closed loop",
"skill_entry": "team-quality-assurance",
"invocation": "Skill(skill=\"team-quality-assurance\", args=\"--role=coordinator ...\")",
"roles": {
"coordinator": {
"name": "coordinator",
"responsibility": "Orchestration",
"task_prefix": null,
"description": "QA team coordinator. Orchestrates the pipeline: requirement clarification → mode selection → team creation → task dispatch → monitoring and coordination → quality gating → result reporting",
"message_types_sent": ["mode_selected", "gc_loop_trigger", "quality_gate", "task_unblocked", "error", "shutdown"],
"message_types_received": ["scan_ready", "issues_found", "strategy_ready", "tests_generated", "tests_revised", "tests_passed", "tests_failed", "analysis_ready", "quality_report", "error"],
"commands": ["dispatch", "monitor"]
},
"scout": {
"name": "scout",
"responsibility": "Orchestration (multi-perspective scan orchestration)",
"task_prefix": "SCOUT-*",
"description": "Multi-perspective issue scout. Proactively scans the codebase to surface potential issues from bug, security, UX, test-coverage, and code-quality perspectives",
"message_types_sent": ["scan_ready", "issues_found", "error"],
"message_types_received": [],
"commands": ["scan"],
"cli_tools": ["gemini"]
},
"strategist": {
"name": "strategist",
"responsibility": "Orchestration (strategy formulation)",
"task_prefix": "QASTRAT-*",
"description": "Test strategist. Analyzes change scope, determines test layers (L1-L3), and defines coverage targets",
"message_types_sent": ["strategy_ready", "error"],
"message_types_received": [],
"commands": ["analyze-scope"],
"cli_tools": ["gemini"]
},
"generator": {
"name": "generator",
"responsibility": "Code generation (test code generation)",
"task_prefix": "QAGEN-*",
"description": "Test case generator. Generates test code per strategy and layer; supports L1/L2/L3",
"message_types_sent": ["tests_generated", "tests_revised", "error"],
"message_types_received": [],
"commands": ["generate-tests"],
"cli_tools": ["gemini"]
},
"executor": {
"name": "executor",
"responsibility": "Validation (test execution and repair)",
"task_prefix": "QARUN-*",
"description": "Test executor. Runs test suites, collects coverage data, and loops on auto-fix when tests fail",
"message_types_sent": ["tests_passed", "tests_failed", "coverage_report", "error"],
"message_types_received": [],
"commands": ["run-fix-cycle"],
"cli_tools": ["gemini"]
},
"analyst": {
"name": "analyst",
"responsibility": "Read-only analysis (quality analysis)",
"task_prefix": "QAANA-*",
"description": "Quality analyst. Analyzes defect patterns, coverage gaps, and test effectiveness; generates a comprehensive quality report",
"message_types_sent": ["analysis_ready", "quality_report", "error"],
"message_types_received": [],
"commands": ["quality-report"],
"cli_tools": ["gemini"]
}
},
"pipeline_modes": {
"discovery": {
"description": "Scout scans first → full pipeline",
"stages": ["SCOUT", "QASTRAT", "QAGEN", "QARUN", "QAANA"],
"entry_role": "scout"
},
"testing": {
"description": "Skip scout → direct testing",
"stages": ["QASTRAT", "QAGEN-L1", "QARUN-L1", "QAGEN-L2", "QARUN-L2", "QAANA"],
"entry_role": "strategist"
},
"full": {
"description": "Full QA closed loop + regression scan",
"stages": ["SCOUT", "QASTRAT", "QAGEN-L1", "QAGEN-L2", "QARUN-L1", "QARUN-L2", "QAANA", "SCOUT-REG"],
"entry_role": "scout",
"parallel_stages": [["QAGEN-L1", "QAGEN-L2"], ["QARUN-L1", "QARUN-L2"]]
}
},
"gc_loop": {
"max_iterations": 3,
"trigger": "coverage < target",
"participants": ["generator", "executor"],
"flow": "QAGEN-fix → QARUN-gc → evaluate"
},
"shared_memory": {
"file": "shared-memory.json",
"fields": {
"discovered_issues": { "owner": "scout", "type": "array" },
"test_strategy": { "owner": "strategist", "type": "object" },
"generated_tests": { "owner": "generator", "type": "object" },
"execution_results": { "owner": "executor", "type": "object" },
"defect_patterns": { "owner": "analyst", "type": "array" },
"coverage_history": { "owner": "analyst", "type": "array" },
"quality_score": { "owner": "analyst", "type": "number" }
}
},
"collaboration_patterns": [
"CP-1: Linear Pipeline (Discovery/Testing mode)",
"CP-2: Review-Fix Cycle (GC loop: Generator ↔ Executor)",
"CP-3: Fan-out (Scout multi-perspective scan)",
"CP-5: Escalation (Worker → Coordinator → User)",
"CP-9: Dual-Track (Full mode: L1 + L2 parallel)",
"CP-10: Post-Mortem (Analyst quality report)"
],
"session_directory": {
"pattern": ".workflow/.team/QA-{slug}-{date}",
"subdirectories": ["scan", "strategy", "results", "analysis"]
},
"test_layers": {
"L1": { "name": "Unit Tests", "default_target": 80 },
"L2": { "name": "Integration Tests", "default_target": 60 },
"L3": { "name": "E2E Tests", "default_target": 40 }
}
}