mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-26 19:56:37 +08:00
feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture
- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate) to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -1,769 +1,144 @@
---
name: team-testing
description: Multi-agent test pipeline with progressive layer coverage (L1/L2/L3), Generator-Critic loops for coverage convergence, and shared defect memory. Strategist -> Generator -> Executor -> Analyst with dynamic pipeline selection.
argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"task description or scope\""
allowed-tools: spawn_agents_on_csv, spawn_agent, wait, send_input, close_agent, Read, Write, Edit, Bash, Glob, Grep, request_user_input
description: Unified team skill for testing team. Progressive test coverage through Generator-Critic loops, shared memory, and dynamic layer selection. Triggers on "team testing".
allowed-tools: spawn_agent(*), wait_agent(*), send_input(*), close_agent(*), report_agent_job_result(*), request_user_input(*), Read(*), Write(*), Edit(*), Bash(*), Glob(*), Grep(*)
---

## Auto Mode

When `--yes` or `-y`: Auto-confirm task decomposition, skip interactive validation, use defaults.

# Team Testing

## Usage
Orchestrate multi-agent test pipeline: strategist -> generator -> executor -> analyst. Progressive layer coverage (L1/L2/L3) with Generator-Critic loops for coverage convergence.

```bash
$team-testing "Generate tests for the authentication module"
$team-testing -c 4 "Progressive testing for recent changes with L1+L2 coverage"
$team-testing -y "Test all changed files since last commit"
$team-testing --continue "tst-auth-module-20260308"
```

**Flags**:
- `-y, --yes`: Skip all confirmations (auto mode)
- `-c, --concurrency N`: Max concurrent agents within each wave (default: 3)
- `--continue`: Resume existing session

**Output Directory**: `.workflow/.csv-wave/{session-id}/`
**Core Output**: `tasks.csv` (master state) + `results.csv` (final) + `discoveries.ndjson` (shared exploration) + `context.md` (human-readable report)

---

## Overview

Orchestrate multi-agent test pipeline: strategist -> generator -> executor -> analyst. Progressive layer coverage (L1 unit / L2 integration / L3 E2E) with Generator-Critic (GC) loops for coverage convergence. Dynamic pipeline selection based on change scope (targeted / standard / comprehensive).

**Execution Model**: Hybrid -- CSV wave pipeline (primary) + individual agent spawn (secondary)
## Architecture

```
+-------------------------------------------------------------------+
|                       TEAM TESTING WORKFLOW                       |
+-------------------------------------------------------------------+
|                                                                   |
| Phase 0: Pre-Wave Interactive (Requirement Clarification)         |
| +- Parse task description, detect change scope                    |
| +- Select pipeline (targeted/standard/comprehensive)              |
| +- Output: refined requirements for decomposition                 |
|                                                                   |
| Phase 1: Requirement -> CSV + Classification                      |
| +- Analyze git diff for changed files                             |
| +- Map files to test layers (L1/L2/L3)                            |
| +- Build dependency chain with GC loop tasks                      |
| +- Classify tasks: csv-wave | interactive (exec_mode)             |
| +- Compute dependency waves (topological sort)                    |
| +- Generate tasks.csv with wave + exec_mode columns               |
| +- User validates task breakdown (skip if -y)                     |
|                                                                   |
| Phase 2: Wave Execution Engine (Extended)                         |
| +- For each wave (1..N):                                          |
| |  +- Execute pre-wave interactive tasks (if any)                 |
| |  +- Build wave CSV (filter csv-wave tasks for this wave)        |
| |  +- Inject previous findings into prev_context column           |
| |  +- spawn_agents_on_csv(wave CSV)                               |
| |  +- Execute post-wave interactive tasks (if any)                |
| |  +- Merge all results into master tasks.csv                     |
| |  +- GC Loop Check: coverage < target? -> spawn fix tasks        |
| |  +- Check: any failed? -> skip dependents                       |
| +- discoveries.ndjson shared across all modes (append-only)       |
|                                                                   |
| Phase 3: Post-Wave Interactive (Completion Action)                |
| +- Pipeline completion report with coverage metrics               |
| +- Interactive completion choice (Archive/Keep/Deepen)            |
| +- Final aggregation / report                                     |
|                                                                   |
| Phase 4: Results Aggregation                                      |
| +- Export final results.csv                                       |
| +- Generate context.md with all findings                          |
| +- Display summary: completed/failed/skipped per wave             |
| +- Offer: view results | retry failed | done                      |
|                                                                   |
+-------------------------------------------------------------------+
Skill(skill="team-testing", args="task description")
        |
SKILL.md (this file) = Router
        |
        +--------------+--------------+
        |                             |
   no --role flag               --role <name>
        |                             |
   Coordinator                     Worker
   roles/coordinator/role.md    roles/<name>/role.md
        |
        +-- analyze -> dispatch -> spawn workers -> STOP
                |
        +-------+-------+-------+-------+
            v       v       v       v
        [strat]   [gen]   [exec] [analyst]
   team-worker agents, each loads roles/<role>/role.md
```

---
## Role Registry

## Task Classification Rules
| Role | Path | Prefix | Inner Loop |
|------|------|--------|------------|
| coordinator | [roles/coordinator/role.md](roles/coordinator/role.md) | — | — |
| strategist | [roles/strategist/role.md](roles/strategist/role.md) | STRATEGY-* | false |
| generator | [roles/generator/role.md](roles/generator/role.md) | TESTGEN-* | true |
| executor | [roles/executor/role.md](roles/executor/role.md) | TESTRUN-* | true |
| analyst | [roles/analyst/role.md](roles/analyst/role.md) | TESTANA-* | false |

Each task is classified by `exec_mode`:
## Role Router

| exec_mode | Mechanism | Criteria |
|-----------|-----------|----------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot, structured I/O, no multi-round interaction |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round, needs iterative fix-verify cycles |
Parse `$ARGUMENTS`:
- Has `--role <name>` -> Read `roles/<name>/role.md`, execute Phase 2-4
- No `--role` -> `roles/coordinator/role.md`, execute entry router
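
This routing rule can be sketched as a small helper. This is illustrative only: `resolveRoleSpec` and its regex are assumptions for the sketch, not part of the skill's API.

```javascript
// Sketch of the role router: extract an optional --role flag from the
// raw argument string and pick the role spec file to load.
function resolveRoleSpec(args) {
  const m = args.match(/--role\s+(\S+)/)
  return m
    ? { role: m[1], spec: `roles/${m[1]}/role.md` }                // worker path
    : { role: 'coordinator', spec: 'roles/coordinator/role.md' }   // entry router
}
```

For example, `resolveRoleSpec('--role generator "gen tests"')` resolves to `roles/generator/role.md`, while arguments without `--role` fall through to the coordinator entry router.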

**Classification Decision**:
## Shared Constants

| Task Property | Classification |
|---------------|---------------|
| Strategy formulation (single-pass analysis) | `csv-wave` |
| Test generation (single-pass code creation) | `csv-wave` |
| Test execution with auto-fix cycle | `interactive` |
| Quality analysis (single-pass report) | `csv-wave` |
| GC loop fix-verify iteration | `interactive` |
| Coverage gate decision (coordinator) | `interactive` |
- **Session prefix**: `TST`
- **Session path**: `.workflow/.team/TST-<slug>-<date>/`
- **Team name**: `testing`
- **CLI tools**: `ccw cli --mode analysis` (read-only), `ccw cli --mode write` (modifications)
- **Message bus**: `mcp__ccw-tools__team_msg(session_id=<session-id>, ...)`

---
## Worker Spawn Template

## CSV Schema

### tasks.csv (Master State)

```csv
id,title,description,role,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,pass_rate,coverage_achieved,test_files,error
"STRATEGY-001","Analyze changes and define test strategy","Analyze git diff, detect test framework, determine test layers, define coverage targets, formulate prioritized test strategy","strategist","","","","","csv-wave","1","pending","","","","",""
"TESTGEN-001","Generate L1 unit tests","Generate L1 unit tests for priority files based on test strategy. Follow project test conventions, include happy path, edge cases, error handling","generator","L1","80","STRATEGY-001","STRATEGY-001","csv-wave","2","pending","","","","",""
"TESTRUN-001","Execute L1 tests and collect coverage","Run L1 test suite, collect coverage data, auto-fix failures up to 3 iterations. Report pass rate and coverage percentage","executor","L1","80","TESTGEN-001","TESTGEN-001","interactive","3","pending","","","","",""
```

**Columns**:

| Column | Phase | Description |
|--------|-------|-------------|
| `id` | Input | Unique task identifier (PREFIX-NNN format) |
| `title` | Input | Short task title |
| `description` | Input | Detailed task description (self-contained) |
| `role` | Input | Worker role: `strategist`, `generator`, `executor`, `analyst` |
| `layer` | Input | Test layer: `L1`, `L2`, `L3`, or empty for non-layer tasks |
| `coverage_target` | Input | Target coverage percentage for this layer (empty if N/A) |
| `deps` | Input | Semicolon-separated dependency task IDs |
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
| `exec_mode` | Input | `csv-wave` or `interactive` |
| `wave` | Computed | Wave number (computed by topological sort, 1-based) |
| `status` | Output | `pending` -> `completed` / `failed` / `skipped` |
| `findings` | Output | Key discoveries or implementation notes (max 500 chars) |
| `pass_rate` | Output | Test pass rate as decimal (e.g., "0.95") |
| `coverage_achieved` | Output | Actual coverage percentage achieved |
| `test_files` | Output | Semicolon-separated paths of test files produced |
| `error` | Output | Error message if failed (empty if success) |

### Per-Wave CSV (Temporary)

Each wave generates a temporary `wave-{N}.csv` with extra `prev_context` column (csv-wave tasks only).
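
The phase scripts below call `parseCsv` and `toCsv` without defining them. A minimal quote-aware sketch follows (names match the calls in this file, but the implementation is an assumption; the real helpers may differ, and rows in tasks.csv do contain quoted commas, so naive splitting is not enough):

```javascript
// Split one CSV line into fields, honoring double-quoted fields and
// doubled quotes ("") as escapes, per the common RFC 4180 convention.
function parseCsvLine(line) {
  const fields = []
  let cur = '', inQuotes = false
  for (let i = 0; i < line.length; i++) {
    const ch = line[i]
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cur += '"'; i++ } // escaped quote
      else if (ch === '"') inQuotes = false
      else cur += ch
    } else if (ch === '"') inQuotes = true
    else if (ch === ',') { fields.push(cur); cur = '' }
    else cur += ch
  }
  fields.push(cur)
  return fields
}

// Parse a whole CSV text into row objects keyed by the header columns.
function parseCsv(text) {
  const [header, ...rows] = text.trim().split('\n')
  const cols = parseCsvLine(header)
  return rows.map(r => Object.fromEntries(parseCsvLine(r).map((v, i) => [cols[i], v])))
}

// Serialize row objects back to CSV, quoting every field.
function toCsv(objs, cols = Object.keys(objs[0] ?? {})) {
  const esc = v => `"${String(v ?? '').replace(/"/g, '""')}"`
  return [cols.join(','), ...objs.map(o => cols.map(c => esc(o[c])).join(','))].join('\n')
}
```

Note this sketch does not handle embedded newlines inside quoted fields; the task descriptions above are single-line, so that case does not arise here.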

---

## Agent Registry (Interactive Agents)

| Agent | Role File | Pattern | Responsibility | Position |
|-------|-----------|---------|----------------|----------|
| Test Executor | agents/executor.md | 2.3 (send_input cycle) | Execute tests with iterative fix cycle, report pass rate and coverage | per-wave |
| GC Loop Handler | agents/gc-loop-handler.md | 2.3 (send_input cycle) | Manage Generator-Critic loop: evaluate coverage, trigger fix rounds | post-wave |

> **COMPACT PROTECTION**: Agent files are execution documents. When context compression occurs, **you MUST immediately `Read` the corresponding agent.md** to reload.

---

## Output Artifacts

| File | Purpose | Lifecycle |
|------|---------|-----------|
| `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
| `wave-{N}.csv` | Per-wave input (temporary, csv-wave tasks only) | Created before wave, deleted after |
| `results.csv` | Final export of all task results | Created in Phase 4 |
| `discoveries.ndjson` | Shared exploration board (all agents, both modes) | Append-only, carries across waves |
| `context.md` | Human-readable execution report | Created in Phase 4 |
| `strategy/test-strategy.md` | Strategist output: test strategy document | Created in wave 1 |
| `tests/L1-unit/` | Generator output: L1 unit test files | Created in L1 wave |
| `tests/L2-integration/` | Generator output: L2 integration test files | Created in L2 wave |
| `tests/L3-e2e/` | Generator output: L3 E2E test files | Created in L3 wave |
| `results/run-{layer}.json` | Executor output: per-layer test results | Created per execution |
| `analysis/quality-report.md` | Analyst output: quality analysis report | Created in final wave |
| `interactive/{id}-result.json` | Results from interactive tasks | Created per interactive task |

---

## Session Structure
Coordinator spawns workers using this template:

```
.workflow/.csv-wave/{session-id}/
+-- tasks.csv              # Master state (all tasks, both modes)
+-- results.csv            # Final results export
+-- discoveries.ndjson     # Shared discovery board (all agents)
+-- context.md             # Human-readable report
+-- wave-{N}.csv           # Temporary per-wave input (csv-wave only)
+-- strategy/              # Strategist output
|   +-- test-strategy.md
+-- tests/                 # Generator output
|   +-- L1-unit/
|   +-- L2-integration/
|   +-- L3-e2e/
+-- results/               # Executor output
|   +-- run-L1.json
|   +-- run-L2.json
|   +-- run-L3.json
+-- analysis/              # Analyst output
|   +-- quality-report.md
+-- wisdom/                # Cross-task knowledge
|   +-- learnings.md
|   +-- conventions.md
|   +-- decisions.md
+-- interactive/           # Interactive task artifacts
|   +-- {id}-result.json
+-- gc-state.json          # GC loop tracking state
spawn_agent({
  agent_type: "team_worker",
  items: [
    { type: "text", text: `## Role Assignment
role: <role>
role_spec: <skill_root>/roles/<role>/role.md
session: <session-folder>
session_id: <session-id>
requirement: <task-description>
inner_loop: <true|false>

Read role_spec file (<skill_root>/roles/<role>/role.md) to load Phase 2-4 domain instructions.` },

    { type: "text", text: `## Task Context
task_id: <task-id>
title: <task-title>
description: <task-description>
pipeline_phase: <pipeline-phase>` },

    { type: "text", text: `## Upstream Context
<prev_context>` }
  ]
})
```

---
After spawning, use `wait_agent({ ids: [...], timeout_ms: 900000 })` to collect results, then `close_agent({ id })` each worker.
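
Put together, one coordinator dispatch round looks roughly like this (illustrative orchestration pseudocode in the same style as the template above; `roles` and `buildItems` are placeholders for the coordinator's own dispatch state, not defined tools):

```
const workers = roles.map(role => spawn_agent({ agent_type: "team_worker", items: buildItems(role, task) }))
const results = wait_agent({ ids: workers, timeout_ms: 900000 })
for (const id of workers) close_agent({ id })
```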

## Implementation
## User Commands

### Session Initialization
| Command | Action |
|---------|--------|
| `check` / `status` | View pipeline status graph |
| `resume` / `continue` | Advance to next step |
| `revise <TASK-ID>` | Revise specific task |
| `feedback <text>` | Inject feedback for revision |

```javascript
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString()
## Completion Action

const AUTO_YES = $ARGUMENTS.includes('--yes') || $ARGUMENTS.includes('-y')
const continueMode = $ARGUMENTS.includes('--continue')
const concurrencyMatch = $ARGUMENTS.match(/(?:--concurrency|-c)\s+(\d+)/)
const maxConcurrency = concurrencyMatch ? parseInt(concurrencyMatch[1]) : 3

const requirement = $ARGUMENTS
  .replace(/--yes|-y|--continue|--concurrency\s+\d+|-c\s+\d+/g, '')
  .trim()

const slug = requirement.toLowerCase()
  .replace(/[^a-z0-9\u4e00-\u9fa5]+/g, '-')
  .substring(0, 40)
const dateStr = getUtc8ISOString().substring(0, 10).replace(/-/g, '')
const sessionId = `tst-${slug}-${dateStr}`
const sessionFolder = `.workflow/.csv-wave/${sessionId}`

Bash(`mkdir -p ${sessionFolder}/strategy ${sessionFolder}/tests/L1-unit ${sessionFolder}/tests/L2-integration ${sessionFolder}/tests/L3-e2e ${sessionFolder}/results ${sessionFolder}/analysis ${sessionFolder}/wisdom ${sessionFolder}/interactive`)

// Initialize discoveries.ndjson
Write(`${sessionFolder}/discoveries.ndjson`, '')

// Initialize wisdom files
Write(`${sessionFolder}/wisdom/learnings.md`, '# Learnings\n')
Write(`${sessionFolder}/wisdom/conventions.md`, '# Conventions\n')
Write(`${sessionFolder}/wisdom/decisions.md`, '# Decisions\n')

// Initialize GC state
Write(`${sessionFolder}/gc-state.json`, JSON.stringify({
  rounds: {}, coverage_history: [], max_rounds_per_layer: 3
}, null, 2))
```

---

### Phase 0: Pre-Wave Interactive (Requirement Clarification)

**Objective**: Parse task description, analyze change scope, select pipeline mode.

**Workflow**:

1. **Parse user task description** from $ARGUMENTS

2. **Check for existing sessions** (continue mode):
   - Scan `.workflow/.csv-wave/tst-*/tasks.csv` for sessions with pending tasks
   - If `--continue`: resume the specified or most recent session, skip to Phase 2
   - If active session found: ask user whether to resume or start new

3. **Analyze change scope**:
   ```bash
   git diff --name-only HEAD~1 2>/dev/null || git diff --name-only --cached
   ```

4. **Select pipeline**:

   | Condition | Pipeline | Stages |
   |-----------|----------|--------|
   | fileCount <= 3 AND moduleCount <= 1 | targeted | strategy -> gen-L1 -> run-L1 |
   | fileCount <= 10 AND moduleCount <= 3 | standard | strategy -> gen-L1 -> run-L1 -> gen-L2 -> run-L2 -> analysis |
   | Otherwise | comprehensive | strategy -> [gen-L1 // gen-L2] -> [run-L1 // run-L2] -> gen-L3 -> run-L3 -> analysis |

5. **Clarify if ambiguous** (skip if AUTO_YES):
   ```javascript
   request_user_input({
     questions: [{
       question: `Detected scope suggests '${pipeline}' pipeline. Approve or override?`,
       header: "Pipeline",
       id: "pipeline_select",
       options: [
         { label: "Approve (Recommended)", description: `Use ${pipeline} pipeline as detected` },
         { label: "Targeted", description: "Minimal: L1 only" },
         { label: "Standard/Full", description: "Progressive L1+L2 or comprehensive L1+L2+L3" }
       ]
     }]
   })
   ```

6. **Output**: Refined requirement, pipeline mode, changed file list

**Success Criteria**:
- Pipeline mode selected
- Changed files identified
- Refined requirements available for Phase 1 decomposition
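
The threshold table in step 4 above can be sketched directly (thresholds taken from the table; the function name is illustrative):

```javascript
// Pipeline selection by change scope, per the Phase 0 decision table.
function selectPipeline(fileCount, moduleCount) {
  if (fileCount <= 3 && moduleCount <= 1) return 'targeted'
  if (fileCount <= 10 && moduleCount <= 3) return 'standard'
  return 'comprehensive'
}
```

Note the conditions are checked in order, so a 2-file change spanning 2 modules falls through to `standard`, not `targeted`.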

---

### Phase 1: Requirement -> CSV + Classification

**Objective**: Decompose testing task into dependency-ordered CSV tasks with wave assignments.

**Decomposition Rules**:

1. **Detect test framework** from project files:

   | Signal File | Framework |
   |-------------|-----------|
   | vitest.config.ts/js | Vitest |
   | jest.config.js/ts | Jest |
   | pytest.ini / pyproject.toml | Pytest |
   | No detection | Default to Jest |

2. **Build pipeline task chain** from selected pipeline:

   | Pipeline | Task Chain |
   |----------|------------|
   | targeted | STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001 |
   | standard | STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001 -> TESTGEN-002 -> TESTRUN-002 -> TESTANA-001 |
   | comprehensive | STRATEGY-001 -> [TESTGEN-001, TESTGEN-002] -> [TESTRUN-001, TESTRUN-002] -> TESTGEN-003 -> TESTRUN-003 -> TESTANA-001 |

3. **Assign roles, layers, and coverage targets** per task

4. **Assign exec_mode**:
   - Strategist, Generator, Analyst tasks: `csv-wave` (single-pass)
   - Executor tasks: `interactive` (iterative fix cycle)

**Classification Rules**:

| Task Property | exec_mode |
|---------------|-----------|
| Strategy analysis (single-pass read + write) | `csv-wave` |
| Test code generation (single-pass write) | `csv-wave` |
| Test execution with fix loop (multi-round) | `interactive` |
| Quality analysis (single-pass read + write) | `csv-wave` |

**Wave Computation**: Kahn's BFS topological sort with depth tracking.
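
A minimal sketch of that computation, assuming the in-memory task shape used elsewhere in this file (`id`, `deps` as a semicolon-separated string; `computeWaves` is an illustrative name):

```javascript
// Kahn's BFS topological sort with depth tracking:
// wave(task) = 1 + max(wave of its dependencies). Throws on cycles.
function computeWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]))
  const indegree = new Map(tasks.map(t => [t.id, 0]))
  const dependents = new Map(tasks.map(t => [t.id, []]))
  for (const t of tasks) {
    for (const d of (t.deps || '').split(';').filter(Boolean)) {
      if (!dependents.has(d)) continue // unknown dep: ignored in this sketch
      indegree.set(t.id, indegree.get(t.id) + 1)
      dependents.get(d).push(t.id)
    }
  }
  // Roots (no dependencies) start at wave 1.
  const queue = tasks.filter(t => indegree.get(t.id) === 0).map(t => t.id)
  queue.forEach(id => { byId.get(id).wave = 1 })
  let visited = 0
  while (queue.length) {
    const id = queue.shift()
    visited++
    for (const dep of dependents.get(id)) {
      byId.get(dep).wave = Math.max(byId.get(dep).wave || 0, byId.get(id).wave + 1)
      indegree.set(dep, indegree.get(dep) - 1)
      if (indegree.get(dep) === 0) queue.push(dep)
    }
  }
  if (visited !== tasks.length) throw new Error('Circular dependency detected')
  return tasks
}
```

Running this over the targeted chain STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001 yields waves 1, 2, 3, matching the sample tasks.csv above; a cycle leaves unvisited tasks and raises the "no circular dependencies" error checked in the success criteria.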

**User Validation**: Display task breakdown with wave + exec_mode + layer assignment (skip if AUTO_YES).

**Success Criteria**:
- tasks.csv created with valid schema, wave, and exec_mode assignments
- No circular dependencies
- User approved (or AUTO_YES)

---

### Phase 2: Wave Execution Engine (Extended)

**Objective**: Execute tasks wave-by-wave with hybrid mechanism support, GC loop handling, and cross-wave context propagation.

```javascript
const masterCsv = Read(`${sessionFolder}/tasks.csv`)
let tasks = parseCsv(masterCsv)
const maxWave = Math.max(...tasks.map(t => t.wave))

for (let wave = 1; wave <= maxWave; wave++) {
  console.log(`\nWave ${wave}/${maxWave}`)

  // 1. Separate tasks by exec_mode
  const waveTasks = tasks.filter(t => t.wave === wave && t.status === 'pending')
  const csvTasks = waveTasks.filter(t => t.exec_mode === 'csv-wave')
  const interactiveTasks = waveTasks.filter(t => t.exec_mode === 'interactive')

  // 2. Check dependencies -- skip tasks whose deps failed
  for (const task of waveTasks) {
    const depIds = (task.deps || '').split(';').filter(Boolean)
    const depStatuses = depIds.map(id => tasks.find(t => t.id === id)?.status)
    if (depStatuses.some(s => s === 'failed' || s === 'skipped')) {
      task.status = 'skipped'
      task.error = `Dependency failed: ${depIds.filter((id, i) =>
        ['failed','skipped'].includes(depStatuses[i])).join(', ')}`
    }
  }

  // 3. Execute csv-wave tasks
  const pendingCsvTasks = csvTasks.filter(t => t.status === 'pending')
  if (pendingCsvTasks.length > 0) {
    for (const task of pendingCsvTasks) {
      task.prev_context = buildPrevContext(task, tasks)
    }

    Write(`${sessionFolder}/wave-${wave}.csv`, toCsv(pendingCsvTasks))

    // Read instruction template
    Read(`instructions/agent-instruction.md`)

    // Build instruction with session folder baked in
    const instruction = buildTestingInstruction(sessionFolder, wave)

    spawn_agents_on_csv({
      csv_path: `${sessionFolder}/wave-${wave}.csv`,
      id_column: "id",
      instruction: instruction,
      max_concurrency: maxConcurrency,
      max_runtime_seconds: 900,
      output_csv_path: `${sessionFolder}/wave-${wave}-results.csv`,
      output_schema: {
        type: "object",
        properties: {
          id: { type: "string" },
          status: { type: "string", enum: ["completed", "failed"] },
          findings: { type: "string" },
          pass_rate: { type: "string" },
          coverage_achieved: { type: "string" },
          test_files: { type: "string" },
          error: { type: "string" }
        }
      }
    })

    // Merge results
    const results = parseCsv(Read(`${sessionFolder}/wave-${wave}-results.csv`))
    for (const r of results) {
      const t = tasks.find(t => t.id === r.id)
      if (t) Object.assign(t, r)
    }
  }

  // 4. Execute interactive tasks (executor with fix cycle)
  const pendingInteractive = interactiveTasks.filter(t => t.status === 'pending')
  for (const task of pendingInteractive) {
    Read(`agents/executor.md`)

    const prevContext = buildPrevContext(task, tasks)
    const agent = spawn_agent({
      message: `## TASK ASSIGNMENT\n\n### MANDATORY FIRST STEPS\n1. Read: agents/executor.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n3. Read: .workflow/project-tech.json (if exists)\n\n---\n\nGoal: ${task.description}\nLayer: ${task.layer}\nCoverage Target: ${task.coverage_target}%\nSession: ${sessionFolder}\n\n### Previous Context\n${prevContext}`
    })
    const result = wait({ ids: [agent], timeout_ms: 900000 })
    if (result.timed_out) {
      send_input({ id: agent, message: "Please finalize current test results and report." })
      wait({ ids: [agent], timeout_ms: 120000 })
    }
    Write(`${sessionFolder}/interactive/${task.id}-result.json`, JSON.stringify({
      task_id: task.id, status: result.success ? 'completed' : 'failed', findings: parseFindings(result),
      timestamp: getUtc8ISOString()
    }))
    close_agent({ id: agent })
    task.status = result.success ? 'completed' : 'failed'
    task.findings = parseFindings(result)
  }

  // 5. GC Loop Check (after executor completes)
  for (const task of pendingInteractive.filter(t => t.role === 'executor')) {
    const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
    const layer = task.layer
    const rounds = gcState.rounds[layer] || 0
    const coverageAchieved = parseFloat(task.coverage_achieved || '0')
    const coverageTarget = parseFloat(task.coverage_target || '80')
    const passRate = parseFloat(task.pass_rate || '0')

    if ((coverageAchieved < coverageTarget || passRate < 0.95) && rounds < 3) {
      // Trigger GC fix round
      gcState.rounds[layer] = rounds + 1
      Write(`${sessionFolder}/gc-state.json`, JSON.stringify(gcState, null, 2))

      // Insert fix tasks into tasks array for a subsequent micro-wave
      // TESTGEN-fix task + TESTRUN-fix task
      // These are spawned inline, not added to CSV
      Read(`agents/gc-loop-handler.md`)
      const gcAgent = spawn_agent({
        message: `## GC LOOP ROUND ${rounds + 1}\n\n### MANDATORY FIRST STEPS\n1. Read: agents/gc-loop-handler.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n\nLayer: ${layer}\nRound: ${rounds + 1}/3\nCurrent Coverage: ${coverageAchieved}%\nTarget: ${coverageTarget}%\nPass Rate: ${passRate}\nSession: ${sessionFolder}\nPrevious Results: ${sessionFolder}/results/run-${layer}.json\nTest Directory: ${sessionFolder}/tests/${layer === 'L1' ? 'L1-unit' : layer === 'L2' ? 'L2-integration' : 'L3-e2e'}/`
      })
      const gcResult = wait({ ids: [gcAgent], timeout_ms: 900000 })
      close_agent({ id: gcAgent })
    }
  }

  // 6. Update master CSV
  Write(`${sessionFolder}/tasks.csv`, toCsv(tasks))

  // 7. Cleanup temp files
  Bash(`rm -f ${sessionFolder}/wave-${wave}.csv ${sessionFolder}/wave-${wave}-results.csv`)

  // 8. Display wave summary
  const completed = waveTasks.filter(t => t.status === 'completed').length
  const failed = waveTasks.filter(t => t.status === 'failed').length
  const skipped = waveTasks.filter(t => t.status === 'skipped').length
  console.log(`Wave ${wave} Complete: ${completed} completed, ${failed} failed, ${skipped} skipped`)
}
```
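
The engine above calls a `buildPrevContext` helper that is not defined in this file. A plausible sketch, assuming it concatenates the findings of each upstream task listed in the `context_from` column (the exact format injected into `prev_context` is an assumption):

```javascript
// Collect findings from the tasks named in context_from
// (semicolon-separated IDs), skipping tasks with no findings yet.
function buildPrevContext(task, tasks) {
  const ids = (task.context_from || '').split(';').filter(Boolean)
  return ids
    .map(id => tasks.find(t => t.id === id))
    .filter(t => t && t.findings)
    .map(t => `[${t.id}] ${t.findings}`)
    .join('\n')
}
```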
|
||||
|
||||
**Success Criteria**:
|
||||
- All waves executed in order
|
||||
- Both csv-wave and interactive tasks handled per wave
|
||||
- Each wave's results merged into master CSV before next wave starts
|
||||
- GC loops triggered when coverage below target (max 3 rounds per layer)
|
||||
- Dependent tasks skipped when predecessor failed
|
||||
- discoveries.ndjson accumulated across all waves and mechanisms
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Post-Wave Interactive (Completion Action)
|
||||
|
||||
**Objective**: Pipeline completion report with coverage metrics and interactive completion choice.
|
||||
|
||||
```javascript
|
||||
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
|
||||
const completed = tasks.filter(t => t.status === 'completed')
|
||||
const failed = tasks.filter(t => t.status === 'failed')
|
||||
const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
|
||||
|
||||
// Coverage summary per layer
|
||||
const layerSummary = ['L1', 'L2', 'L3'].map(layer => {
|
||||
const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer && t.status === 'completed')
|
||||
return execTask ? ` ${layer}: ${execTask.coverage_achieved}% coverage, ${execTask.pass_rate} pass rate` : null
|
||||
}).filter(Boolean).join('\n')
|
||||
|
||||
console.log(`
|
||||
============================================
|
||||
TESTING PIPELINE COMPLETE
|
||||
|
||||
Deliverables:
|
||||
${completed.map(t => ` - ${t.id}: ${t.title} (${t.role})`).join('\n')}
|
||||
|
||||
Coverage:
|
||||
${layerSummary}
|
||||
|
||||
GC Rounds: ${JSON.stringify(gcState.rounds)}
|
||||
Pipeline: ${completed.length}/${tasks.length} tasks
|
||||
Session: ${sessionFolder}
|
||||
============================================
|
||||
`)
|
||||
|
||||
if (!AUTO_YES) {
|
||||
request_user_input({
|
||||
questions: [{
|
||||
question: "Testing pipeline complete. Choose next action.",
|
||||
header: "Done",
|
||||
id: "completion",
|
||||
options: [
|
||||
{ label: "Archive (Recommended)", description: "Archive session, output final summary" },
|
||||
{ label: "Keep Active", description: "Keep session for follow-up work" },
|
||||
{ label: "Deepen Coverage", description: "Add more test layers or increase coverage targets" }
|
||||
]
|
||||
}]
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
**Success Criteria**:
|
||||
- Post-wave interactive processing complete
|
||||
- Coverage metrics displayed
|
||||
- User informed of results
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Results Aggregation

**Objective**: Generate final results and a human-readable report.

```javascript
// 1. Export results.csv
Bash(`cp ${sessionFolder}/tasks.csv ${sessionFolder}/results.csv`)

// 2. Generate context.md
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))

let contextMd = `# Team Testing Report\n\n`
contextMd += `**Session**: ${sessionId}\n`
contextMd += `**Date**: ${getUtc8ISOString().substring(0, 10)}\n\n`

contextMd += `## Summary\n`
contextMd += `| Status | Count |\n|--------|-------|\n`
contextMd += `| Completed | ${tasks.filter(t => t.status === 'completed').length} |\n`
contextMd += `| Failed | ${tasks.filter(t => t.status === 'failed').length} |\n`
contextMd += `| Skipped | ${tasks.filter(t => t.status === 'skipped').length} |\n\n`

contextMd += `## Coverage Results\n\n`
contextMd += `| Layer | Coverage | Target | Pass Rate | GC Rounds |\n`
contextMd += `|-------|----------|--------|-----------|-----------|\n`
for (const layer of ['L1', 'L2', 'L3']) {
  const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer)
  if (execTask) {
    contextMd += `| ${layer} | ${execTask.coverage_achieved || 'N/A'}% | ${execTask.coverage_target}% | ${execTask.pass_rate || 'N/A'} | ${gcState.rounds[layer] || 0} |\n`
  }
}
contextMd += '\n'

// CSV fields are strings; coerce wave to a number before comparing
const maxWave = Math.max(...tasks.map(t => Number(t.wave)))
contextMd += `## Wave Execution\n\n`
for (let w = 1; w <= maxWave; w++) {
  const waveTasks = tasks.filter(t => Number(t.wave) === w)
  contextMd += `### Wave ${w}\n\n`
  for (const t of waveTasks) {
    const icon = t.status === 'completed' ? '[DONE]' : t.status === 'failed' ? '[FAIL]' : '[SKIP]'
    contextMd += `${icon} **${t.title}** [${t.role}/${t.layer || '-'}] ${t.findings || ''}\n\n`
  }
}

Write(`${sessionFolder}/context.md`, contextMd)

console.log(`Results exported to: ${sessionFolder}/results.csv`)
console.log(`Report generated at: ${sessionFolder}/context.md`)
```
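`parseCsv` above is assumed to be a coordinator-side helper rather than a library call; a minimal sketch (no support for quoted commas or embedded newlines) might look like:

```javascript
// Minimal CSV parser sketch: header row -> array of row objects.
// All cell values stay as strings; callers coerce types as needed.
function parseCsv(text) {
  const lines = text.trim().split('\n')
  const headers = lines[0].split(',').map(h => h.trim())
  return lines.slice(1).map(line => {
    const cells = line.split(',')
    const row = {}
    headers.forEach((h, i) => { row[h] = (cells[i] || '').trim() })
    return row
  })
}

const rows = parseCsv('id,status,wave\nTESTGEN-001,completed,2\nTESTRUN-001,failed,3')
```

A real tasks.csv may contain commas inside quoted fields (e.g. findings text), in which case a proper CSV library should replace this sketch.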

**Success Criteria**:

- results.csv exported (all tasks, both modes)
- context.md generated with coverage breakdown
- Summary displayed to user

---

## Shared Discovery Board Protocol

All agents (csv-wave and interactive) share a single `discoveries.ndjson` file for cross-task knowledge exchange.

**Format**: One JSON object per line (NDJSON):

```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"STRATEGY-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
{"ts":"2026-03-08T10:05:00Z","worker":"TESTGEN-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
{"ts":"2026-03-08T10:10:00Z","worker":"TESTRUN-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
```

**Discovery Types**:

| Type | Data Schema | Description |
|------|-------------|-------------|
| `framework_detected` | `{framework, config_file, test_pattern}` | Test framework identified |
| `test_generated` | `{file, source_file, test_count}` | Test file created |
| `defect_found` | `{file, line, pattern, description}` | Defect pattern discovered |
| `coverage_gap` | `{file, current, target, gap}` | Coverage gap identified |
| `convention_found` | `{pattern, example_file, description}` | Test convention detected |
| `fix_applied` | `{test_file, fix_type, description}` | Test fix applied during GC loop |

**Protocol**:

1. Agents MUST read discoveries.ndjson at the start of execution
2. Agents MUST append relevant discoveries during execution
3. Agents MUST NOT modify or delete existing entries
4. Deduplicate entries by the `{type, data.file}` key
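The read-and-deduplicate step could be sketched as follows, operating on the raw file text; the function name and shape are illustrative, not part of the skill API:

```javascript
// Parse discoveries.ndjson content: skip malformed lines (per the
// error-handling table) and keep the first entry per {type, data.file} key.
function loadDiscoveries(ndjsonText) {
  const seen = new Set()
  const discoveries = []
  for (const line of ndjsonText.split('\n')) {
    if (!line.trim()) continue
    let entry
    try { entry = JSON.parse(line) } catch { continue } // tolerate corrupt lines
    const key = `${entry.type}:${entry.data && entry.data.file}`
    if (seen.has(key)) continue
    seen.add(key)
    discoveries.push(entry)
  }
  return discoveries
}

const board =
  '{"ts":"t1","worker":"A","type":"defect_found","data":{"file":"src/auth.ts"}}\n' +
  'not json\n' +
  '{"ts":"t2","worker":"B","type":"defect_found","data":{"file":"src/auth.ts"}}\n'
const unique = loadDiscoveries(board)
```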

---

## Pipeline Definitions

### Targeted Pipeline (3 tasks, serial)

```
STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| STRATEGY-001 | strategist | - | 1 | csv-wave |
| TESTGEN-001 | generator | L1 | 2 | csv-wave |
| TESTRUN-001 | executor | L1 | 3 | interactive |

### Standard Pipeline (6 tasks, progressive layers)

```
STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001 -> TESTGEN-002 -> TESTRUN-002 -> TESTANA-001
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| STRATEGY-001 | strategist | - | 1 | csv-wave |
| TESTGEN-001 | generator | L1 | 2 | csv-wave |
| TESTRUN-001 | executor | L1 | 3 | interactive |
| TESTGEN-002 | generator | L2 | 4 | csv-wave |
| TESTRUN-002 | executor | L2 | 5 | interactive |
| TESTANA-001 | analyst | - | 6 | csv-wave |

### Comprehensive Pipeline (8 tasks, parallel windows)

```
STRATEGY-001 -> [TESTGEN-001 // TESTGEN-002] -> [TESTRUN-001 // TESTRUN-002] -> TESTGEN-003 -> TESTRUN-003 -> TESTANA-001
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| STRATEGY-001 | strategist | - | 1 | csv-wave |
| TESTGEN-001 | generator | L1 | 2 | csv-wave |
| TESTGEN-002 | generator | L2 | 2 | csv-wave |
| TESTRUN-001 | executor | L1 | 3 | interactive |
| TESTRUN-002 | executor | L2 | 3 | interactive |
| TESTGEN-003 | generator | L3 | 4 | csv-wave |
| TESTRUN-003 | executor | L3 | 5 | interactive |
| TESTANA-001 | analyst | - | 6 | csv-wave |

---

## GC Loop (Generator-Critic)

Generator and executor iterate per test layer until coverage converges:

```
TESTGEN -> TESTRUN -> (if pass_rate < 0.95 OR coverage < target) -> GC Loop Handler
                      (if pass_rate >= 0.95 AND coverage >= target) -> next wave
```

- Max iterations: 3 per layer
- After 3 iterations: accept current coverage with a warning
- The GC loop runs as an interactive agent (gc-loop-handler.md) that internally generates fixes and re-runs tests

---

## Completion

When the pipeline completes, the coordinator presents:

```
request_user_input({
  questions: [{
    question: "Testing pipeline complete. What would you like to do?",
    header: "Completion",
    multiSelect: false,
    options: [
      { label: "Archive & Clean (Recommended)", description: "Archive session, clean up team" },
      { label: "Keep Active", description: "Keep session for follow-up work" },
      { label: "Deepen Coverage", description: "Add more test layers or increase coverage targets" }
    ]
  }]
})
```

## Session Directory

```
.workflow/.team/TST-<slug>-<date>/
├── .msg/messages.jsonl   # Team message bus
├── .msg/meta.json        # Session metadata
├── wisdom/               # Cross-task knowledge
├── strategy/             # Strategist output
├── tests/                # Generator output (L1-unit/, L2-integration/, L3-e2e/)
├── results/              # Executor output
└── analysis/             # Analyst output
```

## Specs Reference

- [specs/pipelines.md](specs/pipelines.md) — Pipeline definitions and task registry
- [specs/team-config.json](specs/team-config.json) — Team configuration

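The GC convergence rule reduces to a small decision function the coordinator might apply after each executor callback; names are illustrative, thresholds mirror the skill (pass_rate >= 0.95 AND coverage >= target, hard cap of 3 rounds per layer):

```javascript
// Decide the next step after an executor callback in the GC loop.
function nextGcStep({ passRate, coverage, target, round }) {
  if (passRate >= 0.95 && coverage >= target) return 'next_wave'
  if (round >= 3) return 'accept_with_warning'
  return 'gc_loop_handler'
}
```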
## Error Handling

| Error | Resolution |
|-------|------------|
| Circular dependency | Detected during wave computation; abort with an error message |
| CSV agent timeout | Mark as failed in results, continue with the wave |
| CSV agent failed | Mark as failed, skip dependent tasks in later waves |
| Interactive agent timeout | Urge convergence via send_input, then close if still timed out |
| Interactive agent failed | Mark as failed, skip dependents |
| All agents in wave failed | Log error, offer retry or abort |
| CSV parse error | Validate CSV format before execution, show the offending line number |
| discoveries.ndjson corrupt | Ignore malformed lines, continue with valid entries |
| GC loop exceeded (3 rounds) | Accept current coverage with a warning, proceed to next layer |
| Test framework not detected | Default to Jest patterns |
| No changed files found | Fall back to a full project scan with user confirmation |
| Coverage tool unavailable | Degrade to pass-rate judgment |
| Continue mode: no session found | List available sessions, prompt user to select |

---

## Core Rules

1. **Start Immediately**: First action is session initialization, then Phase 0/1
2. **Wave Order Is Sacred**: Never execute wave N before wave N-1 completes and its results are merged
3. **CSV Is Source of Truth**: The master tasks.csv holds all state (both csv-wave and interactive)
4. **CSV First**: Default to csv-wave for tasks; use interactive only when multi-round interaction is required
5. **Context Propagation**: prev_context is built from the master CSV, not from memory
6. **Discovery Board Is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
7. **Skip on Failure**: If a dependency failed, skip the dependent task
8. **GC Loop Discipline**: Max 3 rounds per layer; never infinite-loop on coverage
9. **Clean Up Temp Files**: Remove wave-{N}.csv after results are merged
10. **DO NOT STOP**: Continue execution until all waves complete or all remaining tasks are skipped
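Rule 2's wave ordering can be derived from task dependencies by longest-path leveling; a sketch assuming each task row carries a `depends_on` list (the actual CSV column name may differ):

```javascript
// Assign wave numbers: a task's wave is 1 + max(wave of its dependencies).
// Throws on circular dependencies (see Error Handling).
function computeWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]))
  const waves = new Map()
  const visiting = new Set()
  function waveOf(id) {
    if (waves.has(id)) return waves.get(id)
    if (visiting.has(id)) throw new Error(`Circular dependency at ${id}`)
    visiting.add(id)
    const deps = byId.get(id).depends_on || []
    const w = deps.length === 0 ? 1 : 1 + Math.max(...deps.map(waveOf))
    visiting.delete(id)
    waves.set(id, w)
    return w
  }
  tasks.forEach(t => waveOf(t.id))
  return waves
}

const waves = computeWaves([
  { id: 'STRATEGY-001', depends_on: [] },
  { id: 'TESTGEN-001', depends_on: ['STRATEGY-001'] },
  { id: 'TESTRUN-001', depends_on: ['TESTGEN-001'] }
])
```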

---

## Coordinator Role Constraints (Main Agent)

**CRITICAL**: The coordinator (main agent executing this skill) is responsible for **orchestration only**, NOT implementation.
|
||||
|
||||
15. **Coordinator Does NOT Execute Code**: The main agent MUST NOT write, modify, or implement any code directly. All implementation work is delegated to spawned team agents. The coordinator only:
|
||||
- Spawns agents with task assignments
|
||||
- Waits for agent callbacks
|
||||
- Merges results and coordinates workflow
|
||||
- Manages workflow transitions between phases
|
||||
|
||||
16. **Patient Waiting is Mandatory**: Agent execution takes significant time (typically 10-30 minutes per phase, sometimes longer). The coordinator MUST:
|
||||
- Wait patiently for `wait()` calls to complete
|
||||
- NOT skip workflow steps due to perceived delays
|
||||
- NOT assume agents have failed just because they're taking time
|
||||
- Trust the timeout mechanisms defined in the skill
|
||||
|
||||
17. **Use send_input for Clarification**: When agents need guidance or appear stuck, the coordinator MUST:
|
||||
- Use `send_input()` to ask questions or provide clarification
|
||||
- NOT skip the agent or move to next phase prematurely
|
||||
- Give agents opportunity to respond before escalating
|
||||
- Example: `send_input({ id: agent_id, message: "Please provide status update or clarify blockers" })`
|
||||
|
||||
18. **No Workflow Shortcuts**: The coordinator MUST NOT:
|
||||
- Skip phases or stages defined in the workflow
|
||||
- Bypass required approval or review steps
|
||||
- Execute dependent tasks before prerequisites complete
|
||||
- Assume task completion without explicit agent callback
|
||||
- Make up or fabricate agent results
|
||||
|
||||
19. **Respect Long-Running Processes**: This is a complex multi-agent workflow that requires patience:
|
||||
- Total execution time may range from 30-90 minutes or longer
|
||||
- Each phase may take 10-30 minutes depending on complexity
|
||||
- The coordinator must remain active and attentive throughout the entire process
|
||||
- Do not terminate or skip steps due to time concerns

### Additional Failure Scenarios

| Scenario | Resolution |
|----------|------------|
| Unknown --role value | Error with the available role list |
| Role not found | Error with the expected path (roles/<name>/role.md) |
| CLI tool fails | Worker falls back to direct implementation |
| GC loop exceeded | Accept current coverage with a warning |
| Fast-advance conflict | Coordinator reconciles on next callback |
| Completion action fails | Default to Keep Active |

@@ -1,195 +0,0 @@

# Test Executor Agent

Interactive agent that executes test suites, collects coverage, and performs iterative auto-fix cycles. Acts as the Critic in the Generator-Critic loop.

## Identity

- **Type**: `interactive`
- **Responsibility**: Validation (test execution with fix cycles)

## Boundaries

### MUST

- Load the role definition via the MANDATORY FIRST STEPS pattern
- Run test suites using the correct framework command
- Collect coverage data from test output or coverage reports
- Attempt auto-fix for failing tests (max 3 iterations per invocation)
- Only modify test files; NEVER modify source code
- Save results to the session results directory
- Share defect discoveries to discoveries.ndjson
- Report pass rate and coverage in structured output

### MUST NOT

- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files may be changed)
- Use `@ts-ignore`, `as any`, or skip/ignore test annotations
- Exceed 3 fix iterations without reporting current state
- Delete or disable existing passing tests

---

## Toolbox

### Available Tools

| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test files, source files, strategy, results |
| `Write` | file-write | Save test results, update test files |
| `Edit` | file-edit | Fix test assertions, imports, mocks |
| `Bash` | shell | Run test commands, collect coverage |
| `Glob` | search | Find test files in the session directory |
| `Grep` | search | Find patterns in test output |

---

## Execution

### Phase 1: Context Loading

**Objective**: Detect the test framework and locate test files.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Coverage target | Yes | Minimum coverage percentage |
| Previous context | No | Findings from generator |

**Steps**:

1. Read discoveries.ndjson for framework detection info
2. Determine the layer directory:
   - L1 -> tests/L1-unit/
   - L2 -> tests/L2-integration/
   - L3 -> tests/L3-e2e/
3. Find test files in the layer directory
4. Determine the test framework command:

| Framework | Command Template |
|-----------|------------------|
| vitest | `npx vitest run --coverage --reporter=json <test-dir>` |
| jest | `npx jest --coverage --json --outputFile=<results-path> <test-dir>` |
| pytest | `python -m pytest --cov --cov-report=json -v <test-dir>` |
| default | `npm test -- --coverage` |

**Output**: Framework, test command, test file list
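The framework-to-command mapping in the table reduces to a small lookup; a sketch (the function name is illustrative, command strings are taken from the table):

```javascript
// Map a detected framework name to its command template.
function testCommand(framework, testDir, resultsPath) {
  switch (framework) {
    case 'vitest': return `npx vitest run --coverage --reporter=json ${testDir}`
    case 'jest':   return `npx jest --coverage --json --outputFile=${resultsPath} ${testDir}`
    case 'pytest': return `python -m pytest --cov --cov-report=json -v ${testDir}`
    default:       return 'npm test -- --coverage'
  }
}
```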

---

### Phase 2: Iterative Test-Fix Cycle

**Objective**: Run tests and fix failures, up to 3 iterations.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Test command | Yes | From Phase 1 |
| Test files | Yes | From Phase 1 |
| Coverage target | Yes | From spawn message |

**Steps**:

For each iteration (1..3):

1. Run the test command, capturing stdout/stderr
2. Parse results: extract passed/failed counts, parse coverage
3. Evaluate the exit condition:

| Condition | Action |
|-----------|--------|
| All tests pass AND coverage >= target | Exit loop: SUCCESS |
| pass_rate >= 0.95 AND iteration >= 2 | Exit loop: GOOD ENOUGH |
| iteration >= 3 | Exit loop: MAX ITERATIONS |

4. If not exiting, extract failure details:
   - Error messages and stack traces
   - Failing test file:line references
   - Assertion mismatches

5. Apply targeted fixes:
   - Fix incorrect assertions (expected vs actual swapped)
   - Fix missing imports or broken module paths
   - Fix mock setup issues
   - Fix async/await handling
   - Do NOT skip tests, do NOT add type suppressions

6. Share defect discoveries:
   ```bash
   echo '{"ts":"<ISO>","worker":"<task-id>","type":"defect_found","data":{"file":"<src>","line":<N>,"pattern":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
   ```

**Output**: Final pass rate, coverage achieved, iteration count
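For Jest, step 2's pass-rate extraction can read the counters from the `--json` results object (`numTotalTests` and `numPassedTests` follow Jest's documented JSON output); a minimal sketch:

```javascript
// Compute pass rate from a Jest --json results object.
function passRateFromJest(results) {
  const total = results.numTotalTests
  return total === 0 ? 0 : results.numPassedTests / total
}
```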

---

### Phase 3: Result Recording

**Objective**: Save execution results and update state.

**Steps**:

1. Build the result data:
   ```json
   {
     "layer": "<L1|L2|L3>",
     "framework": "<detected>",
     "iterations": <N>,
     "pass_rate": <decimal>,
     "coverage": <percentage>,
     "tests_passed": <N>,
     "tests_failed": <N>,
     "all_passed": <boolean>,
     "defect_patterns": [...]
   }
   ```

2. Save results to `<session>/results/run-<layer>.json`
3. Save the last test output to `<session>/results/output-<layer>.txt`
4. Record effective test patterns (if pass_rate > 0.8):
   - Happy-path patterns that work
   - Edge-case patterns that catch bugs
   - Error-handling patterns

---

## Structured Output Template

```
## Summary
- Test execution for <layer>: <pass_rate> pass rate, <coverage>% coverage after <N> iterations

## Findings
- Finding 1: specific test result with file:line reference
- Finding 2: defect pattern discovered

## Defect Patterns
- Pattern: type, frequency, severity
- Pattern: type, frequency, severity

## Coverage
- Overall: <N>%
- Target: <N>%
- Gap files: file1 (<N>%), file2 (<N>%)

## Open Questions
1. Any unresolvable test failures (if any)
```

---

## Error Handling

| Scenario | Resolution |
|----------|------------|
| Test command not found | Try alternative commands (npx, npm test); report if all fail |
| No test files found | Report in findings, status = failed |
| Coverage tool unavailable | Degrade to pass rate only, report in findings |
| All tests timeout | Report with partial results, status = failed |
| Import resolution fails after fix | Report remaining failures, continue with other tests |
| Timeout approaching | Output current findings with "PARTIAL" status |

@@ -1,155 +0,0 @@

# GC Loop Handler Agent

Interactive agent that manages Generator-Critic loop iterations. When coverage is below target after the executor completes, this agent generates test fixes and re-runs tests.

## Identity

- **Type**: `interactive`
- **Responsibility**: Orchestration (fix-verify cycle within the GC loop)

## Boundaries

### MUST

- Read previous execution results to understand failures
- Generate targeted test fixes based on failure details
- Re-run tests after fixes to verify improvement
- Track coverage improvement across iterations
- Only modify test files; NEVER modify source code
- Report final coverage and pass rate
- Share fix discoveries to discoveries.ndjson

### MUST NOT

- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files)
- Use `@ts-ignore`, `as any`, or test skip annotations
- Run more than 1 fix-verify cycle per invocation (the coordinator manages the round count)
- Delete or disable passing tests

---

## Toolbox

### Available Tools

| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test results, test files, source files |
| `Write` | file-write | Write fixed test files |
| `Edit` | file-edit | Apply targeted test fixes |
| `Bash` | shell | Run test commands |
| `Glob` | search | Find test files |
| `Grep` | search | Search test output for patterns |

---

## Execution

### Phase 1: Failure Analysis

**Objective**: Understand why tests failed or coverage was insufficient.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Round number | Yes | Current GC round (1-3) |
| Previous results | Yes | Path to run-{layer}.json |

**Steps**:

1. Read previous execution results from results/run-{layer}.json
2. Read test output from results/output-{layer}.txt
3. Categorize failures:

| Failure Type | Detection | Fix Strategy |
|--------------|-----------|--------------|
| Assertion mismatch | "expected X, received Y" | Correct expected values |
| Missing import | "Cannot find module" | Fix import paths |
| Null reference | "Cannot read property of null" | Add null guards in tests |
| Async issue | "timeout", "not resolved" | Fix async/await patterns |
| Mock issue | "mock not called" | Fix mock setup/teardown |
| Type error | "Type X is not assignable" | Fix type annotations |

4. Identify uncovered files from the coverage report

**Output**: Failure categories, fix targets, uncovered areas
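The detection column above amounts to ordered pattern matching on the failure message; a sketch with illustrative category names:

```javascript
// Categorize a test failure message; first matching rule wins.
function categorizeFailure(message) {
  const rules = [
    ['assertion_mismatch', /expected .* received/i],
    ['missing_import', /Cannot find module/i],
    ['null_reference', /Cannot read propert(y|ies) of null/i],
    ['async_issue', /timeout|not resolved/i],
    ['mock_issue', /mock not called/i],
    ['type_error', /is not assignable/i]
  ]
  for (const [category, pattern] of rules) {
    if (pattern.test(message)) return category
  }
  return 'unknown'
}
```

Real runner output varies by framework and version, so these regexes are a starting point, not an exhaustive classifier.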

---

### Phase 2: Fix Generation + Re-execution

**Objective**: Apply fixes and verify improvement.

**Steps**:

1. For each failing test file:
   - Read the test file content
   - Apply targeted fixes based on the failure category
   - Verify the fix does not conceptually break other tests

2. For coverage gaps:
   - Read uncovered source files
   - Generate additional test cases targeting uncovered paths
   - Append to existing test files or create new ones

3. Re-run the test suite with coverage:
   ```bash
   <test-command> 2>&1 || true
   ```

4. Parse new results: pass rate, coverage
5. Calculate the improvement delta

6. Share discoveries:
   ```bash
   echo '{"ts":"<ISO>","worker":"gc-loop-<layer>-R<N>","type":"fix_applied","data":{"test_file":"<path>","fix_type":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
   ```

**Output**: Updated pass rate, coverage, improvement delta

---

### Phase 3: Result Update

**Objective**: Save updated results for coordinator evaluation.

**Steps**:

1. Overwrite results/run-{layer}.json with the new data
2. Save test output to results/output-{layer}.txt
3. Report the improvement delta in findings

---

## Structured Output Template

```
## Summary
- GC Loop Round <N> for <layer>: coverage <before>% -> <after>% (delta: +<N>%)

## Fixes Applied
- Fix 1: <test-file> - <fix-type> - <description>
- Fix 2: <test-file> - <fix-type> - <description>

## Coverage Update
- Before: <N>%, After: <N>%, Target: <N>%
- Pass Rate: <before> -> <after>

## Remaining Issues
- Issue 1: <description> (if any)
```

---

## Error Handling

| Scenario | Resolution |
|----------|------------|
| No previous results found | Report error; cannot proceed without a baseline |
| All fixes cause new failures | Revert fixes, report inability to improve |
| Coverage tool unavailable | Use pass rate as a proxy metric |
| Timeout approaching | Output partial results with current state |

@@ -1,142 +0,0 @@

# Agent Instruction Template -- Team Testing

Base instruction template for CSV wave agents in the testing pipeline. Used by strategist, generator, and analyst roles (csv-wave tasks).

## Purpose

| Phase | Usage |
|-------|-------|
| Phase 1 | Coordinator builds the instruction from this template with the session folder baked in |
| Phase 2 | Injected as the `instruction` parameter to `spawn_agents_on_csv` |

---

## Base Instruction Template

```markdown
## TASK ASSIGNMENT -- Team Testing

### MANDATORY FIRST STEPS
1. Read shared discoveries: <session-folder>/discoveries.ndjson (if it exists; skip if not)
2. Read project context: .workflow/project-tech.json (if it exists)
3. Read test strategy: <session-folder>/strategy/test-strategy.md (if it exists; skip for strategist)

---

## Your Task

**Task ID**: {id}
**Title**: {title}
**Role**: {role}
**Layer**: {layer}
**Coverage Target**: {coverage_target}%

### Task Description
{description}

### Previous Tasks' Findings (Context)
{prev_context}

---

## Execution Protocol

### If Role = strategist

1. **Analyze git diff**: Run `git diff --name-only HEAD~1 2>/dev/null || git diff --name-only --cached` to identify changed files
2. **Detect test framework**: Check for vitest.config.ts, jest.config.js, pytest.ini, pyproject.toml
3. **Scan existing test patterns**: Glob for `**/*.test.*` and `**/*.spec.*` to understand conventions
4. **Formulate strategy**:
   - Classify changed files by impact (new, modified, deleted, config)
   - Determine appropriate test layers (L1/L2/L3)
   - Set coverage targets per layer
   - Prioritize files for testing
   - Document a risk assessment
5. **Write strategy**: Save to <session-folder>/strategy/test-strategy.md
6. **Share discoveries**: Append framework detection and conventions to the discoveries board:
   ```bash
   echo '{"ts":"<ISO8601>","worker":"{id}","type":"framework_detected","data":{"framework":"<name>","config_file":"<path>","test_pattern":"<pattern>"}}' >> <session-folder>/discoveries.ndjson
   ```

### If Role = generator

1. **Read strategy**: Load <session-folder>/strategy/test-strategy.md for layer config and priority files
2. **Read source files**: Load the files listed in the strategy for the target layer
3. **Learn test patterns**: Find 3 existing test files to understand conventions (imports, structure, naming)
4. **Generate tests**: For each priority source file:
   - Determine the test file path following project conventions
   - Generate test cases: happy path, edge cases, error handling
   - Use the proper test framework API (describe/it/test/expect)
   - Include proper imports and mocks
5. **Write test files**: Save to <session-folder>/tests/<layer-dir>/
   - L1 -> tests/L1-unit/
   - L2 -> tests/L2-integration/
   - L3 -> tests/L3-e2e/
6. **Syntax check**: Run `tsc --noEmit` or equivalent to verify syntax
7. **Share discoveries**: Append test generation info to the discoveries board:
   ```bash
   echo '{"ts":"<ISO8601>","worker":"{id}","type":"test_generated","data":{"file":"<test-path>","source_file":"<src-path>","test_count":<N>}}' >> <session-folder>/discoveries.ndjson
   ```

### If Role = analyst

1. **Read all results**: Load <session-folder>/results/run-*.json for execution data
2. **Read strategy**: Load <session-folder>/strategy/test-strategy.md
3. **Read discoveries**: Parse <session-folder>/discoveries.ndjson for defect patterns
4. **Analyze coverage**: Compare achieved vs target per layer
5. **Analyze defect patterns**: Group by type/frequency, assign severity
6. **Assess GC effectiveness**: Review improvement across rounds
7. **Calculate quality score** (0-100):
   - Coverage achievement: 30% weight
   - Test effectiveness: 25% weight
   - Defect detection: 25% weight
   - GC loop efficiency: 20% weight
8. **Generate report**: Write a comprehensive analysis to <session-folder>/analysis/quality-report.md
9. **Share discoveries**: Append analysis findings to the discoveries board

---

## Output (report_agent_job_result)

Return JSON:

{
  "id": "{id}",
  "status": "completed" | "failed",
  "findings": "Key discoveries and implementation notes (max 500 chars)",
  "pass_rate": "test pass rate as a decimal (empty for non-executor tasks)",
  "coverage_achieved": "actual coverage percentage (empty for non-executor tasks)",
  "test_files": "semicolon-separated paths of test files (empty for non-generator tasks)",
  "error": ""
}
```

---

## Quality Requirements

All agents must verify before reporting complete:

| Requirement | Criteria |
|-------------|----------|
| Strategy written | Verify test-strategy.md exists (strategist) |
| Tests generated | Verify test files exist in the correct layer dir (generator) |
| Syntax clean | No compilation errors in generated tests (generator) |
| Report written | Verify quality-report.md exists (analyst) |
| Findings accuracy | Findings reflect actual work done |
| Discovery sharing | At least 1 discovery shared to the board |
| Error reporting | Non-empty error field if status is failed |

---

## Placeholder Reference

| Placeholder | Resolved By | When |
|-------------|-------------|------|
| `<session-folder>` | Skill designer (Phase 1) | Literal path baked into the instruction |
| `{id}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{title}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{description}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{role}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{layer}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{coverage_target}` | spawn_agents_on_csv | Runtime, from CSV row |
| `{prev_context}` | spawn_agents_on_csv | Runtime, from CSV row |

.codex/skills/team-testing/roles/analyst/role.md
@@ -0,0 +1,95 @@

---
|
||||
role: analyst
|
||||
prefix: TESTANA
|
||||
inner_loop: false
|
||||
message_types:
|
||||
success: analysis_ready
|
||||
error: error
|
||||
---
|
||||
|
||||
# Test Quality Analyst
|
||||
|
||||
Analyze defect patterns, identify coverage gaps, assess GC loop effectiveness, and generate a quality report with actionable recommendations.

## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| Execution results | <session>/results/run-*.json | Yes |
| Test strategy | <session>/strategy/test-strategy.md | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |

1. Extract session path from task description
2. Read .msg/meta.json for execution context (executor, generator namespaces)
3. Read all execution results:

```
Glob("<session>/results/run-*.json")
Read("<session>/results/run-001.json")
```

4. Read test strategy:

```
Read("<session>/strategy/test-strategy.md")
```

5. Read test files for pattern analysis:

```
Glob("<session>/tests/**/*")
```

## Phase 3: Quality Analysis

**Analysis dimensions**:

1. **Coverage Analysis** -- Aggregate coverage by layer:

| Layer | Coverage | Target | Status |
|-------|----------|--------|--------|
| L1 | X% | Y% | Met/Below |

2. **Defect Pattern Analysis** -- Frequency and severity:

| Pattern | Frequency | Severity |
|---------|-----------|----------|
| pattern | count | HIGH (>=3) / MEDIUM (>=2) / LOW (<2) |

3. **GC Loop Effectiveness**:

| Metric | Value | Assessment |
|--------|-------|------------|
| Rounds | N | - |
| Coverage Improvement | +/-X% | HIGH (>10%) / MEDIUM (>5%) / LOW (<=5%) |

4. **Coverage Gaps** -- per module/feature:
   - Area, Current %, Gap %, Reason, Recommendation

5. **Quality Score**:

| Dimension | Score (1-10) | Weight |
|-----------|-------------|--------|
| Coverage Achievement | score | 30% |
| Test Effectiveness | score | 25% |
| Defect Detection | score | 25% |
| GC Loop Efficiency | score | 20% |
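The weighted score in the table above can be sketched as follows; the weights come from the table, while the function and field names are illustrative, not a fixed API:

```javascript
// Weighted quality score over the four dimensions (1-10 each).
// Weights mirror the Quality Score table; key names are illustrative.
const WEIGHTS = {
  coverage_achievement: 0.30,
  test_effectiveness: 0.25,
  defect_detection: 0.25,
  gc_loop_efficiency: 0.20,
};

function qualityScore(scores) {
  return Object.entries(WEIGHTS).reduce(
    (sum, [dim, w]) => sum + (scores[dim] ?? 0) * w,
    0
  );
}

// e.g. qualityScore({ coverage_achievement: 8, test_effectiveness: 7,
//                     defect_detection: 9, gc_loop_efficiency: 6 }) -> 7.6
```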

Write report to `<session>/analysis/quality-report.md`

## Phase 4: Trend Analysis & State Update

**Historical comparison** (if multiple sessions exist):

```
Glob(".workflow/.team/TST-*/.msg/meta.json")
```

- Track coverage trends over time
- Identify defect pattern evolution
- Compare GC loop effectiveness across sessions

Update `<session>/wisdom/.msg/meta.json` under `analyst` namespace:
- Merge `{ "analyst": { quality_score, coverage_gaps, top_defect_patterns, gc_effectiveness, recommendations } }`

@@ -0,0 +1,70 @@
# Analyze Task

Parse user task -> detect testing capabilities -> select pipeline -> design roles.

**CONSTRAINT**: Text-level analysis only. NO source code reading, NO codebase exploration.

## Signal Detection

| Keywords | Capability | Prefix |
|----------|------------|--------|
| strategy, plan, layers, scope | strategist | STRATEGY |
| generate tests, write tests, create tests | generator | TESTGEN |
| run tests, execute, coverage | executor | TESTRUN |
| analyze, report, quality, defects | analyst | TESTANA |

## Pipeline Mode Detection

| Condition | Pipeline |
|-----------|----------|
| fileCount <= 3 AND moduleCount <= 1 | targeted |
| fileCount <= 10 AND moduleCount <= 3 | standard |
| Otherwise | comprehensive |
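The selection table above is equivalent to a two-branch threshold check; thresholds come from the table, the function name is illustrative:

```javascript
// Pipeline mode selection per the detection table above.
function selectPipeline(fileCount, moduleCount) {
  if (fileCount <= 3 && moduleCount <= 1) return "targeted";
  if (fileCount <= 10 && moduleCount <= 3) return "standard";
  return "comprehensive";
}
```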

## Dependency Graph

Natural ordering for testing pipeline:
- Tier 0: strategist (change analysis, no upstream dependency)
- Tier 1: generator (requires strategy)
- Tier 2: executor (requires generated tests; GC loop with generator)
- Tier 3: analyst (requires execution results)

## Pipeline Definitions

```
Targeted: STRATEGY -> TESTGEN(L1) -> TESTRUN(L1)
Standard: STRATEGY -> TESTGEN(L1) -> TESTRUN(L1) -> TESTGEN(L2) -> TESTRUN(L2) -> TESTANA
Comprehensive: STRATEGY -> [TESTGEN(L1) || TESTGEN(L2)] -> [TESTRUN(L1) || TESTRUN(L2)] -> TESTGEN(L3) -> TESTRUN(L3) -> TESTANA
```

## Complexity Scoring

| Factor | Points |
|--------|--------|
| Per test layer | +1 |
| Parallel tracks | +1 per track |
| GC loop enabled | +1 |
| Serial depth > 3 | +1 |

Results: 1-2 Low, 3-5 Medium, 6+ High
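The scoring rules above can be sketched as a single function; point values and level cutoffs come from the table and the Results line, the argument shape is illustrative:

```javascript
// Complexity scoring per the factor table above.
function complexity({ layers, parallelTracks, gcLoop, serialDepth }) {
  let score = layers;              // +1 per test layer
  score += parallelTracks;         // +1 per parallel track
  if (gcLoop) score += 1;          // GC loop enabled
  if (serialDepth > 3) score += 1; // deep serial chain
  const level = score <= 2 ? "Low" : score <= 5 ? "Medium" : "High";
  return { score, level };
}
```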

## Role Minimization

- Cap at 5 roles (coordinator + 4 workers)
- GC loop: generator <-> executor iterate up to 3 rounds per layer

## Output

Write <session>/task-analysis.json:
```json
{
  "task_description": "<original>",
  "pipeline_mode": "<targeted|standard|comprehensive>",
  "capabilities": [{ "name": "<cap>", "prefix": "<PREFIX>", "keywords": ["..."] }],
  "dependency_graph": { "<TASK-ID>": { "role": "<role>", "blockedBy": ["..."], "layer": "L1|L2|L3" } },
  "roles": [{ "name": "<role>", "prefix": "<PREFIX>", "inner_loop": true }],
  "complexity": { "score": 0, "level": "Low|Medium|High" },
  "coverage_targets": { "L1": 80, "L2": 60, "L3": 40 },
  "gc_loop_enabled": true
}
```

@@ -0,0 +1,106 @@
# Dispatch Tasks

Create testing task chains with correct dependencies. Supports targeted, standard, and comprehensive pipelines.

## Workflow

1. Read task-analysis.json -> extract pipeline_mode and dependency_graph
2. Read specs/pipelines.md -> get task registry for selected pipeline
3. Topological sort tasks (respect deps)
4. Validate all owners exist in role registry (SKILL.md)
5. For each task (in order):
   - Add task entry to tasks.json `tasks` object (see template below)
   - Set deps array with upstream task IDs
6. Update tasks.json metadata: total count
7. Validate chain (no orphans, no cycles, all refs valid)

## Task Entry Template

Each task in tasks.json `tasks` object:
```json
{
  "<TASK-ID>": {
    "title": "<concise title>",
    "description": "PURPOSE: <goal> | Success: <criteria>\nTASK:\n - <step 1>\n - <step 2>\nCONTEXT:\n - Session: <session-folder>\n - Scope: <scope>\n - Layer: <L1-unit|L2-integration|L3-e2e>\n - Upstream artifacts: <artifact-1>, <artifact-2>\n - Shared memory: <session>/wisdom/.msg/meta.json\nEXPECTED: <deliverable path> + <quality criteria>\nCONSTRAINTS: <scope limits, focus areas>\n---\nInnerLoop: <true|false>\nRoleSpec: <project>/.codex/skills/team-testing/roles/<role>/role.md",
    "role": "<role-name>",
    "prefix": "<PREFIX>",
    "deps": ["<upstream-task-id>"],
    "status": "pending",
    "findings": null,
    "error": null
  }
}
```

## Pipeline Task Registry

### Targeted Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTRUN-001 (executor): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
```

### Standard Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTRUN-001 (executor): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
TESTGEN-002 (generator): Generate L2 integration tests
  deps: [TESTRUN-001], meta: layer=L2-integration
TESTRUN-002 (executor): Execute L2 tests, collect coverage
  deps: [TESTGEN-002], inner_loop: true, meta: layer=L2-integration, coverage_target=60%
TESTANA-001 (analyst): Defect pattern analysis, quality report
  deps: [TESTRUN-002]
```

### Comprehensive Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator-1): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTGEN-002 (generator-2): Generate L2 integration tests
  deps: [STRATEGY-001], meta: layer=L2-integration
TESTRUN-001 (executor-1): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
TESTRUN-002 (executor-2): Execute L2 tests, collect coverage
  deps: [TESTGEN-002], inner_loop: true, meta: layer=L2-integration, coverage_target=60%
TESTGEN-003 (generator): Generate L3 E2E tests
  deps: [TESTRUN-001, TESTRUN-002], meta: layer=L3-e2e
TESTRUN-003 (executor): Execute L3 tests, collect coverage
  deps: [TESTGEN-003], inner_loop: true, meta: layer=L3-e2e, coverage_target=40%
TESTANA-001 (analyst): Defect pattern analysis, quality report
  deps: [TESTRUN-003]
```

## InnerLoop Flag Rules

- true: generator, executor roles (GC loop iterations)
- false: strategist, analyst roles

## Dependency Validation

- No orphan tasks (all tasks have valid owner)
- No circular dependencies
- All deps references exist in tasks object
- Session reference in every task description
- RoleSpec reference in every task description
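The first three checks in the list above can be sketched as one pass over the tasks object (missing-dep scan plus DFS cycle detection); the function name and error strings are illustrative:

```javascript
// Validate a tasks.json `tasks` object: every dep must exist, and the
// dependency graph must be acyclic (DFS with a visiting/done state map).
function validateChain(tasks) {
  const errors = [];
  for (const [id, t] of Object.entries(tasks)) {
    for (const dep of t.deps ?? []) {
      if (!(dep in tasks)) errors.push(`${id}: missing dep ${dep}`);
    }
  }
  const state = {}; // undefined = unvisited, 1 = visiting, 2 = done
  const visit = (id) => {
    if (state[id] === 2) return;
    if (state[id] === 1) { errors.push(`cycle at ${id}`); return; }
    state[id] = 1;
    for (const dep of tasks[id]?.deps ?? []) if (dep in tasks) visit(dep);
    state[id] = 2;
  };
  Object.keys(tasks).forEach(visit);
  return errors;
}
```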

## Log After Creation

```
mcp__ccw-tools__team_msg({
  operation: "log",
  session_id: <session-id>,
  from: "coordinator",
  type: "pipeline_selected",
  data: { pipeline: "<mode>", task_count: <N> }
})
```
242
.codex/skills/team-testing/roles/coordinator/commands/monitor.md
Normal file
@@ -0,0 +1,242 @@
# Monitor Pipeline

Synchronous pipeline coordination using spawn_agent + wait_agent.

## Constants

- WORKER_AGENT: team_worker
- ONE_STEP_PER_INVOCATION: false (synchronous wait loop)
- FAST_ADVANCE_AWARE: true
- MAX_GC_ROUNDS: 3

## Handler Router

| Source | Handler |
|--------|---------|
| "capability_gap" | handleAdapt |
| "check" or "status" | handleCheck |
| "resume" or "continue" | handleResume |
| All tasks completed | handleComplete |
| Default | handleSpawnNext |

## Role-Worker Map

| Prefix | Role | Role Spec | inner_loop |
|--------|------|-----------|------------|
| STRATEGY-* | strategist | `<project>/.codex/skills/team-testing/roles/strategist/role.md` | false |
| TESTGEN-* | generator | `<project>/.codex/skills/team-testing/roles/generator/role.md` | true |
| TESTRUN-* | executor | `<project>/.codex/skills/team-testing/roles/executor/role.md` | true |
| TESTANA-* | analyst | `<project>/.codex/skills/team-testing/roles/analyst/role.md` | false |

## handleCheck

Read-only status report from tasks.json, then STOP.

1. Read tasks.json
2. Count tasks by status (pending, in_progress, completed, failed)

Output:
```
[coordinator] Testing Pipeline Status
[coordinator] Mode: <pipeline_mode>
[coordinator] Progress: <done>/<total> (<pct>%)
[coordinator] GC Rounds: L1: <n>/3, L2: <n>/3

[coordinator] Pipeline Graph:
STRATEGY-001: <done|run|wait> test-strategy.md
TESTGEN-001: <done|run|wait> generating L1...
TESTRUN-001: <done|run|wait> blocked by TESTGEN-001
TESTGEN-002: <done|run|wait> blocked by TESTRUN-001
TESTRUN-002: <done|run|wait> blocked by TESTGEN-002
TESTANA-001: <done|run|wait> blocked by TESTRUN-*

[coordinator] Active agents: <list with elapsed time>
[coordinator] Ready: <pending tasks with resolved deps>
[coordinator] Commands: 'resume' to advance | 'check' to refresh
```

Then STOP.

## handleResume

1. Read tasks.json, check active_agents
2. No active agents -> handleSpawnNext
3. Has active agents -> check each status
   - completed -> mark done
   - in_progress -> still running
4. Some completed -> handleSpawnNext
5. All running -> report status, STOP

## handleSpawnNext

Find ready tasks, spawn workers, wait for completion, process results.

1. Read tasks.json
2. Collect:
   - completedTasks: status = completed
   - inProgressTasks: status = in_progress
   - readyTasks: status = pending AND all deps in completedTasks
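The readyTasks selection above can be sketched as a filter over the tasks object (shape per the dispatch template); the function name is illustrative:

```javascript
// Ready-task selection: pending tasks whose deps have all completed.
function collectReady(tasks) {
  const completed = new Set(
    Object.keys(tasks).filter((id) => tasks[id].status === "completed")
  );
  return Object.keys(tasks).filter(
    (id) =>
      tasks[id].status === "pending" &&
      (tasks[id].deps ?? []).every((d) => completed.has(d))
  );
}
```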
|
||||
|
||||
3. No ready + work in progress -> report waiting, STOP
|
||||
4. No ready + nothing in progress -> handleComplete
|
||||
5. Has ready -> for each ready task:
|
||||
a. Determine role from prefix (use Role-Worker Map)
|
||||
b. Check if inner loop role (generator/executor) with active worker -> skip (worker picks up next task)
|
||||
c. Update task status in tasks.json -> in_progress
|
||||
d. team_msg log -> task_unblocked
|
||||
|
||||
### Spawn Workers
|
||||
|
||||
For each ready task:
|
||||
|
||||
```javascript
|
||||
// 1) Update status in tasks.json
|
||||
state.tasks[taskId].status = 'in_progress'
|
||||
|
||||
// 2) Spawn worker
|
||||
const agentId = spawn_agent({
|
||||
agent_type: "team_worker",
|
||||
items: [
|
||||
{ type: "text", text: `## Role Assignment
|
||||
role: ${task.role}
|
||||
role_spec: ${skillRoot}/roles/${task.role}/role.md
|
||||
session: ${sessionFolder}
|
||||
session_id: ${sessionId}
|
||||
team_name: testing
|
||||
requirement: ${task.description}
|
||||
inner_loop: ${task.role === 'generator' || task.role === 'executor'}
|
||||
|
||||
## Current Task
|
||||
- Task ID: ${taskId}
|
||||
- Task: ${task.title}` },
|
||||
|
||||
{ type: "text", text: `Read role_spec file (${skillRoot}/roles/${task.role}/role.md) to load Phase 2-4 domain instructions.
|
||||
Execute built-in Phase 1 (task discovery) -> role Phase 2-4 -> built-in Phase 5 (report).` },
|
||||
|
||||
{ type: "text", text: `## Task Context
|
||||
task_id: ${taskId}
|
||||
title: ${task.title}
|
||||
description: ${task.description}` },
|
||||
|
||||
{ type: "text", text: `## Upstream Context\n${prevContext}` }
|
||||
]
|
||||
})
|
||||
|
||||
// 3) Track agent
|
||||
state.active_agents[taskId] = { agentId, role: task.role, started_at: now }
|
||||
```
|
||||
|
||||
6. **Parallel spawn** (comprehensive pipeline):
|
||||
- TESTGEN-001 + TESTGEN-002 both unblocked -> spawn both in parallel (name: "generator-1", "generator-2")
|
||||
- TESTRUN-001 + TESTRUN-002 both unblocked -> spawn both in parallel (name: "executor-1", "executor-2")
|
||||
|
||||
### Wait and Process Results
|
||||
|
||||
After spawning all ready tasks:
|
||||
|
||||
```javascript
|
||||
// 4) Batch wait for all spawned workers
|
||||
const agentIds = Object.values(state.active_agents).map(a => a.agentId)
|
||||
wait_agent({ ids: agentIds, timeout_ms: 900000 })
|
||||
|
||||
// 5) Collect results
|
||||
for (const [taskId, agent] of Object.entries(state.active_agents)) {
|
||||
state.tasks[taskId].status = 'completed'
|
||||
close_agent({ id: agent.agentId })
|
||||
delete state.active_agents[taskId]
|
||||
}
|
||||
```
|
||||
|
||||
### GC Checkpoint (TESTRUN-* completes)
|
||||
|
||||
After TESTRUN-* completion, read meta.json for executor.pass_rate and executor.coverage:
|
||||
- (pass_rate >= 0.95 AND coverage >= target) OR gc_rounds[layer] >= MAX_GC_ROUNDS -> proceed
|
||||
- (pass_rate < 0.95 OR coverage < target) AND gc_rounds[layer] < MAX_GC_ROUNDS -> create GC fix tasks, increment gc_rounds[layer]
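The two checkpoint rules above reduce to a single decision; thresholds come from the doc, the function name and return strings are illustrative:

```javascript
// GC checkpoint decision per the rules above.
const MAX_GC_ROUNDS = 3;

function gcDecision(passRate, coverage, target, roundsUsed) {
  const converged = passRate >= 0.95 && coverage >= target;
  if (converged || roundsUsed >= MAX_GC_ROUNDS) return "proceed";
  return "spawn_fix_tasks";
}
```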

**GC Fix Task Creation** (when coverage below target):

Add to tasks.json:
```json
{
  "TESTGEN-<layer>-fix-<round>": {
    "title": "Revise <layer> tests (GC #<round>)",
    "description": "PURPOSE: Revise tests to fix failures and improve coverage | Success: pass_rate >= 0.95 AND coverage >= target\nTASK:\n - Read previous test results and failure details\n - Revise tests to address failures\n - Improve coverage for uncovered areas\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Previous results: <session>/results/run-<N>.json\nEXPECTED: Revised test files in <session>/tests/<layer>/\nCONSTRAINTS: Only modify test files\n---\nInnerLoop: true\nRoleSpec: <project>/.codex/skills/team-testing/roles/generator/role.md",
    "role": "generator",
    "prefix": "TESTGEN",
    "deps": [],
    "status": "pending",
    "findings": null,
    "error": null
  },
  "TESTRUN-<layer>-fix-<round>": {
    "title": "Re-execute <layer> (GC #<round>)",
    "description": "PURPOSE: Re-execute tests after revision | Success: pass_rate >= 0.95\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Input: tests/<layer>\nEXPECTED: <session>/results/run-<N>-gc.json\n---\nInnerLoop: true\nRoleSpec: <project>/.codex/skills/team-testing/roles/executor/role.md",
    "role": "executor",
    "prefix": "TESTRUN",
    "deps": ["TESTGEN-<layer>-fix-<round>"],
    "status": "pending",
    "findings": null,
    "error": null
  }
}
```
Update tasks.json gc_rounds[layer]++

### Persist and Loop

After processing all results:
1. Write updated tasks.json
2. Check if more tasks are now ready (deps newly resolved)
3. If yes -> loop back to step 1 of handleSpawnNext
4. If no more ready and all done -> handleComplete
5. If no more ready but some still blocked -> report status, STOP

## handleComplete

Pipeline done. Generate report and completion action.

1. Verify all tasks (including any GC fix tasks) have status "completed" or "failed"
2. If any tasks incomplete -> return to handleSpawnNext
3. If all complete:
   - Read final state from meta.json (analyst.quality_score, executor.coverage, gc_rounds)
   - Generate summary (deliverables, task count, GC rounds, coverage metrics)
4. Execute completion action per tasks.json completion_action:
   - interactive -> request_user_input (Archive/Keep/Deepen Coverage)
   - auto_archive -> Archive & Clean (rm -rf session folder)
   - auto_keep -> Keep Active (status=paused)

## handleAdapt

Capability gap reported mid-pipeline.

1. Parse gap description
2. Check if existing role covers it -> redirect
3. Role count < 5 -> generate dynamic role-spec in <session>/role-specs/
4. Add new task to tasks.json, spawn worker via spawn_agent + wait_agent
5. Role count >= 5 -> merge or pause

## Fast-Advance Reconciliation

On every coordinator wake:
1. Read team_msg entries with type="fast_advance"
2. Sync active_agents with spawned successors
3. No duplicate spawns

## Phase 4: State Persistence

After every handler execution:
1. Reconcile active_agents with actual tasks.json states
2. Remove entries for completed/failed tasks
3. Write updated tasks.json
4. STOP (wait for next invocation)

## Error Handling

| Scenario | Resolution |
|----------|------------|
| Session file not found | Error, suggest re-initialization |
| Unknown role in callback | Log info, scan for other completions |
| GC loop exceeded (3 rounds) | Accept current coverage with warning, proceed |
| Pipeline stall | Check deps chains, report to user |
| Coverage tool unavailable | Degrade to pass rate judgment |
| Worker crash | Reset task to pending in tasks.json, respawn via spawn_agent |
151
.codex/skills/team-testing/roles/coordinator/role.md
Normal file
@@ -0,0 +1,151 @@
# Coordinator Role

Orchestrate team-testing: analyze -> dispatch -> spawn -> monitor -> report.

## Identity
- Name: coordinator | Tag: [coordinator]
- Responsibility: Change scope analysis -> Create session -> Dispatch tasks -> Monitor progress -> Report results

## Boundaries

### MUST
- Spawn workers via `spawn_agent({ agent_type: "team_worker" })` and wait via `wait_agent`
- Follow Command Execution Protocol for dispatch and monitor commands
- Respect pipeline stage dependencies (deps)
- Handle Generator-Critic cycles with max 3 iterations per layer
- Execute completion action in Phase 5

### MUST NOT
- Implement domain logic (test generation, execution, analysis) -- workers handle this
- Spawn workers without creating tasks first
- Skip quality gates when coverage is below target
- Modify test files or source code directly -- delegate to workers
- Force-advance pipeline past failed GC loops

## Command Execution Protocol
When coordinator needs to execute a specific phase:
1. Read `commands/<command>.md`
2. Follow the workflow defined in the command
3. Commands are inline execution guides, NOT separate agents
4. Execute synchronously, complete before proceeding

## Entry Router

| Detection | Condition | Handler |
|-----------|-----------|---------|
| Status check | Args contain "check" or "status" | -> handleCheck (monitor.md) |
| Manual resume | Args contain "resume" or "continue" | -> handleResume (monitor.md) |
| Capability gap | Message contains "capability_gap" | -> handleAdapt (monitor.md) |
| Pipeline complete | All tasks completed | -> handleComplete (monitor.md) |
| Interrupted session | Active session in .workflow/.team/TST-* | -> Phase 0 |
| New session | None of above | -> Phase 1 |

For check/resume/adapt/complete: load @commands/monitor.md, execute handler, STOP.

## Phase 0: Session Resume Check

1. Scan .workflow/.team/TST-*/tasks.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile:
   a. Read tasks.json, reset in_progress -> pending
   b. Rebuild active_agents map
   c. Kick first ready task via handleSpawnNext
4. Multiple -> request_user_input for selection

## Phase 1: Requirement Clarification

TEXT-LEVEL ONLY. No source code reading.

1. Parse task description from $ARGUMENTS
2. Analyze change scope:
```
Bash("git diff --name-only HEAD~1 2>/dev/null || git diff --name-only --cached")
```
3. Select pipeline:

| Condition | Pipeline |
|-----------|----------|
| fileCount <= 3 AND moduleCount <= 1 | targeted |
| fileCount <= 10 AND moduleCount <= 3 | standard |
| Otherwise | comprehensive |

4. Clarify if ambiguous (request_user_input for scope)
5. Delegate to @commands/analyze.md
6. Output: task-analysis.json
7. CRITICAL: Always proceed to Phase 2, never skip team workflow

## Phase 2: Create Session + Initialize

1. Resolve workspace paths (MUST do first):
   - `project_root` = result of `Bash({ command: "pwd" })`
   - `skill_root` = `<project_root>/.codex/skills/team-testing`
2. Generate session ID: TST-<slug>-<date>
3. Create session folder structure:
```bash
mkdir -p .workflow/.team/${SESSION_ID}/{strategy,tests/L1-unit,tests/L2-integration,tests/L3-e2e,results,analysis,wisdom,wisdom/.msg}
```
4. Read specs/pipelines.md -> select pipeline based on mode
5. Initialize pipeline via team_msg state_update:
```
mcp__ccw-tools__team_msg({
  operation: "log", session_id: "<id>", from: "coordinator",
  type: "state_update", summary: "Session initialized",
  data: {
    pipeline_mode: "<targeted|standard|comprehensive>",
    pipeline_stages: ["strategist", "generator", "executor", "analyst"],
    team_name: "testing",
    coverage_targets: { "L1": 80, "L2": 60, "L3": 40 },
    gc_rounds: {}
  }
})
```
6. Write initial tasks.json:
```json
{
  "session_id": "<id>",
  "pipeline": "<targeted|standard|comprehensive>",
  "requirement": "<original requirement>",
  "created_at": "<ISO timestamp>",
  "coverage_targets": { "L1": 80, "L2": 60, "L3": 40 },
  "gc_rounds": {},
  "completed_waves": [],
  "active_agents": {},
  "tasks": {}
}
```

## Phase 3: Create Task Chain

Delegate to @commands/dispatch.md:
1. Read specs/pipelines.md for selected pipeline's task registry
2. Topological sort tasks
3. Write tasks to tasks.json with deps arrays
4. Update tasks.json metadata

## Phase 4: Spawn-and-Wait

Delegate to @commands/monitor.md#handleSpawnNext:
1. Find ready tasks (pending + deps resolved)
2. Spawn team_worker agents via spawn_agent
3. Wait for completion via wait_agent
4. Process results, advance pipeline
5. Repeat until all waves complete or pipeline blocked

## Phase 5: Report + Completion Action

1. Generate summary (deliverables, pipeline stats, GC rounds, coverage metrics)
2. Execute completion action per tasks.json completion_action:
   - interactive -> request_user_input (Archive/Keep/Deepen Coverage)
   - auto_archive -> Archive & Clean (rm -rf session folder)
   - auto_keep -> Keep Active

## Error Handling

| Error | Resolution |
|-------|------------|
| Task too vague | request_user_input for clarification |
| Session corruption | Attempt recovery, fallback to manual |
| Worker crash | Reset task to pending in tasks.json, respawn via spawn_agent |
| Dependency cycle | Detect in analysis, halt |
| GC loop exceeded (3 rounds) | Accept current coverage, log to wisdom, proceed |
| Coverage tool unavailable | Degrade to pass rate judgment |
96
.codex/skills/team-testing/roles/executor/role.md
Normal file
@@ -0,0 +1,96 @@
---
role: executor
prefix: TESTRUN
inner_loop: true
message_types:
  success: tests_passed
  failure: tests_failed
  coverage: coverage_report
  error: error
---

# Test Executor

Execute tests, collect coverage, attempt auto-fix for failures. Acts as the Critic in the Generator-Critic loop. Reports pass rate and coverage for coordinator GC decisions.

## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| Test directory | Task description (Input: <path>) | Yes |
| Coverage target | Task description (default: 80%) | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |

1. Extract session path and test directory from task description
2. Load test specs: Run `ccw spec load --category test` for test framework conventions and coverage targets
3. Extract coverage target (default: 80%)
4. Read .msg/meta.json for framework info (from strategist namespace)
5. Determine test framework:

| Framework | Run Command |
|-----------|-------------|
| Jest | `npx jest --coverage --json --outputFile=<session>/results/jest-output.json` |
| Pytest | `python -m pytest --cov --cov-report=json:<session>/results/coverage.json -v` |
| Vitest | `npx vitest run --coverage --reporter=json` |

6. Find test files to execute:

```
Glob("<session>/<test-dir>/**/*")
```
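The framework-to-command table above can be sketched as a lookup that substitutes the doc's `<session>` placeholder; the command strings come from the table, the helper name is illustrative:

```javascript
// Run-command lookup per the framework table above.
const RUN_COMMANDS = {
  jest: "npx jest --coverage --json --outputFile=<session>/results/jest-output.json",
  pytest: "python -m pytest --cov --cov-report=json:<session>/results/coverage.json -v",
  vitest: "npx vitest run --coverage --reporter=json",
};

function runCommand(framework, sessionDir) {
  const tpl = RUN_COMMANDS[framework.toLowerCase()];
  if (!tpl) throw new Error(`unknown framework: ${framework}`);
  return tpl.split("<session>").join(sessionDir);
}
```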

## Phase 3: Test Execution + Fix Cycle

**Iterative test-fix cycle** (max 3 iterations):

| Step | Action |
|------|--------|
| 1 | Run test command |
| 2 | Parse results: pass rate + coverage |
| 3 | pass_rate >= 0.95 AND coverage >= target -> success, exit |
| 4 | Extract failing test details |
| 5 | Delegate fix to CLI tool (gemini write mode) |
| 6 | Increment iteration; >= 3 -> exit with failures |

```
Bash("<test-command> 2>&1 || true")
```

**Auto-fix delegation** (on failure):

```
Bash(`ccw cli -p "PURPOSE: Fix test failures to achieve pass rate >= 0.95; success = all tests pass
TASK: • Analyze test failure output • Identify root causes • Fix test code only (not source) • Preserve test intent
MODE: write
CONTEXT: @<session>/<test-dir>/**/* | Memory: Test framework: <framework>, iteration <N>/3
EXPECTED: Fixed test files with: corrected assertions, proper async handling, fixed imports, maintained coverage
CONSTRAINTS: Only modify test files | Preserve test structure | No source code changes
Test failures:
<test-output>" --tool gemini --mode write --cd <session>`)
```

**Save results**: `<session>/results/run-<N>.json`

## Phase 4: Defect Pattern Extraction & State Update

**Extract defect patterns from failures**:

| Pattern Type | Detection Keywords |
|--------------|-------------------|
| Null reference | "null", "undefined", "Cannot read property" |
| Async timing | "timeout", "async", "await", "promise" |
| Import errors | "Cannot find module", "import" |
| Type mismatches | "type", "expected", "received" |

**Record effective test patterns** (if pass_rate > 0.8):

| Pattern | Detection |
|---------|-----------|
| Happy path | "should succeed", "valid input" |
| Edge cases | "edge", "boundary", "limit" |
| Error handling | "should fail", "error", "throw" |

Update `<session>/wisdom/.msg/meta.json` under `executor` namespace:
- Merge `{ "executor": { pass_rate, coverage, defect_patterns, effective_patterns, coverage_history_entry } }`
95
.codex/skills/team-testing/roles/generator/role.md
Normal file
@@ -0,0 +1,95 @@
|
||||
---
|
||||
role: generator
|
||||
prefix: TESTGEN
|
||||
inner_loop: true
|
||||
message_types:
|
||||
success: tests_generated
|
||||
revision: tests_revised
|
||||
error: error
|
||||
---
|
||||
|
||||
# Test Generator
|
||||
|
||||
Generate test code by layer (L1 unit / L2 integration / L3 E2E). Acts as the Generator in the Generator-Critic loop. Supports revision mode for GC loop iterations.
|
||||
|
||||
## Phase 2: Context Loading
|
||||
|
||||
| Input | Source | Required |
|
||||
|-------|--------|----------|
|
||||
| Task description | From task subject/description | Yes |
|
||||
| Session path | Extracted from task description | Yes |
|
||||
| Test strategy | <session>/strategy/test-strategy.md | Yes |
|
||||
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
|
||||
|
||||
1. Extract session path and layer from task description
|
||||
2. Load test specs: Run `ccw spec load --category test` for test framework conventions and coverage targets
|
||||
3. Read test strategy:
|
||||
|
||||
```
|
||||
Read("<session>/strategy/test-strategy.md")
|
||||
```
|
||||
|
||||
3. Read source files to test (from strategy priority_files, limit 20)
|
||||
4. Read .msg/meta.json for framework and scope context
|
||||
|
||||
5. Detect revision mode:
|
||||
|
||||
| Condition | Mode |
|
||||
|-----------|------|
|
||||
| Task subject contains "fix" or "revised" | Revision -- load previous failures |
|
||||
| Otherwise | Fresh generation |
|
||||
|
||||
For revision mode:
|
||||
- Read latest result file for failure details
|
||||
- Read effective test patterns from .msg/meta.json
|
||||
|
||||
6. Read wisdom files if available
|
||||
|
||||
## Phase 3: Test Generation
|
||||
|
||||
**Strategy selection by complexity**:
|
||||
|
||||
| File Count | Strategy |
|
||||
|------------|----------|
|
||||
| <= 3 files | Direct: inline Write/Edit |
|
||||
| 3-5 files | Single code-developer agent |
|
||||
| > 5 files | Batch: group by module, one agent per batch |
|
||||
|
||||
**Direct generation** (per source file):

1. Generate the test path: `<session>/tests/<layer>/<test-file>`
2. Generate test code: happy path, edge cases, error handling
3. Write the test file

**CLI delegation** (medium/high complexity):

```
Bash(`ccw cli -p "PURPOSE: Generate <layer> tests using <framework> to achieve coverage target; success = all priority files covered with quality tests
TASK: • Analyze source files • Generate test cases (happy path, edge cases, errors) • Write test files with proper structure • Ensure import resolution
MODE: write
CONTEXT: @<source-files> @<session>/strategy/test-strategy.md | Memory: Framework: <framework>, Layer: <layer>, Round: <round>
<if-revision: Previous failures: <failure-details>
Effective patterns: <patterns-from-meta>>
EXPECTED: Test files in <session>/tests/<layer>/ with: proper test structure, comprehensive coverage, correct imports, framework conventions
CONSTRAINTS: Follow test strategy priorities | Use framework best practices | <layer>-appropriate assertions
Source files to test:
<file-list-with-content>" --tool gemini --mode write --cd <session>`)
```

**Output verification**:

```
Glob("<session>/tests/<layer>/**/*")
```

## Phase 4: Self-Validation & State Update

**Validation checks**:

| Check | Method | Action on Fail |
|-------|--------|----------------|
| Syntax | `tsc --noEmit` or equivalent | Auto-fix imports/types |
| File count | Count generated files | Report issue |
| Import resolution | Check for broken imports | Fix import paths |

Update `<session>/wisdom/.msg/meta.json` under the `generator` namespace:
- Merge `{ "generator": { test_files, layer, round, is_revision } }`
83 .codex/skills/team-testing/roles/strategist/role.md Normal file
@@ -0,0 +1,83 @@
---
role: strategist
prefix: STRATEGY
inner_loop: false
message_types:
  success: strategy_ready
  error: error
---

# Test Strategist

Analyze git diff, determine test layers, define coverage targets, and formulate a test strategy with a prioritized execution order.

## Phase 2: Context & Environment Detection

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |

1. Extract session path and scope from task description
2. Get git diff for change analysis:

```
Bash("git diff HEAD~1 --name-only 2>/dev/null || git diff --cached --name-only")
Bash("git diff HEAD~1 -- <changed-files> 2>/dev/null || git diff --cached -- <changed-files>")
```

3. Detect test framework from project files:

| Signal File | Framework | Test Pattern |
|-------------|-----------|-------------|
| jest.config.js/ts | Jest | `**/*.test.{ts,tsx,js}` |
| vitest.config.ts/js | Vitest | `**/*.test.{ts,tsx}` |
| pytest.ini / pyproject.toml | Pytest | `**/test_*.py` |
| No detection | Default | Jest patterns |

4. Scan existing test patterns:

```
Glob("**/*.test.*")
Glob("**/*.spec.*")
```

5. Read .msg/meta.json if it exists for session context

## Phase 3: Strategy Formulation

**Change analysis dimensions**:

| Change Type | Analysis | Priority |
|-------------|----------|----------|
| New files | Need new tests | High |
| Modified functions | Need updated tests | Medium |
| Deleted files | Need test cleanup | Low |
| Config changes | May need integration tests | Variable |

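One way to feed this table is to classify `git diff --name-status` output by its status letter. This is a sketch under that assumption; the function names are not part of the skill, and rename/copy statuses are lumped into "other":

```python
import subprocess

CHANGE_PRIORITY = {
    "A": ("new_file", "high"),     # new files need new tests
    "M": ("modified", "medium"),   # modified files need updated tests
    "D": ("deleted", "low"),       # deleted files need test cleanup
}

def classify_status_line(line: str) -> dict:
    """Map one `git diff --name-status` line to a change type and priority."""
    status, _, path = line.partition("\t")
    change, priority = CHANGE_PRIORITY.get(status[:1], ("other", "variable"))
    return {"file": path, "change": change, "priority": priority}

def classify_changes(base: str = "HEAD~1") -> list[dict]:
    """Classify every changed file, falling back to the staged diff (as in step 2)."""
    proc = subprocess.run(["git", "diff", base, "--name-status"],
                          capture_output=True, text=True)
    if proc.returncode != 0:  # e.g. no HEAD~1 yet
        proc = subprocess.run(["git", "diff", "--cached", "--name-status"],
                              capture_output=True, text=True, check=True)
    return [classify_status_line(l) for l in proc.stdout.splitlines() if l]
```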
**Strategy output structure**:

1. **Change Analysis Table**: File, Change Type, Impact, Priority
2. **Test Layer Recommendations**:
   - L1 Unit: Scope, Coverage Target, Priority Files, Patterns
   - L2 Integration: Scope, Coverage Target, Integration Points
   - L3 E2E: Scope, Coverage Target, User Scenarios
3. **Risk Assessment**: Risk, Probability, Impact, Mitigation
4. **Test Execution Order**: Prioritized sequence

Write the strategy to `<session>/strategy/test-strategy.md`

**Self-validation**:

| Check | Criteria | Fallback |
|-------|----------|----------|
| Has L1 scope | L1 scope not empty | Default to all changed files |
| Has coverage targets | L1 target > 0 | Use defaults (80/60/40) |
| Has priority files | List not empty | Use all changed files |

## Phase 4: Wisdom & State Update

1. Write discoveries to `<session>/wisdom/conventions.md` (detected framework, patterns)
2. Update `<session>/wisdom/.msg/meta.json` under the `strategist` namespace:
   - Read existing -> merge `{ "strategist": { framework, layers, coverage_targets, priority_files, risks } }` -> write back

@@ -1,172 +0,0 @@
# Team Testing -- CSV Schema

## Master CSV: tasks.csv

### Column Definitions

#### Input Columns (Set by Decomposer)

| Column | Type | Required | Description | Example |
|--------|------|----------|-------------|---------|
| `id` | string | Yes | Unique task identifier (PREFIX-NNN) | `"STRATEGY-001"` |
| `title` | string | Yes | Short task title | `"Analyze changes and define test strategy"` |
| `description` | string | Yes | Detailed task description (self-contained) | `"Analyze git diff, detect framework..."` |
| `role` | enum | Yes | Worker role: `strategist`, `generator`, `executor`, `analyst` | `"generator"` |
| `layer` | string | No | Test layer: `L1`, `L2`, `L3`, or empty | `"L1"` |
| `coverage_target` | string | No | Target coverage percentage for this layer | `"80"` |
| `deps` | string | No | Semicolon-separated dependency task IDs | `"STRATEGY-001"` |
| `context_from` | string | No | Semicolon-separated task IDs for context | `"STRATEGY-001"` |
| `exec_mode` | enum | Yes | Execution mechanism: `csv-wave` or `interactive` | `"csv-wave"` |

#### Computed Columns (Set by Wave Engine)

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `wave` | integer | Wave number (1-based, from topological sort) | `2` |
| `prev_context` | string | Aggregated findings from context_from tasks (per-wave CSV only) | `"[STRATEGY-001] Detected vitest, L1 target 80%..."` |

#### Output Columns (Set by Agent)

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `status` | enum | `pending` -> `completed` / `failed` / `skipped` | `"completed"` |
| `findings` | string | Key discoveries (max 500 chars) | `"Generated 5 test files covering auth module..."` |
| `pass_rate` | string | Test pass rate as decimal | `"0.95"` |
| `coverage_achieved` | string | Actual coverage percentage achieved | `"82"` |
| `test_files` | string | Semicolon-separated paths of test files | `"tests/L1-unit/auth.test.ts;tests/L1-unit/user.test.ts"` |
| `error` | string | Error message if failed | `""` |

---

### exec_mode Values

| Value | Mechanism | Description |
|-------|-----------|-------------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot batch execution within a wave |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round individual execution (executor fix cycles) |

Interactive tasks appear in the master CSV for dependency tracking but are NOT included in wave-{N}.csv files.

---

### Role Prefixes

| Role | Prefix | Responsibility Type |
|------|--------|---------------------|
| strategist | STRATEGY | read-only analysis |
| generator | TESTGEN | code-gen (test files) |
| executor | TESTRUN | validation (run + fix) |
| analyst | TESTANA | read-only analysis |

---

### Example Data

```csv
id,title,description,role,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,pass_rate,coverage_achieved,test_files,error
"STRATEGY-001","Analyze changes and define test strategy","Analyze git diff for changed files. Detect test framework (vitest/jest/pytest). Determine test layers needed (L1/L2/L3). Define coverage targets per layer. Generate prioritized test strategy document at <session>/strategy/test-strategy.md","strategist","","","","","csv-wave","1","pending","","","","",""
"TESTGEN-001","Generate L1 unit tests","Generate L1 unit tests for priority files from strategy. Read source files, identify exports, generate test cases covering happy path, edge cases, error handling. Write tests to <session>/tests/L1-unit/. Follow project test conventions.","generator","L1","80","STRATEGY-001","STRATEGY-001","csv-wave","2","pending","","","","",""
"TESTRUN-001","Execute L1 tests and collect coverage","Run L1 test suite with coverage collection. Parse results for pass rate and coverage. If pass_rate < 0.95 or coverage < 80%, attempt auto-fix (max 3 iterations). Save results to <session>/results/run-L1.json","executor","L1","80","TESTGEN-001","TESTGEN-001","interactive","3","pending","","","","",""
"TESTGEN-002","Generate L2 integration tests","Generate L2 integration tests based on L1 results and strategy. Focus on module interaction points. Write tests to <session>/tests/L2-integration/.","generator","L2","60","TESTRUN-001","TESTRUN-001","csv-wave","4","pending","","","","",""
"TESTRUN-002","Execute L2 tests and collect coverage","Run L2 integration test suite with coverage. Auto-fix up to 3 iterations. Save results to <session>/results/run-L2.json","executor","L2","60","TESTGEN-002","TESTGEN-002","interactive","5","pending","","","","",""
"TESTANA-001","Quality analysis report","Analyze defect patterns, coverage gaps, GC loop effectiveness. Generate quality report with score and recommendations. Write to <session>/analysis/quality-report.md","analyst","","","TESTRUN-002","TESTRUN-001;TESTRUN-002","csv-wave","6","pending","","","","",""
```

---

### Column Lifecycle

```
Decomposer (Phase 1)         Wave Engine (Phase 2)        Agent (Execution)
---------------------        ---------------------        -----------------
id              ---------->  id              ---------->  id
title           ---------->  title           ---------->  (reads)
description     ---------->  description     ---------->  (reads)
role            ---------->  role            ---------->  (reads)
layer           ---------->  layer           ---------->  (reads)
coverage_target ---------->  coverage_target ---------->  (reads)
deps            ---------->  deps            ---------->  (reads)
context_from    ---------->  context_from    ---------->  (reads)
exec_mode       ---------->  exec_mode       ---------->  (reads)
                             wave            ---------->  (reads)
                             prev_context    ---------->  (reads)
                                                          status
                                                          findings
                                                          pass_rate
                                                          coverage_achieved
                                                          test_files
                                                          error
```

---

## Output Schema (JSON)

Agent output via `report_agent_job_result` (csv-wave tasks):

```json
{
  "id": "TESTGEN-001",
  "status": "completed",
  "findings": "Generated 5 L1 unit test files covering auth, user, and session modules. Total 24 test cases: 15 happy path, 6 edge cases, 3 error handling.",
  "pass_rate": "",
  "coverage_achieved": "",
  "test_files": "tests/L1-unit/auth.test.ts;tests/L1-unit/user.test.ts;tests/L1-unit/session.test.ts",
  "error": ""
}
```

Interactive tasks output via structured text or JSON written to `interactive/{id}-result.json`.

---

## Discovery Types

| Type | Dedup Key | Data Schema | Description |
|------|-----------|-------------|-------------|
| `framework_detected` | `data.framework` | `{framework, config_file, test_pattern}` | Test framework identified |
| `test_generated` | `data.file` | `{file, source_file, test_count}` | Test file created |
| `defect_found` | `data.file+data.line` | `{file, line, pattern, description}` | Defect pattern discovered |
| `coverage_gap` | `data.file` | `{file, current, target, gap}` | Coverage gap identified |
| `convention_found` | `data.pattern` | `{pattern, example_file, description}` | Test convention detected |
| `fix_applied` | `data.test_file+data.fix_type` | `{test_file, fix_type, description}` | Test fix during GC loop |

### Discovery NDJSON Format

```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"STRATEGY-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
{"ts":"2026-03-08T10:05:00Z","worker":"TESTGEN-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
{"ts":"2026-03-08T10:10:00Z","worker":"TESTRUN-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
{"ts":"2026-03-08T10:12:00Z","worker":"TESTRUN-001","type":"fix_applied","data":{"test_file":"tests/L1-unit/auth.test.ts","fix_type":"assertion_fix","description":"Fixed expected return type assertion"}}
```

> Both csv-wave and interactive agents read/write the same discoveries.ndjson file.

---

## Cross-Mechanism Context Flow

| Source | Target | Mechanism |
|--------|--------|-----------|
| CSV task findings | Interactive task | Injected via spawn message |
| Interactive task result | CSV task prev_context | Read from interactive/{id}-result.json |
| Any agent discovery | Any agent | Shared via discoveries.ndjson |
| Executor coverage data | GC loop handler | Read from results/run-{layer}.json |

---

## Validation Rules

| Rule | Check | Error |
|------|-------|-------|
| Unique IDs | No duplicate `id` values | "Duplicate task ID: {id}" |
| Valid deps | All dep IDs exist in tasks | "Unknown dependency: {dep_id}" |
| No self-deps | Task cannot depend on itself | "Self-dependency: {id}" |
| No circular deps | Topological sort completes | "Circular dependency detected involving: {ids}" |
| context_from valid | All context IDs exist and in earlier waves | "Invalid context_from: {id}" |
| exec_mode valid | Value is `csv-wave` or `interactive` | "Invalid exec_mode: {value}" |
| Description non-empty | Every task has description | "Empty description for task: {id}" |
| Status enum | status in {pending, completed, failed, skipped} | "Invalid status: {status}" |
| Role valid | role in {strategist, generator, executor, analyst} | "Invalid role: {role}" |
| Layer valid | layer in {L1, L2, L3, ""} | "Invalid layer: {layer}" |
| Coverage target valid | If layer present, coverage_target is numeric | "Invalid coverage target: {value}" |

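A minimal validator covering a subset of these rules (unique IDs, dep existence, self-deps, descriptions, exec_mode, and cycle detection via Kahn's topological sort) might look like this. It assumes tasks as dicts with the CSV column names:

```python
def validate_tasks(tasks: list[dict]) -> list[str]:
    """Check a subset of the rules above; returns error strings (empty list = valid)."""
    errors, id_set = [], set()
    for t in tasks:
        if t["id"] in id_set:
            errors.append(f"Duplicate task ID: {t['id']}")
        id_set.add(t["id"])
    for t in tasks:
        deps = [d for d in t.get("deps", "").split(";") if d]
        for d in deps:
            if d not in id_set:
                errors.append(f"Unknown dependency: {d}")
        if t["id"] in deps:
            errors.append(f"Self-dependency: {t['id']}")
        if not t.get("description"):
            errors.append(f"Empty description for task: {t['id']}")
        if t.get("exec_mode") not in ("csv-wave", "interactive"):
            errors.append(f"Invalid exec_mode: {t.get('exec_mode')}")
    # Circular-dependency check: Kahn's algorithm must visit every node.
    indeg = {i: 0 for i in id_set}
    adj = {i: [] for i in id_set}
    for t in tasks:
        for d in t.get("deps", "").split(";"):
            if d and d in id_set:
                adj[d].append(t["id"])
                indeg[t["id"]] += 1
    queue = [i for i, n in indeg.items() if n == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for nxt in adj[node]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    if visited < len(id_set):
        stuck = sorted(i for i, n in indeg.items() if n > 0)
        errors.append(f"Circular dependency detected involving: {stuck}")
    return errors
```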
101 .codex/skills/team-testing/specs/pipelines.md Normal file
@@ -0,0 +1,101 @@
# Testing Pipelines

Pipeline definitions and task registry for team-testing.

## Pipeline Selection

| Condition | Pipeline |
|-----------|----------|
| fileCount <= 3 AND moduleCount <= 1 | targeted |
| fileCount <= 10 AND moduleCount <= 3 | standard |
| Otherwise | comprehensive |

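The selection table reads top to bottom, first match wins; a sketch (function name is illustrative):

```python
def select_pipeline(file_count: int, module_count: int) -> str:
    """Pick a pipeline per the selection table above (first matching row wins)."""
    if file_count <= 3 and module_count <= 1:
        return "targeted"
    if file_count <= 10 and module_count <= 3:
        return "standard"
    return "comprehensive"
```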
## Pipeline Definitions

### Targeted Pipeline (3 tasks, serial)

```
STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001
```

| Task ID | Role | Dependencies | Layer | Description |
|---------|------|-------------|-------|-------------|
| STRATEGY-001 | strategist | (none) | — | Analyze changes, define test strategy |
| TESTGEN-001 | generator | STRATEGY-001 | L1 | Generate L1 unit tests |
| TESTRUN-001 | executor | TESTGEN-001 | L1 | Execute L1 tests, collect coverage |

### Standard Pipeline (6 tasks, progressive layers)

```
STRATEGY-001 -> TESTGEN-001 -> TESTRUN-001 -> TESTGEN-002 -> TESTRUN-002 -> TESTANA-001
```

| Task ID | Role | Dependencies | Layer | Description |
|---------|------|-------------|-------|-------------|
| STRATEGY-001 | strategist | (none) | — | Analyze changes, define test strategy |
| TESTGEN-001 | generator | STRATEGY-001 | L1 | Generate L1 unit tests |
| TESTRUN-001 | executor | TESTGEN-001 | L1 | Execute L1 tests, collect coverage |
| TESTGEN-002 | generator | TESTRUN-001 | L2 | Generate L2 integration tests |
| TESTRUN-002 | executor | TESTGEN-002 | L2 | Execute L2 tests, collect coverage |
| TESTANA-001 | analyst | TESTRUN-002 | — | Defect pattern analysis, quality report |

### Comprehensive Pipeline (8 tasks, parallel windows)

```
STRATEGY-001 -> [TESTGEN-001 || TESTGEN-002] -> [TESTRUN-001 || TESTRUN-002] -> TESTGEN-003 -> TESTRUN-003 -> TESTANA-001
```

| Task ID | Role | Dependencies | Layer | Description |
|---------|------|-------------|-------|-------------|
| STRATEGY-001 | strategist | (none) | — | Analyze changes, define test strategy |
| TESTGEN-001 | generator-1 | STRATEGY-001 | L1 | Generate L1 unit tests (parallel) |
| TESTGEN-002 | generator-2 | STRATEGY-001 | L2 | Generate L2 integration tests (parallel) |
| TESTRUN-001 | executor-1 | TESTGEN-001 | L1 | Execute L1 tests (parallel) |
| TESTRUN-002 | executor-2 | TESTGEN-002 | L2 | Execute L2 tests (parallel) |
| TESTGEN-003 | generator | TESTRUN-001, TESTRUN-002 | L3 | Generate L3 E2E tests |
| TESTRUN-003 | executor | TESTGEN-003 | L3 | Execute L3 tests, collect coverage |
| TESTANA-001 | analyst | TESTRUN-003 | — | Defect pattern analysis, quality report |

## GC Loop (Generator-Critic)

Generator and executor iterate per test layer:

```
TESTGEN -> TESTRUN -> (if pass_rate < 0.95 OR coverage < target) -> TESTGEN-fix -> TESTRUN-fix
                      (if pass_rate >= 0.95 AND coverage >= target) -> next layer or TESTANA
```

- Max iterations: 3 per layer
- After 3 iterations: accept the current state with a warning

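The iteration above can be sketched as a driver that regenerates until the convergence trigger fires or the iteration cap is hit. The callables and their shapes are assumptions for illustration, not the skill's actual API:

```python
def gc_loop(generate, run, coverage_target: float, max_rounds: int = 3):
    """Generator-Critic iteration: regenerate until converged or max_rounds reached.

    `generate(round_no, last_result)` writes/revises tests; `run()` returns
    {"pass_rate": float, "coverage": float}. Both are caller-supplied.
    """
    result = None
    for round_no in range(1, max_rounds + 1):
        generate(round_no, result)   # round > 1 sees the previous failures
        result = run()
        if result["pass_rate"] >= 0.95 and result["coverage"] >= coverage_target:
            return result, True      # converged
    return result, False             # cap reached: accept current state with a warning
```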
## Coverage Targets

| Layer | Name | Default Target |
|-------|------|----------------|
| L1 | Unit Tests | 80% |
| L2 | Integration Tests | 60% |
| L3 | E2E Tests | 40% |

## Session Directory

```
.workflow/.team/TST-<slug>-<YYYY-MM-DD>/
├── .msg/messages.jsonl        # Message bus log
├── .msg/meta.json             # Session metadata
├── wisdom/                    # Cross-task knowledge
│   ├── learnings.md
│   ├── decisions.md
│   ├── conventions.md
│   └── issues.md
├── strategy/                  # Strategist output
│   └── test-strategy.md
├── tests/                     # Generator output
│   ├── L1-unit/
│   ├── L2-integration/
│   └── L3-e2e/
├── results/                   # Executor output
│   ├── run-001.json
│   └── coverage-001.json
└── analysis/                  # Analyst output
    └── quality-report.md
```

93 .codex/skills/team-testing/specs/team-config.json Normal file
@@ -0,0 +1,93 @@
{
  "team_name": "team-testing",
  "team_display_name": "Team Testing",
  "description": "Testing team with Generator-Critic loop, shared defect memory, and progressive test layers",
  "version": "1.0.0",

  "roles": {
    "coordinator": {
      "task_prefix": null,
      "responsibility": "Change scope analysis, layer selection, quality gating",
      "message_types": ["pipeline_selected", "gc_loop_trigger", "quality_gate", "task_unblocked", "error", "shutdown"]
    },
    "strategist": {
      "task_prefix": "STRATEGY",
      "responsibility": "Analyze git diff, determine test layers, define coverage targets",
      "message_types": ["strategy_ready", "error"]
    },
    "generator": {
      "task_prefix": "TESTGEN",
      "responsibility": "Generate test cases by layer (unit/integration/E2E)",
      "message_types": ["tests_generated", "tests_revised", "error"]
    },
    "executor": {
      "task_prefix": "TESTRUN",
      "responsibility": "Execute tests, collect coverage, auto-fix failures",
      "message_types": ["tests_passed", "tests_failed", "coverage_report", "error"]
    },
    "analyst": {
      "task_prefix": "TESTANA",
      "responsibility": "Defect pattern analysis, coverage gap analysis, quality report",
      "message_types": ["analysis_ready", "error"]
    }
  },

  "pipelines": {
    "targeted": {
      "description": "Small scope: strategy → generate L1 → run",
      "task_chain": ["STRATEGY-001", "TESTGEN-001", "TESTRUN-001"],
      "gc_loops": 0
    },
    "standard": {
      "description": "Progressive: L1 → L2 with analysis",
      "task_chain": ["STRATEGY-001", "TESTGEN-001", "TESTRUN-001", "TESTGEN-002", "TESTRUN-002", "TESTANA-001"],
      "gc_loops": 1
    },
    "comprehensive": {
      "description": "Full coverage: parallel L1+L2, then L3 with analysis",
      "task_chain": ["STRATEGY-001", "TESTGEN-001", "TESTGEN-002", "TESTRUN-001", "TESTRUN-002", "TESTGEN-003", "TESTRUN-003", "TESTANA-001"],
      "gc_loops": 2,
      "parallel_groups": [["TESTGEN-001", "TESTGEN-002"], ["TESTRUN-001", "TESTRUN-002"]]
    }
  },

  "innovation_patterns": {
    "generator_critic": {
      "generator": "generator",
      "critic": "executor",
      "max_rounds": 3,
      "convergence_trigger": "coverage >= target && pass_rate >= 0.95"
    },
    "shared_memory": {
      "file": "shared-memory.json",
      "fields": {
        "strategist": "test_strategy",
        "generator": "generated_tests",
        "executor": "execution_results",
        "analyst": "analysis_report"
      },
      "persistent_fields": ["defect_patterns", "effective_test_patterns", "coverage_history"]
    },
    "dynamic_pipeline": {
      "selector": "coordinator",
      "criteria": "changed_file_count + module_count + change_type"
    }
  },

  "test_layers": {
    "L1": { "name": "Unit Tests", "coverage_target": 80, "description": "Function-level isolation tests" },
    "L2": { "name": "Integration Tests", "coverage_target": 60, "description": "Module interaction tests" },
    "L3": { "name": "E2E Tests", "coverage_target": 40, "description": "User scenario end-to-end tests" }
  },

  "collaboration_patterns": ["CP-1", "CP-3", "CP-5"],

  "session_dirs": {
    "base": ".workflow/.team/TST-{slug}-{YYYY-MM-DD}/",
    "strategy": "strategy/",
    "tests": "tests/",
    "results": "results/",
    "analysis": "analysis/",
    "messages": ".workflow/.team-msg/{team-name}/"
  }
}