mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-26 19:56:37 +08:00
feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture
- Delete 21 old team skill directories using the CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate) to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
95
.codex/skills/team-testing/roles/analyst/role.md
Normal file
@@ -0,0 +1,95 @@
---
role: analyst
prefix: TESTANA
inner_loop: false
message_types:
  success: analysis_ready
  error: error
---

# Test Quality Analyst

Analyze defect patterns, identify coverage gaps, assess GC loop effectiveness, and generate a quality report with actionable recommendations.

## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| Execution results | <session>/results/run-*.json | Yes |
| Test strategy | <session>/strategy/test-strategy.md | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |

1. Extract session path from task description
2. Read .msg/meta.json for execution context (executor, generator namespaces)
3. Read all execution results:

```
Glob("<session>/results/run-*.json")
Read("<session>/results/run-001.json")
```

4. Read test strategy:

```
Read("<session>/strategy/test-strategy.md")
```

5. Read test files for pattern analysis:

```
Glob("<session>/tests/**/*")
```

## Phase 3: Quality Analysis

**Analysis dimensions**:

1. **Coverage Analysis** -- Aggregate coverage by layer:

| Layer | Coverage | Target | Status |
|-------|----------|--------|--------|
| L1 | X% | Y% | Met/Below |

2. **Defect Pattern Analysis** -- Frequency and severity:

| Pattern | Frequency | Severity |
|---------|-----------|----------|
| pattern | count | HIGH (>=3) / MEDIUM (>=2) / LOW (<2) |

3. **GC Loop Effectiveness**:

| Metric | Value | Assessment |
|--------|-------|------------|
| Rounds | N | - |
| Coverage Improvement | +/-X% | HIGH (>10%) / MEDIUM (>5%) / LOW (<=5%) |

4. **Coverage Gaps** -- per module/feature:
   - Area, Current %, Gap %, Reason, Recommendation

5. **Quality Score**:

| Dimension | Score (1-10) | Weight |
|-----------|--------------|--------|
| Coverage Achievement | score | 30% |
| Test Effectiveness | score | 25% |
| Defect Detection | score | 25% |
| GC Loop Efficiency | score | 20% |

Write report to `<session>/analysis/quality-report.md`
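The weighted Quality Score table above can be sketched as a small helper. This is an illustrative sketch only; `qualityScore`, `WEIGHTS`, and the dimension keys are hypothetical names, not part of the skill spec.

```javascript
// Weights mirror the Quality Score table (30/25/25/20).
const WEIGHTS = {
  coverage_achievement: 0.30,
  test_effectiveness: 0.25,
  defect_detection: 0.25,
  gc_loop_efficiency: 0.20,
};

// scores: { dimension: 1..10 } -> weighted total on the same 1-10 scale.
function qualityScore(scores) {
  return Object.entries(WEIGHTS)
    .reduce((sum, [dim, w]) => sum + (scores[dim] ?? 0) * w, 0);
}
```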
## Phase 4: Trend Analysis & State Update

**Historical comparison** (if multiple sessions exist):

```
Glob(".workflow/.team/TST-*/.msg/meta.json")
```

- Track coverage trends over time
- Identify defect pattern evolution
- Compare GC loop effectiveness across sessions

Update `<session>/wisdom/.msg/meta.json` under `analyst` namespace:
- Merge `{ "analyst": { quality_score, coverage_gaps, top_defect_patterns, gc_effectiveness, recommendations } }`
70
.codex/skills/team-testing/roles/coordinator/commands/analyze.md
Normal file
@@ -0,0 +1,70 @@
# Analyze Task

Parse user task -> detect testing capabilities -> select pipeline -> design roles.

**CONSTRAINT**: Text-level analysis only. NO source code reading, NO codebase exploration.

## Signal Detection

| Keywords | Capability | Prefix |
|----------|------------|--------|
| strategy, plan, layers, scope | strategist | STRATEGY |
| generate tests, write tests, create tests | generator | TESTGEN |
| run tests, execute, coverage | executor | TESTRUN |
| analyze, report, quality, defects | analyst | TESTANA |

## Pipeline Mode Detection

| Condition | Pipeline |
|-----------|----------|
| fileCount <= 3 AND moduleCount <= 1 | targeted |
| fileCount <= 10 AND moduleCount <= 3 | standard |
| Otherwise | comprehensive |
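The mode table above can be sketched as a small selector. This is an illustrative sketch; `selectPipeline` and its parameter names are hypothetical, not part of the skill API.

```javascript
// fileCount / moduleCount come from the text-level change-scope analysis.
function selectPipeline(fileCount, moduleCount) {
  if (fileCount <= 3 && moduleCount <= 1) return 'targeted';
  if (fileCount <= 10 && moduleCount <= 3) return 'standard';
  return 'comprehensive';
}
```

Note the conditions are checked top-down, so a small file count with many modules still escalates to a broader pipeline.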
## Dependency Graph

Natural ordering for the testing pipeline:
- Tier 0: strategist (change analysis, no upstream dependency)
- Tier 1: generator (requires strategy)
- Tier 2: executor (requires generated tests; GC loop with generator)
- Tier 3: analyst (requires execution results)

## Pipeline Definitions

```
Targeted:      STRATEGY -> TESTGEN(L1) -> TESTRUN(L1)
Standard:      STRATEGY -> TESTGEN(L1) -> TESTRUN(L1) -> TESTGEN(L2) -> TESTRUN(L2) -> TESTANA
Comprehensive: STRATEGY -> [TESTGEN(L1) || TESTGEN(L2)] -> [TESTRUN(L1) || TESTRUN(L2)] -> TESTGEN(L3) -> TESTRUN(L3) -> TESTANA
```

## Complexity Scoring

| Factor | Points |
|--------|--------|
| Per test layer | +1 |
| Parallel tracks | +1 per track |
| GC loop enabled | +1 |
| Serial depth > 3 | +1 |

Score: 1-2 = Low, 3-5 = Medium, 6+ = High
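The scoring table above can be sketched as a function. This is an illustrative sketch; `complexityScore` and its field names are hypothetical.

```javascript
function complexityScore({ layers, parallelTracks, gcLoop, serialDepth }) {
  let score = layers;              // +1 per test layer
  score += parallelTracks;         // +1 per parallel track
  if (gcLoop) score += 1;          // GC loop enabled
  if (serialDepth > 3) score += 1; // deep serial chain
  const level = score <= 2 ? 'Low' : score <= 5 ? 'Medium' : 'High';
  return { score, level };
}
```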
## Role Minimization

- Cap at 5 roles (coordinator + 4 workers)
- GC loop: generator <-> executor iterate up to 3 rounds per layer

## Output

Write `<session>/task-analysis.json`:
```json
{
  "task_description": "<original>",
  "pipeline_mode": "<targeted|standard|comprehensive>",
  "capabilities": [{ "name": "<cap>", "prefix": "<PREFIX>", "keywords": ["..."] }],
  "dependency_graph": { "<TASK-ID>": { "role": "<role>", "blockedBy": ["..."], "layer": "L1|L2|L3" } },
  "roles": [{ "name": "<role>", "prefix": "<PREFIX>", "inner_loop": true }],
  "complexity": { "score": 0, "level": "Low|Medium|High" },
  "coverage_targets": { "L1": 80, "L2": 60, "L3": 40 },
  "gc_loop_enabled": true
}
```
106
.codex/skills/team-testing/roles/coordinator/commands/dispatch.md
Normal file
@@ -0,0 +1,106 @@
# Dispatch Tasks

Create testing task chains with correct dependencies. Supports targeted, standard, and comprehensive pipelines.

## Workflow

1. Read task-analysis.json -> extract pipeline_mode and dependency_graph
2. Read specs/pipelines.md -> get task registry for the selected pipeline
3. Topologically sort tasks (respect deps)
4. Validate all owners exist in the role registry (SKILL.md)
5. For each task (in order):
   - Add task entry to tasks.json `tasks` object (see template below)
   - Set deps array with upstream task IDs
6. Update tasks.json metadata: total count
7. Validate chain (no orphans, no cycles, all refs valid)
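Steps 3 and 7 above (topological ordering plus cycle detection) can be sketched with Kahn's algorithm. This is an illustrative sketch; `topoSort` is a hypothetical helper, and the input mirrors the tasks.json shape of `{ id: { deps: [...] } }`.

```javascript
function topoSort(tasks) {
  const ids = Object.keys(tasks);
  // In-degree = number of unmet upstream dependencies per task.
  const indegree = Object.fromEntries(ids.map(id => [id, tasks[id].deps.length]));
  const queue = ids.filter(id => indegree[id] === 0);
  const order = [];
  while (queue.length) {
    const id = queue.shift();
    order.push(id);
    // Release tasks whose last dependency just resolved.
    for (const other of ids) {
      if (tasks[other].deps.includes(id) && --indegree[other] === 0) queue.push(other);
    }
  }
  // If anything remains unordered, the deps graph contains a cycle.
  if (order.length !== ids.length) throw new Error('dependency cycle detected');
  return order;
}
```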
## Task Entry Template

Each task in the tasks.json `tasks` object:
```json
{
  "<TASK-ID>": {
    "title": "<concise title>",
    "description": "PURPOSE: <goal> | Success: <criteria>\nTASK:\n - <step 1>\n - <step 2>\nCONTEXT:\n - Session: <session-folder>\n - Scope: <scope>\n - Layer: <L1-unit|L2-integration|L3-e2e>\n - Upstream artifacts: <artifact-1>, <artifact-2>\n - Shared memory: <session>/wisdom/.msg/meta.json\nEXPECTED: <deliverable path> + <quality criteria>\nCONSTRAINTS: <scope limits, focus areas>\n---\nInnerLoop: <true|false>\nRoleSpec: <project>/.codex/skills/team-testing/roles/<role>/role.md",
    "role": "<role-name>",
    "prefix": "<PREFIX>",
    "deps": ["<upstream-task-id>"],
    "status": "pending",
    "findings": null,
    "error": null
  }
}
```

## Pipeline Task Registry

### Targeted Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTRUN-001 (executor): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
```

### Standard Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTRUN-001 (executor): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
TESTGEN-002 (generator): Generate L2 integration tests
  deps: [TESTRUN-001], meta: layer=L2-integration
TESTRUN-002 (executor): Execute L2 tests, collect coverage
  deps: [TESTGEN-002], inner_loop: true, meta: layer=L2-integration, coverage_target=60%
TESTANA-001 (analyst): Defect pattern analysis, quality report
  deps: [TESTRUN-002]
```

### Comprehensive Pipeline
```
STRATEGY-001 (strategist): Analyze change scope, define test strategy
  deps: []
TESTGEN-001 (generator-1): Generate L1 unit tests
  deps: [STRATEGY-001], meta: layer=L1-unit
TESTGEN-002 (generator-2): Generate L2 integration tests
  deps: [STRATEGY-001], meta: layer=L2-integration
TESTRUN-001 (executor-1): Execute L1 tests, collect coverage
  deps: [TESTGEN-001], inner_loop: true, meta: layer=L1-unit, coverage_target=80%
TESTRUN-002 (executor-2): Execute L2 tests, collect coverage
  deps: [TESTGEN-002], inner_loop: true, meta: layer=L2-integration, coverage_target=60%
TESTGEN-003 (generator): Generate L3 E2E tests
  deps: [TESTRUN-001, TESTRUN-002], meta: layer=L3-e2e
TESTRUN-003 (executor): Execute L3 tests, collect coverage
  deps: [TESTGEN-003], inner_loop: true, meta: layer=L3-e2e, coverage_target=40%
TESTANA-001 (analyst): Defect pattern analysis, quality report
  deps: [TESTRUN-003]
```

## InnerLoop Flag Rules

- true: generator, executor roles (GC loop iterations)
- false: strategist, analyst roles

## Dependency Validation

- No orphan tasks (all tasks have a valid owner)
- No circular dependencies
- All deps references exist in the tasks object
- Session reference in every task description
- RoleSpec reference in every task description
## Log After Creation

```
mcp__ccw-tools__team_msg({
  operation: "log",
  session_id: <session-id>,
  from: "coordinator",
  type: "pipeline_selected",
  data: { pipeline: "<mode>", task_count: <N> }
})
```
242
.codex/skills/team-testing/roles/coordinator/commands/monitor.md
Normal file
@@ -0,0 +1,242 @@
# Monitor Pipeline

Synchronous pipeline coordination using spawn_agent + wait_agent.

## Constants

- WORKER_AGENT: team_worker
- ONE_STEP_PER_INVOCATION: false (synchronous wait loop)
- FAST_ADVANCE_AWARE: true
- MAX_GC_ROUNDS: 3

## Handler Router

| Source | Handler |
|--------|---------|
| "capability_gap" | handleAdapt |
| "check" or "status" | handleCheck |
| "resume" or "continue" | handleResume |
| All tasks completed | handleComplete |
| Default | handleSpawnNext |

## Role-Worker Map

| Prefix | Role | Role Spec | inner_loop |
|--------|------|-----------|------------|
| STRATEGY-* | strategist | `<project>/.codex/skills/team-testing/roles/strategist/role.md` | false |
| TESTGEN-* | generator | `<project>/.codex/skills/team-testing/roles/generator/role.md` | true |
| TESTRUN-* | executor | `<project>/.codex/skills/team-testing/roles/executor/role.md` | true |
| TESTANA-* | analyst | `<project>/.codex/skills/team-testing/roles/analyst/role.md` | false |

## handleCheck

Read-only status report from tasks.json, then STOP.

1. Read tasks.json
2. Count tasks by status (pending, in_progress, completed, failed)

Output:
```
[coordinator] Testing Pipeline Status
[coordinator] Mode: <pipeline_mode>
[coordinator] Progress: <done>/<total> (<pct>%)
[coordinator] GC Rounds: L1: <n>/3, L2: <n>/3

[coordinator] Pipeline Graph:
  STRATEGY-001: <done|run|wait> test-strategy.md
  TESTGEN-001:  <done|run|wait> generating L1...
  TESTRUN-001:  <done|run|wait> blocked by TESTGEN-001
  TESTGEN-002:  <done|run|wait> blocked by TESTRUN-001
  TESTRUN-002:  <done|run|wait> blocked by TESTGEN-002
  TESTANA-001:  <done|run|wait> blocked by TESTRUN-*

[coordinator] Active agents: <list with elapsed time>
[coordinator] Ready: <pending tasks with resolved deps>
[coordinator] Commands: 'resume' to advance | 'check' to refresh
```

Then STOP.
## handleResume

1. Read tasks.json, check active_agents
2. No active agents -> handleSpawnNext
3. Has active agents -> check each status:
   - completed -> mark done
   - in_progress -> still running
4. Some completed -> handleSpawnNext
5. All running -> report status, STOP

## handleSpawnNext

Find ready tasks, spawn workers, wait for completion, process results.

1. Read tasks.json
2. Collect:
   - completedTasks: status = completed
   - inProgressTasks: status = in_progress
   - readyTasks: status = pending AND all deps in completedTasks
3. No ready + work in progress -> report waiting, STOP
4. No ready + nothing in progress -> handleComplete
5. Has ready -> for each ready task:
   a. Determine role from prefix (use Role-Worker Map)
   b. If an inner-loop role (generator/executor) already has an active worker -> skip (the worker picks up the next task)
   c. Update task status in tasks.json -> in_progress
   d. team_msg log -> task_unblocked
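The readyTasks collection in step 2 can be sketched as a filter over the tasks.json shape. This is an illustrative sketch; `readyTasks` is a hypothetical helper name.

```javascript
// tasks: { id: { status, deps: [...] } } -> ids that are pending
// with every dependency already completed.
function readyTasks(tasks) {
  const completed = new Set(
    Object.keys(tasks).filter(id => tasks[id].status === 'completed'));
  return Object.keys(tasks).filter(id =>
    tasks[id].status === 'pending' &&
    tasks[id].deps.every(dep => completed.has(dep)));
}
```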
### Spawn Workers

For each ready task:

```javascript
// 1) Update status in tasks.json
state.tasks[taskId].status = 'in_progress'

// 2) Spawn worker
const agentId = spawn_agent({
  agent_type: "team_worker",
  items: [
    { type: "text", text: `## Role Assignment
role: ${task.role}
role_spec: ${skillRoot}/roles/${task.role}/role.md
session: ${sessionFolder}
session_id: ${sessionId}
team_name: testing
requirement: ${task.description}
inner_loop: ${task.role === 'generator' || task.role === 'executor'}

## Current Task
- Task ID: ${taskId}
- Task: ${task.title}` },

    { type: "text", text: `Read role_spec file (${skillRoot}/roles/${task.role}/role.md) to load Phase 2-4 domain instructions.
Execute built-in Phase 1 (task discovery) -> role Phase 2-4 -> built-in Phase 5 (report).` },

    { type: "text", text: `## Task Context
task_id: ${taskId}
title: ${task.title}
description: ${task.description}` },

    { type: "text", text: `## Upstream Context\n${prevContext}` }
  ]
})

// 3) Track agent
state.active_agents[taskId] = { agentId, role: task.role, started_at: now }
```
6. **Parallel spawn** (comprehensive pipeline):
   - TESTGEN-001 + TESTGEN-002 both unblocked -> spawn both in parallel (name: "generator-1", "generator-2")
   - TESTRUN-001 + TESTRUN-002 both unblocked -> spawn both in parallel (name: "executor-1", "executor-2")

### Wait and Process Results

After spawning all ready tasks:

```javascript
// 4) Batch wait for all spawned workers
const agentIds = Object.values(state.active_agents).map(a => a.agentId)
wait_agent({ ids: agentIds, timeout_ms: 900000 })

// 5) Collect results
for (const [taskId, agent] of Object.entries(state.active_agents)) {
  state.tasks[taskId].status = 'completed'
  close_agent({ id: agent.agentId })
  delete state.active_agents[taskId]
}
```
### GC Checkpoint (TESTRUN-* completes)

After TESTRUN-* completion, read meta.json for executor.pass_rate and executor.coverage:
- (pass_rate >= 0.95 AND coverage >= target) OR gc_rounds[layer] >= MAX_GC_ROUNDS -> proceed
- (pass_rate < 0.95 OR coverage < target) AND gc_rounds[layer] < MAX_GC_ROUNDS -> create GC fix tasks, increment gc_rounds[layer]
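The checkpoint rule above can be sketched as a pure decision function. This is an illustrative sketch; `gcDecision` and its field names are hypothetical.

```javascript
// round = gc_rounds[layer] so far; target = layer coverage target.
function gcDecision({ passRate, coverage, target, round, maxRounds = 3 }) {
  if ((passRate >= 0.95 && coverage >= target) || round >= maxRounds) {
    return 'proceed';
  }
  // Otherwise: create GC fix tasks and increment gc_rounds[layer].
  return 'gc_fix';
}
```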
**GC Fix Task Creation** (when coverage is below target):

Add to tasks.json:
```json
{
  "TESTGEN-<layer>-fix-<round>": {
    "title": "Revise <layer> tests (GC #<round>)",
    "description": "PURPOSE: Revise tests to fix failures and improve coverage | Success: pass_rate >= 0.95 AND coverage >= target\nTASK:\n - Read previous test results and failure details\n - Revise tests to address failures\n - Improve coverage for uncovered areas\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Previous results: <session>/results/run-<N>.json\nEXPECTED: Revised test files in <session>/tests/<layer>/\nCONSTRAINTS: Only modify test files\n---\nInnerLoop: true\nRoleSpec: <project>/.codex/skills/team-testing/roles/generator/role.md",
    "role": "generator",
    "prefix": "TESTGEN",
    "deps": [],
    "status": "pending",
    "findings": null,
    "error": null
  },
  "TESTRUN-<layer>-fix-<round>": {
    "title": "Re-execute <layer> (GC #<round>)",
    "description": "PURPOSE: Re-execute tests after revision | Success: pass_rate >= 0.95\nCONTEXT:\n - Session: <session-folder>\n - Layer: <layer>\n - Input: tests/<layer>\nEXPECTED: <session>/results/run-<N>-gc.json\n---\nInnerLoop: true\nRoleSpec: <project>/.codex/skills/team-testing/roles/executor/role.md",
    "role": "executor",
    "prefix": "TESTRUN",
    "deps": ["TESTGEN-<layer>-fix-<round>"],
    "status": "pending",
    "findings": null,
    "error": null
  }
}
```
Then increment `gc_rounds[<layer>]` in tasks.json.
### Persist and Loop

After processing all results:
1. Write updated tasks.json
2. Check if more tasks are now ready (deps newly resolved)
3. If yes -> loop back to step 1 of handleSpawnNext
4. If no more ready and all done -> handleComplete
5. If no more ready but some still blocked -> report status, STOP

## handleComplete

Pipeline done. Generate report and completion action.

1. Verify all tasks (including any GC fix tasks) have status "completed" or "failed"
2. If any tasks incomplete -> return to handleSpawnNext
3. If all complete:
   - Read final state from meta.json (analyst.quality_score, executor.coverage, gc_rounds)
   - Generate summary (deliverables, task count, GC rounds, coverage metrics)
4. Execute completion action per tasks.json completion_action:
   - interactive -> request_user_input (Archive/Keep/Deepen Coverage)
   - auto_archive -> Archive & Clean (rm -rf session folder)
   - auto_keep -> Keep Active (status=paused)

## handleAdapt

Capability gap reported mid-pipeline.

1. Parse gap description
2. Check if an existing role covers it -> redirect
3. Role count < 5 -> generate dynamic role-spec in <session>/role-specs/
4. Add new task to tasks.json, spawn worker via spawn_agent + wait_agent
5. Role count >= 5 -> merge or pause

## Fast-Advance Reconciliation

On every coordinator wake:
1. Read team_msg entries with type="fast_advance"
2. Sync active_agents with spawned successors
3. No duplicate spawns

## Phase 4: State Persistence

After every handler execution:
1. Reconcile active_agents with actual tasks.json states
2. Remove entries for completed/failed tasks
3. Write updated tasks.json
4. STOP (wait for next invocation)

## Error Handling

| Scenario | Resolution |
|----------|------------|
| Session file not found | Error, suggest re-initialization |
| Unknown role in callback | Log info, scan for other completions |
| GC loop exceeded (3 rounds) | Accept current coverage with warning, proceed |
| Pipeline stall | Check deps chains, report to user |
| Coverage tool unavailable | Degrade to pass-rate judgment |
| Worker crash | Reset task to pending in tasks.json, respawn via spawn_agent |
151
.codex/skills/team-testing/roles/coordinator/role.md
Normal file
@@ -0,0 +1,151 @@
# Coordinator Role

Orchestrate team-testing: analyze -> dispatch -> spawn -> monitor -> report.

## Identity

- Name: coordinator | Tag: [coordinator]
- Responsibility: Change scope analysis -> Create session -> Dispatch tasks -> Monitor progress -> Report results

## Boundaries

### MUST
- Spawn workers via `spawn_agent({ agent_type: "team_worker" })` and wait via `wait_agent`
- Follow the Command Execution Protocol for dispatch and monitor commands
- Respect pipeline stage dependencies (deps)
- Handle Generator-Critic cycles with max 3 iterations per layer
- Execute the completion action in Phase 5

### MUST NOT
- Implement domain logic (test generation, execution, analysis) -- workers handle this
- Spawn workers without creating tasks first
- Skip quality gates when coverage is below target
- Modify test files or source code directly -- delegate to workers
- Force-advance the pipeline past failed GC loops

## Command Execution Protocol

When the coordinator needs to execute a specific phase:
1. Read `commands/<command>.md`
2. Follow the workflow defined in the command
3. Commands are inline execution guides, NOT separate agents
4. Execute synchronously; complete before proceeding

## Entry Router

| Detection | Condition | Handler |
|-----------|-----------|---------|
| Status check | Args contain "check" or "status" | -> handleCheck (monitor.md) |
| Manual resume | Args contain "resume" or "continue" | -> handleResume (monitor.md) |
| Capability gap | Message contains "capability_gap" | -> handleAdapt (monitor.md) |
| Pipeline complete | All tasks completed | -> handleComplete (monitor.md) |
| Interrupted session | Active session in .workflow/.team/TST-* | -> Phase 0 |
| New session | None of above | -> Phase 1 |

For check/resume/adapt/complete: load @commands/monitor.md, execute handler, STOP.

## Phase 0: Session Resume Check

1. Scan .workflow/.team/TST-*/tasks.json for active/paused sessions
2. No sessions -> Phase 1
3. Single session -> reconcile:
   a. Read tasks.json, reset in_progress -> pending
   b. Rebuild active_agents map
   c. Kick first ready task via handleSpawnNext
4. Multiple -> request_user_input for selection

## Phase 1: Requirement Clarification

TEXT-LEVEL ONLY. No source code reading.

1. Parse task description from $ARGUMENTS
2. Analyze change scope:
```
Bash("git diff --name-only HEAD~1 2>/dev/null || git diff --name-only --cached")
```
3. Select pipeline:

| Condition | Pipeline |
|-----------|----------|
| fileCount <= 3 AND moduleCount <= 1 | targeted |
| fileCount <= 10 AND moduleCount <= 3 | standard |
| Otherwise | comprehensive |

4. Clarify if ambiguous (request_user_input for scope)
5. Delegate to @commands/analyze.md
6. Output: task-analysis.json
7. CRITICAL: Always proceed to Phase 2; never skip the team workflow
## Phase 2: Create Session + Initialize

1. Resolve workspace paths (MUST do first):
   - `project_root` = result of `Bash({ command: "pwd" })`
   - `skill_root` = `<project_root>/.codex/skills/team-testing`
2. Generate session ID: TST-<slug>-<date>
3. Create session folder structure:
```bash
mkdir -p .workflow/.team/${SESSION_ID}/{strategy,tests/L1-unit,tests/L2-integration,tests/L3-e2e,results,analysis,wisdom,wisdom/.msg}
```
4. Read specs/pipelines.md -> select pipeline based on mode
5. Initialize pipeline via team_msg state_update:
```
mcp__ccw-tools__team_msg({
  operation: "log", session_id: "<id>", from: "coordinator",
  type: "state_update", summary: "Session initialized",
  data: {
    pipeline_mode: "<targeted|standard|comprehensive>",
    pipeline_stages: ["strategist", "generator", "executor", "analyst"],
    team_name: "testing",
    coverage_targets: { "L1": 80, "L2": 60, "L3": 40 },
    gc_rounds: {}
  }
})
```
6. Write initial tasks.json:
```json
{
  "session_id": "<id>",
  "pipeline": "<targeted|standard|comprehensive>",
  "requirement": "<original requirement>",
  "created_at": "<ISO timestamp>",
  "coverage_targets": { "L1": 80, "L2": 60, "L3": 40 },
  "gc_rounds": {},
  "completed_waves": [],
  "active_agents": {},
  "tasks": {}
}
```

## Phase 3: Create Task Chain

Delegate to @commands/dispatch.md:
1. Read specs/pipelines.md for the selected pipeline's task registry
2. Topologically sort tasks
3. Write tasks to tasks.json with deps arrays
4. Update tasks.json metadata

## Phase 4: Spawn-and-Wait

Delegate to @commands/monitor.md#handleSpawnNext:
1. Find ready tasks (pending + deps resolved)
2. Spawn team_worker agents via spawn_agent
3. Wait for completion via wait_agent
4. Process results, advance pipeline
5. Repeat until all waves complete or the pipeline is blocked

## Phase 5: Report + Completion Action

1. Generate summary (deliverables, pipeline stats, GC rounds, coverage metrics)
2. Execute completion action per tasks.json completion_action:
   - interactive -> request_user_input (Archive/Keep/Deepen Coverage)
   - auto_archive -> Archive & Clean (rm -rf session folder)
   - auto_keep -> Keep Active

## Error Handling

| Error | Resolution |
|-------|------------|
| Task too vague | request_user_input for clarification |
| Session corruption | Attempt recovery, fall back to manual |
| Worker crash | Reset task to pending in tasks.json, respawn via spawn_agent |
| Dependency cycle | Detect in analysis, halt |
| GC loop exceeded (3 rounds) | Accept current coverage, log to wisdom, proceed |
| Coverage tool unavailable | Degrade to pass-rate judgment |
96
.codex/skills/team-testing/roles/executor/role.md
Normal file
@@ -0,0 +1,96 @@
---
role: executor
prefix: TESTRUN
inner_loop: true
message_types:
  success: tests_passed
  failure: tests_failed
  coverage: coverage_report
  error: error
---

# Test Executor

Execute tests, collect coverage, and attempt auto-fix for failures. Acts as the Critic in the Generator-Critic loop. Reports pass rate and coverage for coordinator GC decisions.

## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| Test directory | Task description (Input: <path>) | Yes |
| Coverage target | Task description (default: 80%) | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |

1. Extract session path and test directory from task description
2. Load test specs: run `ccw spec load --category test` for test framework conventions and coverage targets
3. Extract coverage target (default: 80%)
4. Read .msg/meta.json for framework info (from strategist namespace)
5. Determine test framework:

| Framework | Run Command |
|-----------|-------------|
| Jest | `npx jest --coverage --json --outputFile=<session>/results/jest-output.json` |
| Pytest | `python -m pytest --cov --cov-report=json:<session>/results/coverage.json -v` |
| Vitest | `npx vitest run --coverage --reporter=json` |

6. Find test files to execute:

```
Glob("<session>/<test-dir>/**/*")
```
## Phase 3: Test Execution + Fix Cycle

**Iterative test-fix cycle** (max 3 iterations):

| Step | Action |
|------|--------|
| 1 | Run test command |
| 2 | Parse results: pass rate + coverage |
| 3 | pass_rate >= 0.95 AND coverage >= target -> success, exit |
| 4 | Extract failing test details |
| 5 | Delegate fix to CLI tool (gemini write mode) |
| 6 | Increment iteration; >= 3 -> exit with failures |

```
Bash("<test-command> 2>&1 || true")
```

**Auto-fix delegation** (on failure):

```
Bash(`ccw cli -p "PURPOSE: Fix test failures to achieve pass rate >= 0.95; success = all tests pass
TASK: • Analyze test failure output • Identify root causes • Fix test code only (not source) • Preserve test intent
MODE: write
CONTEXT: @<session>/<test-dir>/**/* | Memory: Test framework: <framework>, iteration <N>/3
EXPECTED: Fixed test files with: corrected assertions, proper async handling, fixed imports, maintained coverage
CONSTRAINTS: Only modify test files | Preserve test structure | No source code changes
Test failures:
<test-output>" --tool gemini --mode write --cd <session>`)
```

**Save results**: `<session>/results/run-<N>.json`
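Step 2 of the cycle (parse results into pass rate + coverage) can be sketched as follows. This is an illustrative sketch: the summary field names follow Jest's `--json` output (`numTotalTests`, `numPassedTests`), and coverage is assumed to be pre-aggregated into a single percentage before this call.

```javascript
// summary: parsed test-runner JSON; coveragePct: aggregated coverage %;
// target: layer coverage target from the task description.
function parseRunResult(summary, coveragePct, target) {
  const passRate = summary.numTotalTests === 0
    ? 0
    : summary.numPassedTests / summary.numTotalTests;
  return {
    pass_rate: passRate,
    coverage: coveragePct,
    // Mirrors the step-3 exit condition in the table above.
    success: passRate >= 0.95 && coveragePct >= target,
  };
}
```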
## Phase 4: Defect Pattern Extraction & State Update

**Extract defect patterns from failures**:

| Pattern Type | Detection Keywords |
|--------------|-------------------|
| Null reference | "null", "undefined", "Cannot read property" |
| Async timing | "timeout", "async", "await", "promise" |
| Import errors | "Cannot find module", "import" |
| Type mismatches | "type", "expected", "received" |
**Record effective test patterns** (if pass_rate > 0.8):
|
||||
|
||||
| Pattern | Detection |
|
||||
|---------|-----------|
|
||||
| Happy path | "should succeed", "valid input" |
|
||||
| Edge cases | "edge", "boundary", "limit" |
|
||||
| Error handling | "should fail", "error", "throw" |
|
||||
|
||||
Update `<session>/wisdom/.msg/meta.json` under `executor` namespace:
|
||||
- Merge `{ "executor": { pass_rate, coverage, defect_patterns, effective_patterns, coverage_history_entry } }`
|
||||
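The keyword tables above amount to a first-match classifier over failure messages. A minimal sketch, assuming case-insensitive substring matching and the keyword lists from the table (neither exhaustive nor authoritative):

```python
# Keyword lists mirror the defect-pattern table; first matching pattern wins.
DEFECT_PATTERNS = {
    "null_reference": ["null", "undefined", "Cannot read property"],
    "async_timing": ["timeout", "async", "await", "promise"],
    "import_error": ["Cannot find module", "import"],
    "type_mismatch": ["type", "expected", "received"],
}

def classify_failure(message: str) -> str:
    """Return the first defect pattern whose keywords appear in the message."""
    lowered = message.lower()
    for pattern, keywords in DEFECT_PATTERNS.items():
        if any(k.lower() in lowered for k in keywords):
            return pattern
    return "unclassified"
```

Ordering matters: a `TypeError` mentioning `undefined` is tagged as a null reference before the type-mismatch keywords are ever consulted.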
95
.codex/skills/team-testing/roles/generator/role.md
Normal file
@@ -0,0 +1,95 @@
---
role: generator
prefix: TESTGEN
inner_loop: true
message_types:
  success: tests_generated
  revision: tests_revised
  error: error
---

# Test Generator

Generate test code by layer (L1 unit / L2 integration / L3 E2E). Acts as the Generator in the Generator-Critic loop. Supports revision mode for GC loop iterations.
## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| Test strategy | <session>/strategy/test-strategy.md | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |

1. Extract session path and layer from task description
2. Load test specs: Run `ccw spec load --category test` for test framework conventions and coverage targets
3. Read test strategy:

```
Read("<session>/strategy/test-strategy.md")
```

4. Read source files to test (from strategy priority_files, limit 20)
5. Read .msg/meta.json for framework and scope context
6. Detect revision mode:

| Condition | Mode |
|-----------|------|
| Task subject contains "fix" or "revised" | Revision -- load previous failures |
| Otherwise | Fresh generation |

For revision mode:
- Read latest result file for failure details
- Read effective test patterns from .msg/meta.json

7. Read wisdom files if available
## Phase 3: Test Generation

**Strategy selection by complexity**:

| File Count | Strategy |
|------------|----------|
| <= 3 files | Direct: inline Write/Edit |
| 4-5 files | Single code-developer agent |
| > 5 files | Batch: group by module, one agent per batch |

**Direct generation** (per source file):
1. Generate test path: `<session>/tests/<layer>/<test-file>`
2. Generate test code: happy path, edge cases, error handling
3. Write test file
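For the > 5 files case, the batching can be sketched as grouping by top-level directory; the grouping key, batch size, and `_root` bucket for top-level files are assumptions, not spec.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def batch_by_module(files: list[str], max_batch: int = 5) -> list[list[str]]:
    """Group source files by first path segment; split oversized groups."""
    groups: dict[str, list[str]] = defaultdict(list)
    for f in files:
        parts = PurePosixPath(f).parts
        # Files at the repo root get their own bucket.
        module = parts[0] if len(parts) > 1 else "_root"
        groups[module].append(f)
    batches = []
    for module in sorted(groups):
        members = groups[module]
        for i in range(0, len(members), max_batch):
            batches.append(members[i:i + max_batch])
    return batches
```

Each resulting batch is then handed to one agent, keeping related files in the same prompt context.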
**CLI delegation** (medium/high complexity):

```
Bash(`ccw cli -p "PURPOSE: Generate <layer> tests using <framework> to achieve coverage target; success = all priority files covered with quality tests
TASK: • Analyze source files • Generate test cases (happy path, edge cases, errors) • Write test files with proper structure • Ensure import resolution
MODE: write
CONTEXT: @<source-files> @<session>/strategy/test-strategy.md | Memory: Framework: <framework>, Layer: <layer>, Round: <round>
<if-revision: Previous failures: <failure-details>
Effective patterns: <patterns-from-meta>>
EXPECTED: Test files in <session>/tests/<layer>/ with: proper test structure, comprehensive coverage, correct imports, framework conventions
CONSTRAINTS: Follow test strategy priorities | Use framework best practices | <layer>-appropriate assertions

Source files to test:
<file-list-with-content>" --tool gemini --mode write --cd <session>`)
```

**Output verification**:

```
Glob("<session>/tests/<layer>/**/*")
```
## Phase 4: Self-Validation & State Update

**Validation checks**:

| Check | Method | Action on Fail |
|-------|--------|----------------|
| Syntax | `tsc --noEmit` or equivalent | Auto-fix imports/types |
| File count | Count generated files | Report issue |
| Import resolution | Check broken imports | Fix import paths |

Update `<session>/wisdom/.msg/meta.json` under `generator` namespace:
- Merge `{ "generator": { test_files, layer, round, is_revision } }`
83
.codex/skills/team-testing/roles/strategist/role.md
Normal file
@@ -0,0 +1,83 @@
---
role: strategist
prefix: STRATEGY
inner_loop: false
message_types:
  success: strategy_ready
  error: error
---

# Test Strategist

Analyze git diff, determine test layers, define coverage targets, and formulate test strategy with prioritized execution order.
## Phase 2: Context & Environment Detection

| Input | Source | Required |
|-------|--------|----------|
| Task description | From task subject/description | Yes |
| Session path | Extracted from task description | Yes |
| .msg/meta.json | <session>/wisdom/.msg/meta.json | No |

1. Extract session path and scope from task description
2. Get git diff for change analysis:

```
Bash("git diff HEAD~1 --name-only 2>/dev/null || git diff --cached --name-only")
Bash("git diff HEAD~1 -- <changed-files> 2>/dev/null || git diff --cached -- <changed-files>")
```

3. Detect test framework from project files:

| Signal File | Framework | Test Pattern |
|-------------|-----------|--------------|
| jest.config.js/ts | Jest | `**/*.test.{ts,tsx,js}` |
| vitest.config.ts/js | Vitest | `**/*.test.{ts,tsx}` |
| pytest.ini / pyproject.toml | Pytest | `**/test_*.py` |
| No detection | Default | Jest patterns |

4. Scan existing test patterns:

```
Glob("**/*.test.*")
Glob("**/*.spec.*")
```

5. Read .msg/meta.json if it exists for session context
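The signal-file table above can be sketched as an ordered probe of the project root. The list order, function name, and the treatment of `pyproject.toml` as a pytest signal (it may exist without pytest) are assumptions:

```python
from pathlib import Path

# Ordered (signal file, framework, test glob) triples; first hit wins.
SIGNALS = [
    ("jest.config.js", "jest", "**/*.test.{ts,tsx,js}"),
    ("jest.config.ts", "jest", "**/*.test.{ts,tsx,js}"),
    ("vitest.config.ts", "vitest", "**/*.test.{ts,tsx}"),
    ("vitest.config.js", "vitest", "**/*.test.{ts,tsx}"),
    ("pytest.ini", "pytest", "**/test_*.py"),
    ("pyproject.toml", "pytest", "**/test_*.py"),
]

def detect_framework(root: str) -> tuple[str, str]:
    """Return (framework, test glob) for the project at `root`."""
    for name, framework, pattern in SIGNALS:
        if (Path(root) / name).exists():
            return framework, pattern
    # "No detection" row: fall back to Jest patterns.
    return "jest", "**/*.test.{ts,tsx,js}"
```

Because the probe is ordered, a repo containing both a Jest and a Vitest config resolves to Jest; reorder `SIGNALS` if the opposite precedence is wanted.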
## Phase 3: Strategy Formulation

**Change analysis dimensions**:

| Change Type | Analysis | Priority |
|-------------|----------|----------|
| New files | Need new tests | High |
| Modified functions | Need updated tests | Medium |
| Deleted files | Need test cleanup | Low |
| Config changes | May need integration tests | Variable |

**Strategy output structure**:

1. **Change Analysis Table**: File, Change Type, Impact, Priority
2. **Test Layer Recommendations**:
   - L1 Unit: Scope, Coverage Target, Priority Files, Patterns
   - L2 Integration: Scope, Coverage Target, Integration Points
   - L3 E2E: Scope, Coverage Target, User Scenarios
3. **Risk Assessment**: Risk, Probability, Impact, Mitigation
4. **Test Execution Order**: Prioritized sequence

Write strategy to `<session>/strategy/test-strategy.md`

**Self-validation**:

| Check | Criteria | Fallback |
|-------|----------|----------|
| Has L1 scope | L1 scope not empty | Default to all changed files |
| Has coverage targets | L1 target > 0 | Use defaults (80/60/40) |
| Has priority files | List not empty | Use all changed files |
## Phase 4: Wisdom & State Update

1. Write discoveries to `<session>/wisdom/conventions.md` (detected framework, patterns)
2. Update `<session>/wisdom/.msg/meta.json` under `strategist` namespace:
   - Read existing -> merge `{ "strategist": { framework, layers, coverage_targets, priority_files, risks } }` -> write back
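The read -> merge -> write step, which every role performs against its own namespace, can be sketched as below. The shallow per-namespace merge is an assumption about how roles share `.msg/meta.json`:

```python
import json
from pathlib import Path

def update_namespace(meta_path: str, namespace: str, data: dict) -> dict:
    """Merge `data` under `namespace` in meta.json, preserving other roles' keys."""
    path = Path(meta_path)
    meta = json.loads(path.read_text()) if path.exists() else {}
    meta.setdefault(namespace, {}).update(data)
    path.write_text(json.dumps(meta, indent=2))
    return meta
```

Because each role only touches its own top-level key, strategist, generator, and executor updates compose without clobbering one another (absent concurrent writes, which this sketch does not guard against).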