mirror of https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-11 17:21:03 +08:00

Add unit tests for various components and stores in the terminal dashboard

- Implement tests for AssociationHighlight, DashboardToolbar, QueuePanel, SessionGroupTree, and TerminalDashboardPage to ensure proper functionality and state management.
- Create tests for cliSessionStore, issueQueueIntegrationStore, queueExecutionStore, queueSchedulerStore, sessionManagerStore, and terminalGridStore to validate state resets and workspace scoping.
- Mock necessary dependencies and state management hooks to isolate tests and ensure accurate behavior.
.codex/skills/team-quality-assurance/SKILL.md (new file, 776 lines)

---
name: team-quality-assurance
description: Full closed-loop QA combining issue discovery and software testing. Scout -> Strategist -> Generator -> Executor -> Analyst with multi-perspective scanning, progressive test layers, GC loops, and quality scoring. Supports discovery, testing, and full QA modes.
argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] [--mode=discovery|testing|full] \"task description\""
allowed-tools: spawn_agents_on_csv, spawn_agent, wait, send_input, close_agent, Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
---

## Auto Mode

When `--yes` or `-y`: auto-confirm task decomposition, skip interactive validation, and use defaults.

# Team Quality Assurance

## Usage

```bash
$team-quality-assurance "Full QA for the authentication module"
$team-quality-assurance --mode=discovery "Scan codebase for security and bug issues"
$team-quality-assurance --mode=testing "Test recent changes with progressive coverage"
$team-quality-assurance -c 4 --mode=full "Complete QA cycle with regression scanning"
$team-quality-assurance -y "QA all changed files since last commit"
$team-quality-assurance --continue "qa-auth-module-20260308"
```

**Flags**:
- `-y, --yes`: Skip all confirmations (auto mode)
- `-c, --concurrency N`: Max concurrent agents within each wave (default: 3)
- `--continue`: Resume an existing session
- `--mode=discovery|testing|full`: Force a QA mode (default: auto-detect, falling back to full)

**Output Directory**: `.workflow/.csv-wave/{session-id}/`
**Core Output**: `tasks.csv` (master state) + `results.csv` (final) + `discoveries.ndjson` (shared exploration) + `context.md` (human-readable report)

---

## Overview

Orchestrates a multi-agent QA pipeline: scout -> strategist -> generator -> executor -> analyst. Three modes are supported: **discovery** (issue scanning), **testing** (progressive test coverage), and **full** (closed-loop QA with regression). Scanning is multi-perspective, covering bug, security, test-coverage, code-quality, and UX viewpoints. Layer coverage is progressive (L1/L2/L3), with Generator-Critic loops driving coverage convergence.

**Execution Model**: Hybrid -- CSV wave pipeline (primary) + individual agent spawn (secondary)

```
+-------------------------------------------------------------------+
|                 TEAM QUALITY ASSURANCE WORKFLOW                   |
+-------------------------------------------------------------------+
|                                                                   |
| Phase 0: Pre-Wave Interactive (Requirement Clarification)         |
|   +- Parse task description, detect QA mode                       |
|   +- Mode selection (discovery/testing/full)                      |
|   +- Output: refined requirements for decomposition               |
|                                                                   |
| Phase 1: Requirement -> CSV + Classification                      |
|   +- Select pipeline based on QA mode                             |
|   +- Build dependency chain with appropriate roles                |
|   +- Classify tasks: csv-wave | interactive (exec_mode)           |
|   +- Compute dependency waves (topological sort)                  |
|   +- Generate tasks.csv with wave + exec_mode columns             |
|   +- User validates task breakdown (skip if -y)                   |
|                                                                   |
| Phase 2: Wave Execution Engine (Extended)                         |
|   +- For each wave (1..N):                                        |
|   |  +- Execute pre-wave interactive tasks (if any)               |
|   |  +- Build wave CSV (filter csv-wave tasks for this wave)      |
|   |  +- Inject previous findings into prev_context column         |
|   |  +- spawn_agents_on_csv(wave CSV)                             |
|   |  +- Execute post-wave interactive tasks (if any)              |
|   |  +- Merge all results into master tasks.csv                   |
|   |  +- GC Loop Check: coverage < target? -> spawn fix tasks      |
|   |  +- Check: any failed? -> skip dependents                     |
|   +- discoveries.ndjson shared across all modes (append-only)     |
|                                                                   |
| Phase 3: Post-Wave Interactive (Completion Action)                |
|   +- Pipeline completion report with quality score                |
|   +- Interactive completion choice (Archive/Keep/Export)          |
|   +- Final aggregation / report                                   |
|                                                                   |
| Phase 4: Results Aggregation                                      |
|   +- Export final results.csv                                     |
|   +- Generate context.md with all findings                        |
|   +- Display summary: completed/failed/skipped per wave           |
|   +- Offer: view results | retry failed | done                    |
|                                                                   |
+-------------------------------------------------------------------+
```

---

## Task Classification Rules

Each task is classified by `exec_mode`:

| exec_mode | Mechanism | Criteria |
|-----------|-----------|----------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot, structured I/O, no multi-round interaction |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round, needs iterative fix-verify cycles |

**Classification Decision**:

| Task Property | Classification |
|---------------|---------------|
| Multi-perspective code scanning (scout) | `csv-wave` |
| Strategy formulation (single-pass analysis) | `csv-wave` |
| Test generation (single-pass code creation) | `csv-wave` |
| Test execution with auto-fix cycle | `interactive` |
| Quality analysis (single-pass report) | `csv-wave` |
| GC loop fix-verify iteration | `interactive` |
| Regression scanning (post-fix) | `csv-wave` |

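
In code, the decision above reduces to a role lookup, since only executor work (and the GC fix-verify rounds it triggers) needs multi-round interaction. A minimal sketch -- the `classifyTask` helper name is illustrative, not part of the skill:

```javascript
// Map a task's role to its exec_mode per the classification table.
// GC loop rounds are spawned dynamically by the orchestrator, so they
// never appear as CSV rows; only the executor role is interactive here.
function classifyTask(task) {
  const interactiveRoles = new Set(['executor']);
  return interactiveRoles.has(task.role) ? 'interactive' : 'csv-wave';
}
```
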
---

## CSV Schema

### tasks.csv (Master State)

```csv
id,title,description,role,perspective,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,issues_found,pass_rate,coverage_achieved,test_files,quality_score,error
"SCOUT-001","Multi-perspective code scan","Scan codebase from bug, security, test-coverage, code-quality perspectives. Produce severity-ranked findings with file:line references.","scout","bug;security;test-coverage;code-quality","","","","","csv-wave","1","pending","","","","","","",""
"QASTRAT-001","Test strategy formulation","Analyze scout findings and code changes. Determine test layers, define coverage targets, generate test strategy document.","strategist","","","","SCOUT-001","SCOUT-001","csv-wave","2","pending","","","","","","",""
"QAGEN-L1-001","Generate L1 unit tests","Generate L1 unit tests based on strategy. Cover priority files, include happy path, edge cases, error handling.","generator","","L1","80","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
```

**Columns**:

| Column | Phase | Description |
|--------|-------|-------------|
| `id` | Input | Unique task identifier (PREFIX-NNN format) |
| `title` | Input | Short task title |
| `description` | Input | Detailed task description (self-contained) |
| `role` | Input | Worker role: `scout`, `strategist`, `generator`, `executor`, `analyst` |
| `perspective` | Input | Scan perspectives (semicolon-separated, scout only) |
| `layer` | Input | Test layer: `L1`, `L2`, `L3`, or empty for non-layer tasks |
| `coverage_target` | Input | Target coverage percentage for this layer (empty if N/A) |
| `deps` | Input | Semicolon-separated dependency task IDs |
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
| `exec_mode` | Input | `csv-wave` or `interactive` |
| `wave` | Computed | Wave number (computed by topological sort, 1-based) |
| `status` | Output | `pending` -> `completed` / `failed` / `skipped` |
| `findings` | Output | Key discoveries or implementation notes (max 500 chars) |
| `issues_found` | Output | Count of issues discovered (scout/analyst) |
| `pass_rate` | Output | Test pass rate as a decimal (executor only) |
| `coverage_achieved` | Output | Actual coverage percentage achieved (executor only) |
| `test_files` | Output | Semicolon-separated paths of test files (generator only) |
| `quality_score` | Output | Quality score 0-100 (analyst only) |
| `error` | Output | Error message if failed (empty on success) |

### Per-Wave CSV (Temporary)

Each wave generates a temporary `wave-{N}.csv` with an extra `prev_context` column (csv-wave tasks only).

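
The phase scripts later in this document call `parseCsv` and `toCsv` without defining them. A minimal, quote-aware sketch -- it handles embedded commas and doubled-quote escapes as in the sample rows above, but not multi-line fields; a real CSV library would be safer:

```javascript
// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let cur = '', inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cur += '"'; i++; } // escaped quote
      else if (ch === '"') inQuotes = false;
      else cur += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ',') { fields.push(cur); cur = ''; }
    else cur += ch;
  }
  fields.push(cur);
  return fields;
}

// Parse CSV text into an array of row objects keyed by the header row.
function parseCsv(text) {
  const lines = text.trim().split('\n');
  const header = splitCsvLine(lines[0]);
  return lines.slice(1).map(line => {
    const values = splitCsvLine(line);
    return Object.fromEntries(header.map((h, i) => [h, values[i] ?? '']));
  });
}

// Serialize row objects back to CSV, quoting every field.
function toCsv(rows) {
  const header = Object.keys(rows[0]);
  const quote = v => `"${String(v ?? '').replace(/"/g, '""')}"`;
  return [header.join(','), ...rows.map(r => header.map(h => quote(r[h])).join(','))].join('\n');
}
```
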
---

## Agent Registry (Interactive Agents)

| Agent | Role File | Pattern | Responsibility | Position |
|-------|-----------|---------|----------------|----------|
| Test Executor | agents/executor.md | 2.3 (send_input cycle) | Execute tests with iterative fix cycle, report pass rate and coverage | per-wave |
| GC Loop Handler | agents/gc-loop-handler.md | 2.3 (send_input cycle) | Manage Generator-Critic loop: evaluate coverage, trigger fix rounds | post-wave |

> **COMPACT PROTECTION**: Agent files are execution documents. When context compression occurs, **you MUST immediately `Read` the corresponding agent.md** to reload it.

---

## Output Artifacts

| File | Purpose | Lifecycle |
|------|---------|-----------|
| `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
| `wave-{N}.csv` | Per-wave input (temporary, csv-wave tasks only) | Created before wave, deleted after |
| `results.csv` | Final export of all task results | Created in Phase 4 |
| `discoveries.ndjson` | Shared exploration board (all agents, both modes) | Append-only, carries across waves |
| `context.md` | Human-readable execution report | Created in Phase 4 |
| `scan/scan-results.json` | Scout output: multi-perspective scan results | Created in scout wave |
| `strategy/test-strategy.md` | Strategist output: test strategy document | Created in strategy wave |
| `tests/L1-unit/` | Generator output: L1 unit test files | Created in L1 wave |
| `tests/L2-integration/` | Generator output: L2 integration test files | Created in L2 wave |
| `tests/L3-e2e/` | Generator output: L3 E2E test files | Created in L3 wave |
| `results/run-{layer}.json` | Executor output: per-layer test results | Created per execution |
| `analysis/quality-report.md` | Analyst output: quality analysis report | Created in final wave |
| `interactive/{id}-result.json` | Results from interactive tasks | Created per interactive task |

---

## Session Structure

```
.workflow/.csv-wave/{session-id}/
+-- tasks.csv              # Master state (all tasks, both modes)
+-- results.csv            # Final results export
+-- discoveries.ndjson     # Shared discovery board (all agents)
+-- context.md             # Human-readable report
+-- wave-{N}.csv           # Temporary per-wave input (csv-wave only)
+-- scan/                  # Scout output
|   +-- scan-results.json
+-- strategy/              # Strategist output
|   +-- test-strategy.md
+-- tests/                 # Generator output
|   +-- L1-unit/
|   +-- L2-integration/
|   +-- L3-e2e/
+-- results/               # Executor output
|   +-- run-L1.json
|   +-- run-L2.json
+-- analysis/              # Analyst output
|   +-- quality-report.md
+-- wisdom/                # Cross-task knowledge
|   +-- learnings.md
|   +-- conventions.md
|   +-- decisions.md
|   +-- issues.md
+-- interactive/           # Interactive task artifacts
|   +-- {id}-result.json
+-- gc-state.json          # GC loop tracking state
```

---

## Implementation

### Session Initialization

```javascript
// Timestamps use UTC+8 wall-clock time: shift the epoch by 8 hours, then
// format with toISOString() (the trailing "Z" is nominal, not true UTC).
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString()

const AUTO_YES = $ARGUMENTS.includes('--yes') || $ARGUMENTS.includes('-y')
const continueMode = $ARGUMENTS.includes('--continue')
const concurrencyMatch = $ARGUMENTS.match(/(?:--concurrency|-c)\s+(\d+)/)
const maxConcurrency = concurrencyMatch ? parseInt(concurrencyMatch[1], 10) : 3

// Parse QA mode flag
const modeMatch = $ARGUMENTS.match(/--mode=(\w+)/)
const explicitMode = modeMatch ? modeMatch[1] : null

// Strip flags; whatever remains is the task description
const requirement = $ARGUMENTS
  .replace(/--yes|-y|--continue|--concurrency\s+\d+|-c\s+\d+|--mode=\w+/g, '')
  .trim()

// Build a slug (keeps ASCII alphanumerics and CJK characters)
const slug = requirement.toLowerCase()
  .replace(/[^a-z0-9\u4e00-\u9fa5]+/g, '-')
  .substring(0, 40)
const dateStr = getUtc8ISOString().substring(0, 10).replace(/-/g, '')
const sessionId = `qa-${slug}-${dateStr}`
const sessionFolder = `.workflow/.csv-wave/${sessionId}`

Bash(`mkdir -p ${sessionFolder}/scan ${sessionFolder}/strategy ${sessionFolder}/tests/L1-unit ${sessionFolder}/tests/L2-integration ${sessionFolder}/tests/L3-e2e ${sessionFolder}/results ${sessionFolder}/analysis ${sessionFolder}/wisdom ${sessionFolder}/interactive`)

// Initialize discoveries.ndjson
Write(`${sessionFolder}/discoveries.ndjson`, '')

// Initialize wisdom files
Write(`${sessionFolder}/wisdom/learnings.md`, '# Learnings\n')
Write(`${sessionFolder}/wisdom/conventions.md`, '# Conventions\n')
Write(`${sessionFolder}/wisdom/decisions.md`, '# Decisions\n')
Write(`${sessionFolder}/wisdom/issues.md`, '# Issues\n')

// Initialize GC state
Write(`${sessionFolder}/gc-state.json`, JSON.stringify({
  rounds: {}, coverage_history: [], max_rounds_per_layer: 3
}, null, 2))
```

---

### Phase 0: Pre-Wave Interactive (Requirement Clarification)

**Objective**: Parse the task description, detect the QA mode, and prepare for decomposition.

**Workflow**:

1. **Parse user task description** from $ARGUMENTS

2. **Check for existing sessions** (continue mode):
   - Scan `.workflow/.csv-wave/qa-*/tasks.csv` for sessions with pending tasks
   - If `--continue`: resume the specified or most recent session, then skip to Phase 2
   - If an active session is found: ask the user whether to resume or start new

3. **QA Mode Selection**:

| Condition | Mode | Description |
|-----------|------|-------------|
| Explicit `--mode=discovery` | discovery | Scout-first: issue discovery, then testing |
| Explicit `--mode=testing` | testing | Skip scout; direct test pipeline |
| Explicit `--mode=full` | full | Complete QA closed loop + regression scan |
| Keywords: discovery, scan, issue, audit | discovery | Auto-detected discovery mode |
| Keywords: test, coverage, TDD, verify | testing | Auto-detected testing mode |
| No explicit flag and no keyword match | full | Default to full QA |

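
The selection table can be sketched as a small helper. The function name and keyword regexes are illustrative, and the precedence -- explicit flag first, then discovery keywords, then testing keywords -- is an assumption for descriptions that match both keyword sets:

```javascript
// Resolve the QA mode from an explicit --mode value or task-description keywords.
function detectMode(explicitMode, requirement) {
  if (['discovery', 'testing', 'full'].includes(explicitMode)) return explicitMode;
  const text = requirement.toLowerCase();
  if (/\b(discovery|scan|issue|audit)\b/.test(text)) return 'discovery';
  if (/\b(test|coverage|tdd|verify)\b/.test(text)) return 'testing';
  return 'full'; // default when nothing matches
}
```
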
4. **Clarify if ambiguous** (skip if AUTO_YES):

```javascript
AskUserQuestion({
  questions: [{
    question: "Detected QA mode: '" + qaMode + "'. Confirm?",
    header: "QA Mode Selection",
    multiSelect: false,
    options: [
      { label: "Proceed with " + qaMode, description: "Detected mode is appropriate" },
      { label: "Use discovery", description: "Scout-first: scan for issues, then test" },
      { label: "Use testing", description: "Direct testing pipeline (skip scout)" },
      { label: "Use full", description: "Complete QA closed loop with regression" }
    ]
  }]
})
```

5. **Output**: Refined requirement, QA mode, scope

**Success Criteria**:
- QA mode selected
- Refined requirements available for Phase 1 decomposition

---

### Phase 1: Requirement -> CSV + Classification

**Objective**: Decompose the QA task into dependency-ordered CSV tasks based on the selected mode.

**Decomposition Rules**:

1. **Select pipeline based on QA mode**:

| Mode | Pipeline |
|------|----------|
| discovery | SCOUT-001 -> QASTRAT-001 -> QAGEN-001 -> QARUN-001 -> QAANA-001 |
| testing | QASTRAT-001 -> QAGEN-L1-001 -> QARUN-L1-001 -> QAGEN-L2-001 -> QARUN-L2-001 -> QAANA-001 |
| full | SCOUT-001 -> QASTRAT-001 -> [QAGEN-L1-001, QAGEN-L2-001] -> [QARUN-L1-001, QARUN-L2-001] -> QAANA-001 -> SCOUT-002 |

2. **Assign roles, layers, perspectives, and coverage targets** per task

3. **Assign exec_mode**:
   - Scout, Strategist, Generator, and Analyst tasks: `csv-wave` (single-pass)
   - Executor tasks: `interactive` (iterative fix cycle)

**Classification Rules**:

| Task Property | exec_mode |
|---------------|-----------|
| Multi-perspective scanning (single-pass) | `csv-wave` |
| Strategy analysis (single-pass read + write) | `csv-wave` |
| Test code generation (single-pass write) | `csv-wave` |
| Test execution with fix loop (multi-round) | `interactive` |
| Quality analysis (single-pass read + write) | `csv-wave` |
| Regression scanning (single-pass) | `csv-wave` |

**Wave Computation**: Kahn's BFS topological sort with depth tracking.

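
A sketch of that wave computation under the CSV schema above (semicolon-separated `deps`): each task's wave is 1 plus the maximum wave among its dependencies, and any task left unassigned after the BFS indicates a dependency cycle:

```javascript
// Kahn's BFS topological sort with depth tracking: assigns a 1-based
// `wave` to every task, throwing on circular dependencies.
function computeWaves(tasks) {
  const indegree = new Map(tasks.map(t => [t.id, 0]));
  const dependents = new Map(tasks.map(t => [t.id, []]));
  for (const t of tasks) {
    for (const dep of (t.deps || '').split(';').filter(Boolean)) {
      indegree.set(t.id, indegree.get(t.id) + 1);
      dependents.get(dep).push(t.id);
    }
  }
  // Roots (no dependencies) start at wave 1.
  const queue = tasks.filter(t => indegree.get(t.id) === 0).map(t => t.id);
  const wave = new Map(queue.map(id => [id, 1]));
  while (queue.length > 0) {
    const id = queue.shift();
    for (const next of dependents.get(id)) {
      wave.set(next, Math.max(wave.get(next) || 0, wave.get(id) + 1));
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }
  if (wave.size < tasks.length) throw new Error('Circular dependency detected');
  for (const t of tasks) t.wave = wave.get(t.id);
  return tasks;
}
```
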
**User Validation**: Display the task breakdown with wave, exec_mode, and role assignments (skip if AUTO_YES).

**Success Criteria**:
- tasks.csv created with valid schema, wave, and exec_mode assignments
- No circular dependencies
- User approved (or AUTO_YES)

---

### Phase 2: Wave Execution Engine (Extended)

**Objective**: Execute tasks wave-by-wave with hybrid mechanism support, GC loop handling, and cross-wave context propagation.

```javascript
const masterCsv = Read(`${sessionFolder}/tasks.csv`)
let tasks = parseCsv(masterCsv)
const maxWave = Math.max(...tasks.map(t => Number(t.wave)))

for (let wave = 1; wave <= maxWave; wave++) {
  console.log(`\nWave ${wave}/${maxWave}`)

  // 1. Separate tasks by exec_mode (CSV values are strings, so coerce wave)
  const waveTasks = tasks.filter(t => Number(t.wave) === wave && t.status === 'pending')
  const csvTasks = waveTasks.filter(t => t.exec_mode === 'csv-wave')
  const interactiveTasks = waveTasks.filter(t => t.exec_mode === 'interactive')

  // 2. Check dependencies -- skip tasks whose deps failed
  for (const task of waveTasks) {
    const depIds = (task.deps || '').split(';').filter(Boolean)
    const depStatuses = depIds.map(id => tasks.find(t => t.id === id)?.status)
    if (depStatuses.some(s => s === 'failed' || s === 'skipped')) {
      task.status = 'skipped'
      task.error = `Dependency failed: ${depIds.filter((id, i) =>
        ['failed', 'skipped'].includes(depStatuses[i])).join(', ')}`
    }
  }

  // 3. Execute csv-wave tasks
  const pendingCsvTasks = csvTasks.filter(t => t.status === 'pending')
  if (pendingCsvTasks.length > 0) {
    for (const task of pendingCsvTasks) {
      task.prev_context = buildPrevContext(task, tasks)
    }

    Write(`${sessionFolder}/wave-${wave}.csv`, toCsv(pendingCsvTasks))

    // Read instruction template
    Read(`instructions/agent-instruction.md`)

    // Build instruction with the session folder baked in
    const instruction = buildQAInstruction(sessionFolder, wave)

    spawn_agents_on_csv({
      csv_path: `${sessionFolder}/wave-${wave}.csv`,
      id_column: "id",
      instruction: instruction,
      max_concurrency: maxConcurrency,
      max_runtime_seconds: 900,
      output_csv_path: `${sessionFolder}/wave-${wave}-results.csv`,
      output_schema: {
        type: "object",
        properties: {
          id: { type: "string" },
          status: { type: "string", enum: ["completed", "failed"] },
          findings: { type: "string" },
          issues_found: { type: "string" },
          pass_rate: { type: "string" },
          coverage_achieved: { type: "string" },
          test_files: { type: "string" },
          quality_score: { type: "string" },
          error: { type: "string" }
        }
      }
    })

    // Merge results back into the master task list
    const results = parseCsv(Read(`${sessionFolder}/wave-${wave}-results.csv`))
    for (const r of results) {
      const t = tasks.find(t => t.id === r.id)
      if (t) Object.assign(t, r)
    }
  }

  // 4. Execute interactive tasks (executor with fix cycle)
  const pendingInteractive = interactiveTasks.filter(t => t.status === 'pending')
  for (const task of pendingInteractive) {
    Read(`agents/executor.md`)

    const prevContext = buildPrevContext(task, tasks)
    const agent = spawn_agent({
      message: `## TASK ASSIGNMENT\n\n### MANDATORY FIRST STEPS\n1. Read: agents/executor.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n3. Read: .workflow/project-tech.json (if exists)\n\n---\n\nGoal: ${task.description}\nLayer: ${task.layer}\nCoverage Target: ${task.coverage_target}%\nSession: ${sessionFolder}\n\n### Previous Context\n${prevContext}`
    })
    const result = wait({ ids: [agent], timeout_ms: 900000 })
    if (result.timed_out) {
      send_input({ id: agent, message: "Please finalize current test results and report." })
      wait({ ids: [agent], timeout_ms: 120000 })
    }
    Write(`${sessionFolder}/interactive/${task.id}-result.json`, JSON.stringify({
      task_id: task.id, status: "completed", findings: parseFindings(result),
      timestamp: getUtc8ISOString()
    }))
    close_agent({ id: agent })
    task.status = result.success ? 'completed' : 'failed'
    task.findings = parseFindings(result)
  }

  // 5. GC Loop Check (after executor completes): trigger a fix round when
  // coverage or pass rate falls short, up to 3 rounds per layer
  for (const task of pendingInteractive.filter(t => t.role === 'executor')) {
    const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
    const layer = task.layer
    const rounds = gcState.rounds[layer] || 0
    const coverageAchieved = parseFloat(task.coverage_achieved || '0')
    const coverageTarget = parseFloat(task.coverage_target || '80')
    const passRate = parseFloat(task.pass_rate || '0')

    if ((coverageAchieved < coverageTarget || passRate < 0.95) && rounds < 3) {
      gcState.rounds[layer] = rounds + 1
      Write(`${sessionFolder}/gc-state.json`, JSON.stringify(gcState, null, 2))

      Read(`agents/gc-loop-handler.md`)
      const gcAgent = spawn_agent({
        message: `## GC LOOP ROUND ${rounds + 1}\n\n### MANDATORY FIRST STEPS\n1. Read: agents/gc-loop-handler.md\n2. Read: ${sessionFolder}/discoveries.ndjson\n\nLayer: ${layer}\nRound: ${rounds + 1}/3\nCurrent Coverage: ${coverageAchieved}%\nTarget: ${coverageTarget}%\nPass Rate: ${passRate}\nSession: ${sessionFolder}\nPrevious Results: ${sessionFolder}/results/run-${layer}.json\nTest Directory: ${sessionFolder}/tests/${layer === 'L1' ? 'L1-unit' : layer === 'L2' ? 'L2-integration' : 'L3-e2e'}/`
      })
      const gcResult = wait({ ids: [gcAgent], timeout_ms: 900000 })
      close_agent({ id: gcAgent })
    }
  }

  // 6. Update master CSV
  Write(`${sessionFolder}/tasks.csv`, toCsv(tasks))

  // 7. Clean up temp files
  Bash(`rm -f ${sessionFolder}/wave-${wave}.csv ${sessionFolder}/wave-${wave}-results.csv`)

  // 8. Display wave summary
  const completed = waveTasks.filter(t => t.status === 'completed').length
  const failed = waveTasks.filter(t => t.status === 'failed').length
  const skipped = waveTasks.filter(t => t.status === 'skipped').length
  console.log(`Wave ${wave} Complete: ${completed} completed, ${failed} failed, ${skipped} skipped`)
}
```
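
The engine above calls `buildPrevContext` without defining it. One plausible sketch gathers findings from the tasks named in `context_from` (falling back to `deps`) and joins them into a digest for the `prev_context` column -- the exact output format here is an assumption:

```javascript
// Build a compact digest of upstream findings for injection into prev_context.
// Only completed source tasks contribute; missing or unfinished ones are skipped.
function buildPrevContext(task, tasks) {
  const sourceIds = (task.context_from || task.deps || '').split(';').filter(Boolean);
  const lines = sourceIds.map(id => {
    const src = tasks.find(t => t.id === id);
    if (!src || src.status !== 'completed') return null;
    return `[${src.id}] ${src.findings || '(no findings recorded)'}`;
  }).filter(Boolean);
  return lines.join('\n');
}
```
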
**Success Criteria**:
- All waves executed in order
- Both csv-wave and interactive tasks handled per wave
- Each wave's results merged into the master CSV before the next wave starts
- GC loops triggered when coverage falls below target (max 3 rounds per layer)
- Dependent tasks skipped when a predecessor failed
- discoveries.ndjson accumulated across all waves and mechanisms

---

### Phase 3: Post-Wave Interactive (Completion Action)

**Objective**: Produce the pipeline completion report with a quality score and offer an interactive completion choice.

```javascript
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const completed = tasks.filter(t => t.status === 'completed')
const failed = tasks.filter(t => t.status === 'failed')

// Quality score from the analyst
const analystTask = tasks.find(t => t.role === 'analyst' && t.status === 'completed')
const qualityScore = analystTask?.quality_score || 'N/A'

// Scout issue count
const scoutTasks = tasks.filter(t => t.role === 'scout' && t.status === 'completed')
const totalIssues = scoutTasks.reduce((sum, t) => sum + parseInt(t.issues_found || '0', 10), 0)

// Coverage summary per layer
const layerSummary = ['L1', 'L2', 'L3'].map(layer => {
  const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer && t.status === 'completed')
  return execTask ? `  ${layer}: ${execTask.coverage_achieved}% coverage, ${execTask.pass_rate} pass rate` : null
}).filter(Boolean).join('\n')

console.log(`
============================================
  QA PIPELINE COMPLETE

  Quality Score: ${qualityScore}/100
  Issues Discovered: ${totalIssues}

  Deliverables:
${completed.map(t => `  - ${t.id}: ${t.title} (${t.role})`).join('\n')}

  Coverage:
${layerSummary}

  Pipeline: ${completed.length}/${tasks.length} tasks
  Session: ${sessionFolder}
============================================
`)

if (!AUTO_YES) {
  AskUserQuestion({
    questions: [{
      question: "Quality Assurance pipeline complete. What would you like to do?",
      header: "Completion",
      multiSelect: false,
      options: [
        { label: "Archive & Clean (Recommended)", description: "Archive session, output final summary" },
        { label: "Keep Active", description: "Keep session for follow-up work" },
        { label: "Export Results", description: "Export deliverables to target directory" }
      ]
    }]
  })
}
```

**Success Criteria**:
- Post-wave interactive processing complete
- Quality score and coverage metrics displayed
- User informed of results

---

### Phase 4: Results Aggregation

**Objective**: Generate the final results file and a human-readable report.

```javascript
// 1. Export results.csv
Bash(`cp ${sessionFolder}/tasks.csv ${sessionFolder}/results.csv`)

// 2. Generate context.md
const tasks = parseCsv(Read(`${sessionFolder}/tasks.csv`))
const gcState = JSON.parse(Read(`${sessionFolder}/gc-state.json`))
const analystTask = tasks.find(t => t.role === 'analyst' && t.status === 'completed')

let contextMd = `# Team Quality Assurance Report\n\n`
contextMd += `**Session**: ${sessionId}\n`
contextMd += `**Date**: ${getUtc8ISOString().substring(0, 10)}\n`
contextMd += `**QA Mode**: ${explicitMode || 'full'}\n`
contextMd += `**Quality Score**: ${analystTask?.quality_score || 'N/A'}/100\n\n`

contextMd += `## Summary\n`
contextMd += `| Status | Count |\n|--------|-------|\n`
contextMd += `| Completed | ${tasks.filter(t => t.status === 'completed').length} |\n`
contextMd += `| Failed | ${tasks.filter(t => t.status === 'failed').length} |\n`
contextMd += `| Skipped | ${tasks.filter(t => t.status === 'skipped').length} |\n\n`

// Scout findings
const scoutTasks = tasks.filter(t => t.role === 'scout' && t.status === 'completed')
if (scoutTasks.length > 0) {
  contextMd += `## Scout Findings\n\n`
  for (const t of scoutTasks) {
    contextMd += `**${t.title}**: ${t.issues_found || 0} issues found\n${t.findings || ''}\n\n`
  }
}

// Coverage results
contextMd += `## Coverage Results\n\n`
contextMd += `| Layer | Coverage | Target | Pass Rate | GC Rounds |\n`
contextMd += `|-------|----------|--------|-----------|-----------|\n`
for (const layer of ['L1', 'L2', 'L3']) {
  const execTask = tasks.find(t => t.role === 'executor' && t.layer === layer)
  if (execTask) {
    contextMd += `| ${layer} | ${execTask.coverage_achieved || 'N/A'}% | ${execTask.coverage_target}% | ${execTask.pass_rate || 'N/A'} | ${gcState.rounds[layer] || 0} |\n`
  }
}
contextMd += '\n'

// Wave execution details (coerce wave: CSV values are strings)
const maxWave = Math.max(...tasks.map(t => Number(t.wave)))
contextMd += `## Wave Execution\n\n`
for (let w = 1; w <= maxWave; w++) {
  const waveTasks = tasks.filter(t => Number(t.wave) === w)
  contextMd += `### Wave ${w}\n\n`
  for (const t of waveTasks) {
    const icon = t.status === 'completed' ? '[DONE]' : t.status === 'failed' ? '[FAIL]' : '[SKIP]'
    contextMd += `${icon} **${t.title}** [${t.role}/${t.layer || '-'}] ${t.findings || ''}\n\n`
  }
}

Write(`${sessionFolder}/context.md`, contextMd)

console.log(`Results exported to: ${sessionFolder}/results.csv`)
console.log(`Report generated at: ${sessionFolder}/context.md`)
```

**Success Criteria**:
- results.csv exported (all tasks, both modes)
- context.md generated with quality score, scout findings, and coverage breakdown
- Summary displayed to user

---

## Shared Discovery Board Protocol

All agents (csv-wave and interactive) share a single `discoveries.ndjson` file for cross-task knowledge exchange.

**Format**: One JSON object per line (NDJSON):

```jsonl
{"ts":"2026-03-08T10:00:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/auth.ts","line":42,"severity":"high","perspective":"security","description":"Hardcoded secret key in auth module"}}
{"ts":"2026-03-08T10:05:00Z","worker":"QASTRAT-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
{"ts":"2026-03-08T10:10:00Z","worker":"QAGEN-L1-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
{"ts":"2026-03-08T10:15:00Z","worker":"QARUN-L1-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
```

**Discovery Types**:

| Type | Data Schema | Description |
|------|-------------|-------------|
| `issue_found` | `{file, line, severity, perspective, description}` | Issue discovered by scout |
| `framework_detected` | `{framework, config_file, test_pattern}` | Test framework identified |
| `test_generated` | `{file, source_file, test_count}` | Test file created |
| `defect_found` | `{file, line, pattern, description}` | Defect pattern discovered during testing |
| `coverage_gap` | `{file, current, target, gap}` | Coverage gap identified |
| `convention_found` | `{pattern, example_file, description}` | Test convention detected |
| `fix_applied` | `{test_file, fix_type, description}` | Test fix during GC loop |
| `quality_metric` | `{dimension, score, details}` | Quality dimension score |

**Protocol**:

1. Agents MUST read discoveries.ndjson at start of execution
2. Agents MUST append relevant discoveries during execution
3. Agents MUST NOT modify or delete existing entries
4. Deduplication by `{type, data.file, data.line}` key (where applicable)
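The read-and-dedupe side of this protocol can be sketched as follows. This is a minimal Node.js sketch; `loadDiscoveries` is an illustrative name, not part of the skill files:

```javascript
// Minimal discovery-board reader: parse NDJSON, skip malformed lines
// (per Error Handling: "ignore malformed lines, continue with valid entries"),
// and deduplicate by the {type, data.file, data.line} key.
function loadDiscoveries(ndjsonText) {
  const seen = new Set()
  const entries = []
  for (const line of ndjsonText.split('\n')) {
    if (!line.trim()) continue
    let entry
    try { entry = JSON.parse(line) } catch { continue } // corrupt line: skip it
    const key = `${entry.type}|${entry.data?.file ?? ''}|${entry.data?.line ?? ''}`
    if (seen.has(key)) continue
    seen.add(key)
    entries.push(entry)
  }
  return entries
}
```

Appending stays a plain `>>` echo as shown above, so readers and writers never contend on anything but whole lines.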

---

## Pipeline Definitions

### Discovery Mode (5 tasks, serial)

```
SCOUT-001 -> QASTRAT-001 -> QAGEN-001 -> QARUN-001 -> QAANA-001
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| SCOUT-001 | scout | - | 1 | csv-wave |
| QASTRAT-001 | strategist | - | 2 | csv-wave |
| QAGEN-001 | generator | L1 | 3 | csv-wave |
| QARUN-001 | executor | L1 | 4 | interactive |
| QAANA-001 | analyst | - | 5 | csv-wave |

### Testing Mode (6 tasks, progressive layers)

```
QASTRAT-001 -> QAGEN-L1-001 -> QARUN-L1-001 -> QAGEN-L2-001 -> QARUN-L2-001 -> QAANA-001
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| QASTRAT-001 | strategist | - | 1 | csv-wave |
| QAGEN-L1-001 | generator | L1 | 2 | csv-wave |
| QARUN-L1-001 | executor | L1 | 3 | interactive |
| QAGEN-L2-001 | generator | L2 | 4 | csv-wave |
| QARUN-L2-001 | executor | L2 | 5 | interactive |
| QAANA-001 | analyst | - | 6 | csv-wave |

### Full Mode (8 tasks, parallel windows + regression)

```
SCOUT-001 -> QASTRAT-001 -> [QAGEN-L1-001 // QAGEN-L2-001] -> [QARUN-L1-001 // QARUN-L2-001] -> QAANA-001 -> SCOUT-002
```

| Task ID | Role | Layer | Wave | exec_mode |
|---------|------|-------|------|-----------|
| SCOUT-001 | scout | - | 1 | csv-wave |
| QASTRAT-001 | strategist | - | 2 | csv-wave |
| QAGEN-L1-001 | generator | L1 | 3 | csv-wave |
| QAGEN-L2-001 | generator | L2 | 3 | csv-wave |
| QARUN-L1-001 | executor | L1 | 4 | interactive |
| QARUN-L2-001 | executor | L2 | 4 | interactive |
| QAANA-001 | analyst | - | 5 | csv-wave |
| SCOUT-002 | scout | - | 6 | csv-wave |
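The `wave` column in these tables follows mechanically from each task's `deps`: a task's wave is one more than the largest wave among its dependencies. A sketch, assuming tasks as `{id, deps}` objects (`computeWaves` is an illustrative name):

```javascript
// Assign each task the smallest wave consistent with its dependencies:
// wave = 1 + max(wave of each dep); tasks with no deps get wave 1.
// Throws on circular dependencies, matching the Error Handling table.
function computeWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]))
  const wave = new Map()
  const visiting = new Set()
  function visit(id) {
    if (wave.has(id)) return wave.get(id)
    if (visiting.has(id)) throw new Error(`Circular dependency at ${id}`)
    visiting.add(id)
    const deps = (byId.get(id).deps || []).filter(Boolean)
    const w = deps.length ? 1 + Math.max(...deps.map(visit)) : 1
    visiting.delete(id)
    wave.set(id, w)
    return w
  }
  tasks.forEach(t => visit(t.id))
  return wave
}
```

Running this over the full-mode `deps` graph reproduces the wave numbers shown above, including the parallel windows (equal wave numbers).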

---

## GC Loop (Generator-Critic)

Generator and executor iterate per test layer until coverage converges:

```
QAGEN -> QARUN -> (if coverage < target) -> GC Loop Handler
                  (if coverage >= target) -> next wave
```

- Max iterations: 3 per layer
- After 3 iterations: accept current coverage with warning
- The GC loop runs as an interactive agent (gc-loop-handler.md) which internally generates fixes and re-runs tests
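The convergence rule above reduces to a small decision function. The sketch below is illustrative, not part of the skill files:

```javascript
// Decide the coordinator's next action for a layer after an executor run:
// converge when coverage meets the target, accept-with-warning after the
// round cap, otherwise schedule another GC fix round.
function nextGcAction(coverage, target, round, maxRounds = 3) {
  if (coverage >= target) return { action: 'proceed', warn: false }
  if (round >= maxRounds) return { action: 'proceed', warn: true }
  return { action: 'gc-loop', warn: false }
}
```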

---

## Scan Perspectives (Scout)

| Perspective | Focus |
|-------------|-------|
| bug | Logic errors, crash paths, null references |
| security | Vulnerabilities, auth bypass, data exposure |
| test-coverage | Untested code paths, missing assertions |
| code-quality | Anti-patterns, complexity, maintainability |
| ux | User-facing issues, accessibility (optional, when task mentions UX/UI) |

---

## Error Handling

| Error | Resolution |
|-------|------------|
| Circular dependency | Detect in wave computation, abort with error message |
| CSV agent timeout | Mark as failed in results, continue with wave |
| CSV agent failed | Mark as failed, skip dependent tasks in later waves |
| Interactive agent timeout | Urge convergence via send_input, then close if still timed out |
| Interactive agent failed | Mark as failed, skip dependents |
| All agents in wave failed | Log error, offer retry or abort |
| CSV parse error | Validate CSV format before execution, show line number |
| discoveries.ndjson corrupt | Ignore malformed lines, continue with valid entries |
| Scout finds no issues | Report clean scan, proceed to testing (skip discovery-specific tasks) |
| GC loop exceeded (3 rounds) | Accept current coverage with warning, proceed to next layer |
| Test framework not detected | Default to Jest patterns |
| Coverage tool unavailable | Degrade to pass rate judgment |
| quality_score < 60 | Report with WARNING, suggest re-run with deeper coverage |
| Continue mode: no session found | List available sessions, prompt user to select |

---

## Core Rules

1. **Start Immediately**: First action is session initialization, then Phase 0/1
2. **Wave Order is Sacred**: Never execute wave N before wave N-1 completes and results are merged
3. **CSV is Source of Truth**: Master tasks.csv holds all state (both csv-wave and interactive)
4. **CSV First**: Default to csv-wave for tasks; only use interactive when multi-round interaction is required
5. **Context Propagation**: prev_context built from master CSV, not from memory
6. **Discovery Board is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
7. **Skip on Failure**: If a dependency failed, skip the dependent task
8. **GC Loop Discipline**: Max 3 rounds per layer; never infinite-loop on coverage
9. **Scout Feeds Strategy**: Scout findings flow into strategist via prev_context and discoveries.ndjson
10. **Cleanup Temp Files**: Remove wave-{N}.csv after results are merged
11. **DO NOT STOP**: Continuous execution until all waves complete or all remaining tasks are skipped

192
.codex/skills/team-quality-assurance/agents/executor.md
Normal file
@@ -0,0 +1,192 @@
# Test Executor Agent

Interactive agent that executes test suites, collects coverage, and performs iterative auto-fix cycles. Acts as the Critic in the Generator-Critic loop within the QA pipeline.

## Identity

- **Type**: `interactive`
- **Responsibility**: Validation (test execution with fix cycles)

## Boundaries

### MUST

- Load role definition via MANDATORY FIRST STEPS pattern
- Run test suites using the correct framework command
- Collect coverage data from test output or coverage reports
- Attempt auto-fix for failing tests (max 5 iterations per invocation)
- Only modify test files, NEVER modify source code
- Save results to session results directory
- Share defect discoveries to discoveries.ndjson
- Report pass rate and coverage in structured output

### MUST NOT

- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files may be changed)
- Use `@ts-ignore`, `as any`, or skip/ignore test annotations
- Exceed 5 fix iterations without reporting current state
- Delete or disable existing passing tests

---

## Toolbox

### Available Tools

| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test files, source files, strategy, results |
| `Write` | file-write | Save test results, update test files |
| `Edit` | file-edit | Fix test assertions, imports, mocks |
| `Bash` | shell | Run test commands, collect coverage |
| `Glob` | search | Find test files in session directory |
| `Grep` | search | Find patterns in test output |

---

## Execution

### Phase 1: Context Loading

**Objective**: Detect test framework and locate test files.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Coverage target | Yes | Minimum coverage percentage |
| Previous context | No | Findings from generator and scout |

**Steps**:

1. Read discoveries.ndjson for framework detection info
2. Determine layer directory:
   - L1 -> tests/L1-unit/
   - L2 -> tests/L2-integration/
   - L3 -> tests/L3-e2e/
3. Find test files in the layer directory
4. Determine test framework command:

| Framework | Command Template |
|-----------|-----------------|
| vitest | `npx vitest run --coverage --reporter=json <test-dir>` |
| jest | `npx jest --coverage --json --outputFile=<results-path> <test-dir>` |
| pytest | `python -m pytest --cov --cov-report=json -v <test-dir>` |
| mocha | `npx mocha --reporter json > test-results.json` |
| default | `npm test -- --coverage` |

**Output**: Framework, test command, test file list
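Step 4 can be sketched as a simple lookup over the table above. This sketch simplifies the jest and mocha rows (it elides the `--outputFile=<results-path>` and redirect details) and is illustrative only:

```javascript
// Map a detected framework to its command template; unknown frameworks
// fall back to the default `npm test -- --coverage` row.
function testCommand(framework, testDir) {
  const templates = {
    vitest: `npx vitest run --coverage --reporter=json ${testDir}`,
    jest: `npx jest --coverage --json ${testDir}`,
    pytest: `python -m pytest --cov --cov-report=json -v ${testDir}`,
    mocha: `npx mocha --reporter json ${testDir}`,
  }
  return templates[framework] ?? 'npm test -- --coverage'
}
```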

---

### Phase 2: Iterative Test-Fix Cycle

**Objective**: Run tests and fix failures up to 5 iterations.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Test command | Yes | From Phase 1 |
| Test files | Yes | From Phase 1 |
| Coverage target | Yes | From spawn message |

**Steps**:

For each iteration (1..5):

1. Run test command, capture stdout/stderr
2. Parse results: extract passed/failed counts, parse coverage
3. Evaluate exit condition:

| Condition | Action |
|-----------|--------|
| All tests pass (0 failures) | Exit loop: SUCCESS |
| pass_rate >= 0.95 AND iteration >= 2 | Exit loop: GOOD ENOUGH |
| iteration >= 5 | Exit loop: MAX ITERATIONS |

4. If not exiting, extract failure details:
   - Error messages and stack traces
   - Failing test file:line references
   - Assertion mismatches

5. Apply targeted fixes:
   - Fix incorrect assertions (expected vs actual)
   - Fix missing imports or broken module paths
   - Fix mock setup issues
   - Fix async/await handling
   - Do NOT skip tests, do NOT add type suppressions

6. Share defect discoveries:
   ```bash
   echo '{"ts":"<ISO>","worker":"<task-id>","type":"defect_found","data":{"file":"<src>","line":<N>,"pattern":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
   ```

**Output**: Final pass rate, coverage achieved, iteration count
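The exit-condition table in step 3 can be sketched as a pure function (illustrative; names are not part of the agent contract):

```javascript
// Evaluate the Phase 2 exit conditions for one iteration, in table order.
function exitCondition(passed, failed, iteration) {
  const total = passed + failed
  const passRate = total ? passed / total : 0
  if (failed === 0) return 'SUCCESS'
  if (passRate >= 0.95 && iteration >= 2) return 'GOOD_ENOUGH'
  if (iteration >= 5) return 'MAX_ITERATIONS'
  return 'CONTINUE'
}
```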

---

### Phase 3: Result Recording

**Objective**: Save execution results and update state.

**Steps**:

1. Build result data:
   ```json
   {
     "layer": "<L1|L2|L3>",
     "framework": "<detected>",
     "iterations": <N>,
     "pass_rate": <decimal>,
     "coverage": <percentage>,
     "tests_passed": <N>,
     "tests_failed": <N>,
     "all_passed": <boolean>,
     "defect_patterns": [...]
   }
   ```

2. Save results to `<session>/results/run-<layer>.json`
3. Save last test output to `<session>/results/output-<layer>.txt`

---

## Structured Output Template

```
## Summary
- Test execution for <layer>: <pass_rate> pass rate, <coverage>% coverage after <N> iterations

## Findings
- Finding 1: specific test result with file:line reference
- Finding 2: defect pattern discovered

## Defect Patterns
- Pattern: type, frequency, severity
- Pattern: type, frequency, severity

## Coverage
- Overall: <N>%
- Target: <N>%
- Gap files: file1 (<N>%), file2 (<N>%)

## Open Questions
1. Any unresolvable test failures (if any)
```

---

## Error Handling

| Scenario | Resolution |
|----------|------------|
| Test command not found | Try alternative commands (npx, npm test), report if all fail |
| No test files found | Report in findings, status = failed |
| Coverage tool unavailable | Degrade to pass rate only, report in findings |
| All tests timeout | Report with partial results, status = failed |
| Import resolution fails after fix | Report remaining failures, continue with other tests |
| Timeout approaching | Output current findings with "PARTIAL" status |

163
.codex/skills/team-quality-assurance/agents/gc-loop-handler.md
Normal file
@@ -0,0 +1,163 @@
# GC Loop Handler Agent

Interactive agent that manages Generator-Critic loop iterations within the QA pipeline. When coverage is below target after the executor completes, this agent generates test fixes and re-runs tests.

## Identity

- **Type**: `interactive`
- **Responsibility**: Orchestration (fix-verify cycle within GC loop)

## Boundaries

### MUST

- Read previous execution results to understand failures
- Generate targeted test fixes based on failure details
- Re-run tests after fixes to verify improvement
- Track coverage improvement across iterations
- Only modify test files, NEVER modify source code
- Report final coverage and pass rate
- Share fix discoveries to discoveries.ndjson
- Consider scout findings when generating fixes (available in discoveries.ndjson)

### MUST NOT

- Skip the MANDATORY FIRST STEPS role loading
- Modify source code (only test files)
- Use `@ts-ignore`, `as any`, or test skip annotations
- Run more than 1 fix-verify cycle per invocation (the coordinator manages the round count)
- Delete or disable passing tests

---

## Toolbox

### Available Tools

| Tool | Type | Purpose |
|------|------|---------|
| `Read` | file-read | Load test results, test files, source files, scan results |
| `Write` | file-write | Write fixed test files |
| `Edit` | file-edit | Apply targeted test fixes |
| `Bash` | shell | Run test commands |
| `Glob` | search | Find test files |
| `Grep` | search | Search test output for patterns |

---

## Execution

### Phase 1: Failure Analysis

**Objective**: Understand why tests failed or coverage was insufficient.

**Input**:

| Source | Required | Description |
|--------|----------|-------------|
| Session folder | Yes | Path to session directory |
| Layer | Yes | Target test layer (L1/L2/L3) |
| Round number | Yes | Current GC round (1-3) |
| Previous results | Yes | Path to run-{layer}.json |

**Steps**:

1. Read previous execution results from results/run-{layer}.json
2. Read test output from results/output-{layer}.txt
3. Read discoveries.ndjson for scout-found issues (may inform additional test cases)
4. Categorize failures:

| Failure Type | Detection | Fix Strategy |
|--------------|-----------|--------------|
| Assertion mismatch | "expected X, received Y" | Correct expected values |
| Missing import | "Cannot find module" | Fix import paths |
| Null reference | "Cannot read property of null" | Add null guards in tests |
| Async issue | "timeout", "not resolved" | Fix async/await patterns |
| Mock issue | "mock not called" | Fix mock setup/teardown |
| Type error | "Type X is not assignable" | Fix type annotations |

5. Identify uncovered files from coverage report
6. Cross-reference with scout findings for targeted coverage improvement

**Output**: Failure categories, fix targets, uncovered areas
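The step 4 categorization can be sketched as pattern matching over runner output. The regexes below are illustrative approximations of common test-runner messages, not a fixed specification:

```javascript
// Classify a failure message into the categories from the table above;
// anything unrecognized falls through to 'unknown'.
function categorizeFailure(message) {
  if (/expected .*received/i.test(message)) return 'assertion_mismatch'
  if (/cannot find module/i.test(message)) return 'missing_import'
  if (/cannot read propert(y|ies)( .*)? of (null|undefined)/i.test(message)) return 'null_reference'
  if (/timeout|not resolved/i.test(message)) return 'async_issue'
  if (/mock .*not called/i.test(message)) return 'mock_issue'
  if (/is not assignable/i.test(message)) return 'type_error'
  return 'unknown'
}
```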

---

### Phase 2: Fix Generation + Re-execution

**Objective**: Apply fixes and verify improvement.

**Steps**:

1. For each failing test file:
   - Read the test file content
   - Apply targeted fixes based on failure category
   - Verify the fix does not break other tests conceptually

2. For coverage gaps:
   - Read uncovered source files
   - Cross-reference with scout-discovered issues for high-value test targets
   - Generate additional test cases targeting uncovered paths
   - Append to existing test files or create new ones

3. Re-run test suite with coverage:
   ```bash
   <test-command> 2>&1 || true
   ```

4. Parse new results: pass rate, coverage
5. Calculate improvement delta

6. Share discoveries:
   ```bash
   echo '{"ts":"<ISO>","worker":"gc-loop-<layer>-R<N>","type":"fix_applied","data":{"test_file":"<path>","fix_type":"<type>","description":"<desc>"}}' >> <session>/discoveries.ndjson
   ```

**Output**: Updated pass rate, coverage, improvement delta

---

### Phase 3: Result Update

**Objective**: Save updated results for coordinator evaluation.

**Steps**:

1. Overwrite results/run-{layer}.json with new data
2. Save test output to results/output-{layer}.txt
3. Report improvement delta in findings

---

## Structured Output Template

```
## Summary
- GC Loop Round <N> for <layer>: coverage <before>% -> <after>% (delta: +<N>%)

## Fixes Applied
- Fix 1: <test-file> - <fix-type> - <description>
- Fix 2: <test-file> - <fix-type> - <description>

## Coverage Update
- Before: <N>%, After: <N>%, Target: <N>%
- Pass Rate: <before> -> <after>

## Scout-Informed Additions
- Added test for scout issue #<N>: <description> (if applicable)

## Remaining Issues
- Issue 1: <description> (if any)
```

---

## Error Handling

| Scenario | Resolution |
|----------|------------|
| No previous results found | Report error, cannot proceed without baseline |
| All fixes cause new failures | Revert fixes, report inability to improve |
| Coverage tool unavailable | Use pass rate as proxy metric |
| Scout findings not available | Proceed without scout context |
| Timeout approaching | Output partial results with current state |

@@ -0,0 +1,185 @@
# Agent Instruction Template -- Team Quality Assurance

Base instruction template for CSV wave agents in the QA pipeline. Used by scout, strategist, generator, and analyst roles (csv-wave tasks).

## Purpose

| Phase | Usage |
|-------|-------|
| Phase 1 | Coordinator builds instruction from this template with session folder baked in |
| Phase 2 | Injected as `instruction` parameter to `spawn_agents_on_csv` |

---

## Base Instruction Template

```markdown
## TASK ASSIGNMENT -- Team Quality Assurance

### MANDATORY FIRST STEPS
1. Read shared discoveries: <session-folder>/discoveries.ndjson (if exists, skip if not)
2. Read project context: .workflow/project-tech.json (if exists)
3. Read scan results: <session-folder>/scan/scan-results.json (if exists, for non-scout roles)
4. Read test strategy: <session-folder>/strategy/test-strategy.md (if exists, for generator/analyst)

---

## Your Task

**Task ID**: {id}
**Title**: {title}
**Role**: {role}
**Perspectives**: {perspective}
**Layer**: {layer}
**Coverage Target**: {coverage_target}%

### Task Description
{description}

### Previous Tasks' Findings (Context)
{prev_context}

---

## Execution Protocol

### If Role = scout

1. **Determine scan scope**: Use git diff and the task description to identify target files
   ```bash
   git diff --name-only HEAD~5 2>/dev/null || echo ""
   ```
2. **Load historical patterns**: Read discoveries.ndjson for known defect patterns
3. **Execute multi-perspective scan**: For each perspective in {perspective} (semicolon-separated):
   - **bug**: Scan for logic errors, crash paths, null references, unhandled exceptions
   - **security**: Scan for vulnerabilities, hardcoded secrets, auth bypass, data exposure
   - **test-coverage**: Identify untested code paths, missing assertions, uncovered branches
   - **code-quality**: Detect anti-patterns, high complexity, duplicated logic, maintainability issues
   - **ux** (if present): Check for user-facing issues, accessibility problems
4. **Aggregate and rank**: Deduplicate by file:line, rank by severity (critical > high > medium > low)
5. **Write scan results**: Save to <session-folder>/scan/scan-results.json:
   ```json
   {
     "scan_date": "<ISO8601>",
     "perspectives": ["bug", "security", ...],
     "total_findings": <N>,
     "by_severity": { "critical": <N>, "high": <N>, "medium": <N>, "low": <N> },
     "findings": [{ "id": "<N>", "severity": "<level>", "perspective": "<name>", "file": "<path>", "line": <N>, "description": "<text>" }]
   }
   ```
6. **Share discoveries**: For each critical/high finding:
   ```bash
   echo '{"ts":"<ISO8601>","worker":"{id}","type":"issue_found","data":{"file":"<path>","line":<N>,"severity":"<level>","perspective":"<name>","description":"<text>"}}' >> <session-folder>/discoveries.ndjson
   ```
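The aggregate-and-rank step (step 4) can be sketched as follows. This is an illustrative Node.js sketch, not a required implementation:

```javascript
// Deduplicate findings by file:line (first occurrence wins), then sort
// by severity: critical > high > medium > low.
const SEVERITY_ORDER = { critical: 0, high: 1, medium: 2, low: 3 }
function rankFindings(findings) {
  const seen = new Set()
  return findings
    .filter(f => {
      const key = `${f.file}:${f.line}`
      if (seen.has(key)) return false
      seen.add(key)
      return true
    })
    .sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity])
}
```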

### If Role = strategist

1. **Read scout results**: Load <session-folder>/scan/scan-results.json (if discovery or full mode)
2. **Analyze change scope**: Run `git diff --name-only HEAD~5` to identify changed files
3. **Detect test framework**: Check for vitest.config.ts, jest.config.js, pytest.ini, pyproject.toml
4. **Categorize files**: Source, Test, Config patterns
5. **Select test layers**:

| Condition | Layer | Target |
|-----------|-------|--------|
| Has source file changes | L1: Unit Tests | 80% |
| >= 3 source files OR critical issues | L2: Integration Tests | 60% |
| >= 3 critical/high severity issues | L3: E2E Tests | 40% |

6. **Generate strategy**: Write to <session-folder>/strategy/test-strategy.md with scope analysis, layer configs, priority issues, risk assessment
7. **Share discoveries**: Append framework detection to the board:
   ```bash
   echo '{"ts":"<ISO8601>","worker":"{id}","type":"framework_detected","data":{"framework":"<name>","config_file":"<path>","test_pattern":"<pattern>"}}' >> <session-folder>/discoveries.ndjson
   ```
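The layer-selection table in step 5 can be sketched as a function. One reading of the L2 condition (any critical issue triggers it) is assumed here; the sketch is illustrative:

```javascript
// Select test layers from the table above: source changes gate L1, file
// count or critical issues gate L2, and issue count gates L3.
function selectLayers(sourceFiles, criticalOrHighIssues) {
  const layers = []
  if (sourceFiles.length > 0) layers.push({ layer: 'L1', target: 80 })
  if (sourceFiles.length >= 3 || criticalOrHighIssues >= 1) layers.push({ layer: 'L2', target: 60 })
  if (criticalOrHighIssues >= 3) layers.push({ layer: 'L3', target: 40 })
  return layers
}
```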

### If Role = generator

1. **Read strategy**: Load <session-folder>/strategy/test-strategy.md for layer config and priority files
2. **Read source files**: Load files listed in the strategy for the target layer (limit 20 files)
3. **Learn test patterns**: Find 3 existing test files to understand conventions (imports, structure, naming)
4. **Detect if GC fix mode**: If the task description contains "fix" -> read failure info from results/run-{layer}.json, fix failing tests only
5. **Generate tests**: For each priority source file:
   - Determine the test file path following project conventions
   - Generate test cases: happy path, edge cases, error handling
   - Use the proper test framework API
   - Include proper imports and mocks
6. **Write test files**: Save to <session-folder>/tests/<layer-dir>/
   - L1 -> tests/L1-unit/
   - L2 -> tests/L2-integration/
   - L3 -> tests/L3-e2e/
7. **Syntax check**: Run `tsc --noEmit` or equivalent to verify syntax
8. **Share discoveries**: Append test generation info to the discoveries board

### If Role = analyst

1. **Read all results**: Load <session-folder>/results/run-*.json for execution data
2. **Read scan results**: Load <session-folder>/scan/scan-results.json (if exists)
3. **Read strategy**: Load <session-folder>/strategy/test-strategy.md
4. **Read discoveries**: Parse <session-folder>/discoveries.ndjson for all findings
5. **Analyze five dimensions**:
   - **Defect patterns**: Group issues by type, identify patterns with >= 2 occurrences
   - **Coverage gaps**: Compare achieved vs target per layer, identify per-file gaps
   - **Test effectiveness**: Per layer -- pass rate, iterations, coverage achieved
   - **Quality trend**: Compare against coverage_history if available
   - **Quality score** (0-100): Start from 100, deduct for issues, gaps, failures; bonus for effective layers
6. **Score-based recommendations**:

| Score | Recommendation |
|-------|----------------|
| >= 80 | Quality is GOOD. Maintain current practices. |
| 60-79 | Quality needs IMPROVEMENT. Focus on gaps and patterns. |
| < 60 | Quality is CONCERNING. Recommend comprehensive review. |

7. **Generate report**: Write to <session-folder>/analysis/quality-report.md
8. **Share discoveries**: Append quality metrics to the board
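One plausible realization of the quality-score rule in step 5 is sketched below. The deduction weights are assumptions for illustration, not part of the skill definition:

```javascript
// Start from 100; deduct for critical/high issues, coverage gaps (per 10
// points of gap), and failed tests; add a small bonus per layer that met
// its target; clamp to 0-100.
function qualityScore({ criticalIssues = 0, highIssues = 0, coverageGaps = [], failedTests = 0, effectiveLayers = 0 }) {
  let score = 100
  score -= criticalIssues * 10 + highIssues * 5
  score -= coverageGaps.reduce((s, gap) => s + Math.ceil(gap / 10), 0)
  score -= failedTests * 2
  score += effectiveLayers * 2 // bonus for effective layers
  return Math.max(0, Math.min(100, score))
}
```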

---

## Output (report_agent_job_result)

Return JSON:
{
  "id": "{id}",
  "status": "completed" | "failed",
  "findings": "Key discoveries and implementation notes (max 500 chars)",
  "issues_found": "count of issues discovered (scout/analyst, empty for others)",
  "pass_rate": "test pass rate as decimal (empty for non-executor tasks)",
  "coverage_achieved": "actual coverage percentage (empty for non-executor tasks)",
  "test_files": "semicolon-separated paths of test files (empty for non-generator tasks)",
  "quality_score": "quality score 0-100 (analyst only, empty for others)",
  "error": ""
}
```

---

## Quality Requirements

All agents must verify before reporting complete:

| Requirement | Criteria |
|-------------|----------|
| Scan results written | Verify scan-results.json exists (scout) |
| Strategy written | Verify test-strategy.md exists (strategist) |
| Tests generated | Verify test files exist in correct layer dir (generator) |
| Syntax clean | No compilation errors in generated tests (generator) |
| Report written | Verify quality-report.md exists (analyst) |
| Findings accuracy | Findings reflect actual work done |
| Discovery sharing | At least 1 discovery shared to board |
| Error reporting | Non-empty error field if status is failed |

---

## Placeholder Reference

| Placeholder | Resolved By | When |
|-------------|------------|------|
| `<session-folder>` | Skill designer (Phase 1) | Literal path baked into instruction |
| `{id}` | spawn_agents_on_csv | Runtime from CSV row |
| `{title}` | spawn_agents_on_csv | Runtime from CSV row |
| `{description}` | spawn_agents_on_csv | Runtime from CSV row |
| `{role}` | spawn_agents_on_csv | Runtime from CSV row |
| `{perspective}` | spawn_agents_on_csv | Runtime from CSV row |
| `{layer}` | spawn_agents_on_csv | Runtime from CSV row |
| `{coverage_target}` | spawn_agents_on_csv | Runtime from CSV row |
| `{prev_context}` | spawn_agents_on_csv | Runtime from CSV row |

190
.codex/skills/team-quality-assurance/schemas/tasks-schema.md
Normal file
@@ -0,0 +1,190 @@
# Team Quality Assurance -- CSV Schema

## Master CSV: tasks.csv

### Column Definitions

#### Input Columns (Set by Decomposer)

| Column | Type | Required | Description | Example |
|--------|------|----------|-------------|---------|
| `id` | string | Yes | Unique task identifier (PREFIX-NNN) | `"SCOUT-001"` |
| `title` | string | Yes | Short task title | `"Multi-perspective code scan"` |
| `description` | string | Yes | Detailed task description (self-contained) | `"Scan codebase from multiple perspectives..."` |
| `role` | enum | Yes | Worker role: `scout`, `strategist`, `generator`, `executor`, `analyst` | `"scout"` |
| `perspective` | string | No | Scan perspectives (semicolon-separated, scout only) | `"bug;security;test-coverage;code-quality"` |
| `layer` | string | No | Test layer: `L1`, `L2`, `L3`, or empty | `"L1"` |
| `coverage_target` | string | No | Target coverage percentage for this layer | `"80"` |
| `deps` | string | No | Semicolon-separated dependency task IDs | `"SCOUT-001"` |
| `context_from` | string | No | Semicolon-separated task IDs for context | `"SCOUT-001"` |
| `exec_mode` | enum | Yes | Execution mechanism: `csv-wave` or `interactive` | `"csv-wave"` |

#### Computed Columns (Set by Wave Engine)

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `wave` | integer | Wave number (1-based, from topological sort) | `2` |
| `prev_context` | string | Aggregated findings from context_from tasks (per-wave CSV only) | `"[SCOUT-001] Found 5 security issues..."` |
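The `prev_context` aggregation can be sketched as follows, using the `[TASK-ID] findings` convention shown in the example column (illustrative Node.js; `buildPrevContext` is not a defined API):

```javascript
// Build prev_context for a task by collecting the findings of its
// context_from tasks from the master task list.
function buildPrevContext(task, allTasks) {
  const byId = new Map(allTasks.map(t => [t.id, t]))
  return (task.context_from || '')
    .split(';')
    .filter(Boolean)
    .map(id => {
      const src = byId.get(id)
      return src && src.findings ? `[${id}] ${src.findings}` : ''
    })
    .filter(Boolean)
    .join('\n')
}
```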
|
||||
|
||||
#### Output Columns (Set by Agent)
|
||||
|
||||
| Column | Type | Description | Example |
|
||||
|--------|------|-------------|---------|
|
||||
| `status` | enum | `pending` -> `completed` / `failed` / `skipped` | `"completed"` |
|
||||
| `findings` | string | Key discoveries (max 500 chars) | `"Found 3 critical security issues..."` |
|
||||
| `issues_found` | string | Count of issues discovered (scout/analyst) | `"5"` |
|
||||
| `pass_rate` | string | Test pass rate as decimal (executor only) | `"0.95"` |
|
||||
| `coverage_achieved` | string | Actual coverage percentage (executor only) | `"82"` |
|
||||
| `test_files` | string | Semicolon-separated test file paths (generator only) | `"tests/L1-unit/auth.test.ts"` |
|
||||
| `quality_score` | string | Quality score 0-100 (analyst only) | `"78"` |
|
||||
| `error` | string | Error message if failed | `""` |
|
||||
|
||||
---

### exec_mode Values

| Value | Mechanism | Description |
|-------|-----------|-------------|
| `csv-wave` | `spawn_agents_on_csv` | One-shot batch execution within wave |
| `interactive` | `spawn_agent`/`wait`/`send_input`/`close_agent` | Multi-round individual execution (executor fix cycles) |

Interactive tasks appear in the master CSV for dependency tracking but are NOT included in wave-{N}.csv files.
---

### Role Prefixes

| Role | Prefix | Responsibility Type |
|------|--------|---------------------|
| scout | SCOUT | read-only analysis (multi-perspective scan) |
| strategist | QASTRAT | read-only analysis (strategy formulation) |
| generator | QAGEN | code-gen (test file generation) |
| executor | QARUN | validation (test execution + fix cycles) |
| analyst | QAANA | read-only analysis (quality reporting) |
---

### Example Data

```csv
id,title,description,role,perspective,layer,coverage_target,deps,context_from,exec_mode,wave,status,findings,issues_found,pass_rate,coverage_achieved,test_files,quality_score,error
"SCOUT-001","Multi-perspective code scan","Scan codebase from bug, security, test-coverage, code-quality perspectives. Identify issues with severity ranking (critical/high/medium/low) and file:line references. Write scan results to <session>/scan/scan-results.json","scout","bug;security;test-coverage;code-quality","","","","","csv-wave","1","pending","","","","","","",""
"QASTRAT-001","Test strategy formulation","Analyze scout findings and code changes. Determine test layers (L1/L2/L3), define coverage targets, detect test framework, identify priority files. Write strategy to <session>/strategy/test-strategy.md","strategist","","","","SCOUT-001","SCOUT-001","csv-wave","2","pending","","","","","","",""
"QAGEN-L1-001","Generate L1 unit tests","Generate L1 unit tests based on strategy. Read source files, identify exports, generate test cases for happy path, edge cases, error handling. Follow project test conventions. Write tests to <session>/tests/L1-unit/","generator","","L1","80","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
"QAGEN-L2-001","Generate L2 integration tests","Generate L2 integration tests based on strategy. Focus on module interaction points and integration boundaries. Write tests to <session>/tests/L2-integration/","generator","","L2","60","QASTRAT-001","QASTRAT-001","csv-wave","3","pending","","","","","","",""
"QARUN-L1-001","Execute L1 tests and collect coverage","Run L1 test suite with coverage collection. Parse results for pass rate and coverage. If pass_rate < 0.95 or coverage < 80%, attempt auto-fix (max 3 iterations). Save results to <session>/results/run-L1.json","executor","","L1","80","QAGEN-L1-001","QAGEN-L1-001","interactive","4","pending","","","","","","",""
"QARUN-L2-001","Execute L2 tests and collect coverage","Run L2 integration test suite with coverage. Auto-fix up to 3 iterations. Save results to <session>/results/run-L2.json","executor","","L2","60","QAGEN-L2-001","QAGEN-L2-001","interactive","4","pending","","","","","","",""
"QAANA-001","Quality analysis report","Analyze defect patterns, coverage gaps, test effectiveness. Calculate quality score (0-100). Generate comprehensive report with recommendations. Write to <session>/analysis/quality-report.md","analyst","","","","QARUN-L1-001;QARUN-L2-001","QARUN-L1-001;QARUN-L2-001","csv-wave","5","pending","","","","","","",""
"SCOUT-002","Regression scan","Post-fix regression scan. Verify no new issues introduced by test fixes. Focus on areas modified during GC loops.","scout","bug;security;code-quality","","","QAANA-001","QAANA-001","csv-wave","6","pending","","","","","","",""
```
---

### Column Lifecycle

```
Decomposer (Phase 1)        Wave Engine (Phase 2)        Agent (Execution)
--------------------        ---------------------        -----------------
id              ----------> id              ----------> id
title           ----------> title           ----------> (reads)
description     ----------> description     ----------> (reads)
role            ----------> role            ----------> (reads)
perspective     ----------> perspective     ----------> (reads)
layer           ----------> layer           ----------> (reads)
coverage_target ----------> coverage_target ----------> (reads)
deps            ----------> deps            ----------> (reads)
context_from    ----------> context_from    ----------> (reads)
exec_mode       ----------> exec_mode       ----------> (reads)
                            wave            ----------> (reads)
                            prev_context    ----------> (reads)
                                                        status
                                                        findings
                                                        issues_found
                                                        pass_rate
                                                        coverage_achieved
                                                        test_files
                                                        quality_score
                                                        error
```
---

## Output Schema (JSON)

Agent output via `report_agent_job_result` (csv-wave tasks):

```json
{
  "id": "SCOUT-001",
  "status": "completed",
  "findings": "Multi-perspective scan found 5 issues: 2 security (hardcoded keys, missing auth), 1 bug (null reference), 2 code-quality (duplicated logic, high complexity). All issues logged to discoveries.ndjson.",
  "issues_found": "5",
  "pass_rate": "",
  "coverage_achieved": "",
  "test_files": "",
  "quality_score": "",
  "error": ""
}
```

Interactive tasks output via structured text or JSON written to `interactive/{id}-result.json`.
---
|
||||
|
||||
## Discovery Types
|
||||
|
||||
| Type | Dedup Key | Data Schema | Description |
|
||||
|------|-----------|-------------|-------------|
|
||||
| `issue_found` | `data.file+data.line` | `{file, line, severity, perspective, description}` | Issue discovered by scout |
|
||||
| `framework_detected` | `data.framework` | `{framework, config_file, test_pattern}` | Test framework identified |
|
||||
| `test_generated` | `data.file` | `{file, source_file, test_count}` | Test file created |
|
||||
| `defect_found` | `data.file+data.line` | `{file, line, pattern, description}` | Defect found during testing |
|
||||
| `coverage_gap` | `data.file` | `{file, current, target, gap}` | Coverage gap identified |
|
||||
| `convention_found` | `data.pattern` | `{pattern, example_file, description}` | Test convention detected |
|
||||
| `fix_applied` | `data.test_file+data.fix_type` | `{test_file, fix_type, description}` | Test fix during GC loop |
|
||||
| `quality_metric` | `data.dimension` | `{dimension, score, details}` | Quality dimension score |
|
||||
|
||||
### Discovery NDJSON Format
|
||||
|
||||
```jsonl
|
||||
{"ts":"2026-03-08T10:00:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/auth.ts","line":42,"severity":"high","perspective":"security","description":"Hardcoded secret key in auth module"}}
|
||||
{"ts":"2026-03-08T10:02:00Z","worker":"SCOUT-001","type":"issue_found","data":{"file":"src/user.ts","line":15,"severity":"medium","perspective":"bug","description":"Missing null check on user object"}}
|
||||
{"ts":"2026-03-08T10:05:00Z","worker":"QASTRAT-001","type":"framework_detected","data":{"framework":"vitest","config_file":"vitest.config.ts","test_pattern":"**/*.test.ts"}}
|
||||
{"ts":"2026-03-08T10:10:00Z","worker":"QAGEN-L1-001","type":"test_generated","data":{"file":"tests/L1-unit/auth.test.ts","source_file":"src/auth.ts","test_count":8}}
|
||||
{"ts":"2026-03-08T10:15:00Z","worker":"QARUN-L1-001","type":"defect_found","data":{"file":"src/auth.ts","line":42,"pattern":"null_reference","description":"Missing null check on token payload"}}
|
||||
{"ts":"2026-03-08T10:20:00Z","worker":"QAANA-001","type":"quality_metric","data":{"dimension":"coverage_achievement","score":85,"details":"L1: 82%, L2: 68%"}}
|
||||
```
|
||||
|
||||
> Both csv-wave and interactive agents read/write the same discoveries.ndjson file.
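Appending to the shared file with per-type deduplication can be sketched as follows. The dedup key fields follow the Discovery Types table above; only two types are shown, and the in-memory `seen` set is an assumption about how an agent might track duplicates within a session:

```python
import json

# Dedup key fields per discovery type (subset of the table above)
DEDUP_KEYS = {
    "issue_found": ("file", "line"),
    "coverage_gap": ("file",),
}

def append_discovery(path, record, seen):
    """Append record as one NDJSON line unless its dedup key was already seen."""
    fields = DEDUP_KEYS.get(record["type"], ())
    key = (record["type"],) + tuple(record["data"].get(f) for f in fields)
    if key in seen:
        return False  # duplicate; skip the write
    seen.add(key)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, separators=(",", ":")) + "\n")
    return True
```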
---

## Cross-Mechanism Context Flow

| Source | Target | Mechanism |
|--------|--------|-----------|
| Scout findings | Strategist prev_context | CSV context_from column |
| CSV task findings | Interactive task | Injected via spawn message |
| Interactive task result | CSV task prev_context | Read from interactive/{id}-result.json |
| Any agent discovery | Any agent | Shared via discoveries.ndjson |
| Executor coverage data | GC loop handler | Read from results/run-{layer}.json |
| Analyst quality score | Regression scout | Injected via prev_context |
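The "interactive result -> prev_context" row can be sketched as a small fold over the `context_from` IDs. The file layout and `findings` field follow the conventions above; the 500-character cap mirrors the `findings` limit and is otherwise an assumption:

```python
import json
import os

def build_prev_context(session_dir, context_from, limit=500):
    """Aggregate findings from earlier tasks into a prev_context string.

    context_from: semicolon-separated task IDs, as in the CSV schema.
    Missing result files (e.g. csv-wave tasks) are simply skipped here.
    """
    parts = []
    for tid in filter(None, context_from.split(";")):
        path = os.path.join(session_dir, "interactive", f"{tid}-result.json")
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                findings = json.load(fh).get("findings", "")
            parts.append(f"[{tid}] {findings}")
    # Keep prev_context bounded (assumed cap, mirroring the findings limit)
    return " | ".join(parts)[:limit]
```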
---

## Validation Rules

| Rule | Check | Error |
|------|-------|-------|
| Unique IDs | No duplicate `id` values | "Duplicate task ID: {id}" |
| Valid deps | All dep IDs exist in tasks | "Unknown dependency: {dep_id}" |
| No self-deps | Task cannot depend on itself | "Self-dependency: {id}" |
| No circular deps | Topological sort completes | "Circular dependency detected involving: {ids}" |
| context_from valid | All context IDs exist and are in earlier waves | "Invalid context_from: {id}" |
| exec_mode valid | Value is `csv-wave` or `interactive` | "Invalid exec_mode: {value}" |
| Description non-empty | Every task has a description | "Empty description for task: {id}" |
| Status enum | status in {pending, completed, failed, skipped} | "Invalid status: {status}" |
| Role valid | role in {scout, strategist, generator, executor, analyst} | "Invalid role: {role}" |
| Layer valid | layer in {L1, L2, L3, ""} | "Invalid layer: {layer}" |
| Perspective valid | If scout, perspective contains valid values | "Invalid perspective: {value}" |
| Coverage target valid | If layer present, coverage_target is numeric | "Invalid coverage target: {value}" |
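The first four rules above can be sketched as one pass over the task list (an illustrative helper, not the skill's actual validator):

```python
def validate_tasks(tasks):
    """Check unique IDs, known deps, self-deps, and cycles; return error strings."""
    errors = []
    ids = [t["id"] for t in tasks]
    for dup in sorted({i for i in ids if ids.count(i) > 1}):
        errors.append(f"Duplicate task ID: {dup}")
    known = set(ids)
    deps = {t["id"]: set(filter(None, t["deps"].split(";"))) for t in tasks}
    for tid, ds in deps.items():
        for d in sorted(ds - known):
            errors.append(f"Unknown dependency: {d}")
        if tid in ds:
            errors.append(f"Self-dependency: {tid}")
    # Cycle check: repeatedly peel off tasks whose known deps are all resolved;
    # anything left over participates in a cycle
    resolved = set()
    pending = {tid: (ds & known) - {tid} for tid, ds in deps.items()}
    changed = True
    while changed:
        changed = False
        for tid in [t for t, ds in pending.items() if ds <= resolved]:
            resolved.add(tid)
            del pending[tid]
            changed = True
    if pending:
        errors.append("Circular dependency detected involving: "
                      + ";".join(sorted(pending)))
    return errors
```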