Optimizations: - Reduced file size: 791 → 540 lines (-32%) - Merged redundant sections (Overview + Philosophy + Responsibility Matrix) - Simplified JSON examples and execution flow descriptions Enhancements: 1. Intelligent Strategy Engine - 4 strategies: conservative/aggressive/exploratory/surgical - Auto-selection based on pass rate, stuck tests, regression 2. Progressive Testing - Run affected tests during iterations (70-90% faster) - Full suite on final validation 3. Parallel Strategy Exploration - Trigger on stuck tests (3+ consecutive failures) - Git worktree-based parallel execution - Select best result by pass rate Changes: - Max iterations: 5 → 10 - CLI tools: Gemini/Qwen → Gemini/Qwen/Codex (fallback chain) - Add strategy metadata to iteration-state.json - Add test_execution config to fix task JSON Impact: - Faster iterations via progressive testing - Better stuck situation handling via parallel exploration - Adaptive strategy selection for complex scenarios
18 KiB
name, description, argument-hint, allowed-tools
| name | description | argument-hint | allowed-tools |
|---|---|---|---|
| test-cycle-execute | Execute test-fix workflow with dynamic task generation and iterative fix cycles until test pass rate >= 95% or max iterations reached. Uses @cli-planning-agent for failure analysis and task generation. | [--resume-session="session-id"] [--max-iterations=N] | SlashCommand(*), TodoWrite(*), Read(*), Bash(*), Task(*) |
Workflow Test-Cycle-Execute Command
Architecture & Philosophy
Core Concept: Dynamic test-fix orchestrator with adaptive task generation based on runtime analysis.
Quality Gate: Iterates until test pass rate >= 95% (with criticality assessment) or 100% (full pass).
Key Differences from Standard Execute:
- Standard: Pre-defined task queue → Sequential execution → Complete
- Test-Cycle: Initial tasks → Test → Analyze → Generate fix tasks → Execute → Re-test → Repeat
Orchestrator Boundary (CRITICAL):
- This command is the ONLY place where test failures are handled
- All failure analysis delegated to @cli-planning-agent (Gemini/Qwen/Codex)
- All fixes applied by @test-fix-agent
- Orchestrator manages: pass rate calculation, criticality assessment, iteration loop, strategy selection
Agent Coordination:
- Orchestrator: Loop control, strategy selection, pass rate calculation, threshold decisions
- @cli-planning-agent: CLI analysis (Gemini/Qwen/Codex), root cause extraction, fix task generation
- @test-fix-agent: Test execution, code fixes, result reporting
Resume Mode: --resume-session flag skips discovery and continues from interruption point.
Enhanced Features
1. Intelligent Strategy Engine
Strategy Types:
- Conservative (default): Single fix per iteration, full validation
- Aggressive: Batch fix similar failures (when pass rate >80% + high pattern similarity)
- Exploratory: Parallel multi-strategy attempt (when stuck on same failures 3+ times)
- Surgical: Minimal changes, rollback on regression
Strategy Selection Logic:
// In orchestrator, before invoking @cli-planning-agent
function selectStrategy(context) {
const { passRate, iteration, stuckTests, regressionDetected } = context;
if (iteration <= 2) return "conservative";
if (passRate > 80 && context.failurePattern.similarity > 0.7) return "aggressive";
if (stuckTests > 0 && iteration >= 3) return "exploratory";
if (regressionDetected) return "surgical";
return "conservative";
}
Integration: Orchestrator passes selected strategy to @cli-planning-agent in prompt.
2. Progressive Testing
Concept: Run affected tests during iterations, full suite on final validation.
Affected Test Detection:
- @cli-planning-agent analyzes modified files from fix strategy
- Maps to test files via imports and integration patterns
- Returns
affected_tests[]in fix task JSON
Execution:
- Iteration N:
npm test -- ${affected_tests.join(' ')}(fast feedback) - Final validation:
npm test(comprehensive check)
Benefits: 70-90% iteration speed improvement.
Task JSON Integration:
{
"context": {
"fix_strategy": {
"test_execution": {
"mode": "affected_only",
"affected_tests": ["tests/auth/client.test.ts"],
"fallback_to_full": true
}
}
}
}
3. Parallel Strategy Exploration
Trigger: Iteration >= 3 AND stuck tests detected (same test failed 3+ consecutive times)
Workflow:
Stuck State Detected → Switch to "exploratory" strategy:
├── @cli-planning-agent generates 3 alternative fix approaches
├── Orchestrator creates 3 task JSONs: IMPL-fix-Na, IMPL-fix-Nb, IMPL-fix-Nc
├── Execute in parallel via git worktree:
│ ├── worktree-a/ → Apply fix-Na → Test → Score: 92%
│ ├── worktree-b/ → Apply fix-Nb → Test → Score: 88%
│ └── worktree-c/ → Apply fix-Nc → Test → Score: 95% ✓
└── Select best result, apply to main branch
Implementation:
# Orchestrator creates worktrees
git worktree add ../worktree-a
git worktree add ../worktree-b
git worktree add ../worktree-c
# Launch 3 @test-fix-agent instances in parallel
Task(subagent_type="test-fix-agent", prompt="...", model="haiku") x3
# Collect results, select winner by pass_rate
Task JSON:
{
"id": "IMPL-fix-3-parallel",
"meta": {
"type": "parallel-exploration",
"strategies": ["IMPL-fix-3a", "IMPL-fix-3b", "IMPL-fix-3c"],
"selection_criteria": "highest_pass_rate"
}
}
Core Responsibilities
Orchestrator:
- Session discovery and task queue management
- Pass rate calculation:
(passed / total) * 100from test-results.json - Criticality assessment from test-results.json
- Strategy selection (conservative/aggressive/exploratory/surgical)
- Stuck test detection (same test failed 3+ times)
- Regression detection (pass rate dropped >10%)
- Iteration control (max 10 iterations, default)
- CLI tool selection (Gemini → Qwen → Codex fallback chain)
- Failure analysis delegation to @cli-planning-agent
- TodoWrite progress tracking
- Session auto-complete when pass rate >= 95%
@cli-planning-agent:
- Execute CLI analysis (Gemini/Qwen/Codex) with bug diagnosis template
- Parse CLI output for root causes and fix strategy
- Generate IMPL-fix-N.json with structured task definition
- Detect affected tests for progressive testing
- Save analysis reports and CLI output
@test-fix-agent:
- Execute tests and save results to test-results.json
- Apply fixes from task JSON fix_strategy
- Assign criticality to test failures
- Update task status
Execution Flow
Phase 1: Discovery & Initialization
- Detect test-fix session from
workflow_type: "test_session" - Load workflow-session.json, IMPL_PLAN.md, .task/*.json
- Initialize TodoWrite with initial tasks
- Setup iteration context (current=0, max=10, strategy="conservative")
Resume Mode: Load iteration-state.json, continue from next_action.
Phase 2: Task Execution Loop
For each task in queue:
1. Load task JSON and context
2. Determine task type (test-gen, test-fix, fix-iteration, parallel-exploration)
3. Execute via appropriate agent
4. Collect results from test-results.json
5. Calculate pass rate: (passed / total) * 100
6. Assess criticality from test-results.json
7. Detect regression: if passRate < previousPassRate - 10% → REGRESSION
8. Detect stuck tests: if same test failed 3+ consecutive iterations → STUCK
9. Make threshold decision:
IF passRate === 100%:
→ SUCCESS: Mark complete, continue to next task
ELSE IF passRate >= 95%:
→ CHECK criticality:
- All "low" → PARTIAL SUCCESS (auto-approve with note)
- Any "high"/"medium" → Enter fix loop (step 10)
ELSE IF passRate < 95%:
→ FAILED: Enter fix loop (step 10)
10. Fix Loop (if needed):
a. Select strategy: conservative/aggressive/exploratory/surgical
b. If strategy === "exploratory":
- Invoke @cli-planning-agent for 3 alternative approaches
- Create parallel tasks (IMPL-fix-Na, Nb, Nc)
- Execute in parallel via git worktree
- Select best result (highest pass_rate)
- Apply to main branch
c. Else:
- Invoke @cli-planning-agent with selected strategy
- Generate single IMPL-fix-N.json
- Insert at queue front
d. Continue loop
11. Check max iterations (abort if exceeded)
Phase 3: Iteration Cycle (Per Fix Task)
Structure:
Iteration N:
├── 1. Test Execution (via @test-fix-agent)
│ ├── Mode: affected_only or full_suite (from task JSON)
│ ├── Save results: test-results.json, test-output.log
│ └── Report to orchestrator
├── 2. Pass Rate Validation (orchestrator)
│ ├── Calculate: (passed / total) * 100
│ ├── Assess criticality
│ ├── Detect regression/stuck
│ └── Select strategy
├── 3. Failure Analysis (via @cli-planning-agent)
│ ├── Input: failure context + selected strategy
│ ├── CLI tool: Gemini (fallback: Qwen → Codex)
│ ├── Output: IMPL-fix-N.json, iteration-N-analysis.md, cli-output.txt
│ └── Return task ID to orchestrator
└── 4. Fix Execution (via @test-fix-agent)
├── Load fix_strategy from task JSON
├── Apply changes (surgical fixes)
└── Return to step 1 for re-test
Phase 4: Completion
Success Conditions:
- All tasks completed
- Pass rate === 100% (full success)
Partial Success Conditions:
- All tasks completed
- Pass rate >= 95% and < 100%
- All failures are "low" criticality
- Action: Auto-approve with review note
Completion Steps:
- Final validation: Run full test suite
- Calculate final pass rate
- Assess status (full/partial/failure)
- Update session state
- Generate summary with pass rate metrics
- Call
/workflow:session:complete
Failure Conditions:
- Max iterations (10) reached without 95% pass rate
- Pass rate < 95% after max iterations
- Unrecoverable errors
Failure Handling:
- Document final state with pass rate
- Generate failure report (remaining failures, iteration history, recommendations)
- Mark tasks as blocked
- Return control to user
Agent Coordination Details
@cli-planning-agent Invocation
CLI Tool Selection (fallback chain):
- Gemini (primary):
gemini-2.5-pro - Qwen (fallback):
coder-model - Codex (fallback):
gpt-5.1-codex
Template: ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt
Timeout: 40min (2400000ms)
Prompt Structure:
Task(
subagent_type="cli-planning-agent",
description=`Analyze test failures (iteration ${N}) - ${strategy} strategy`,
prompt=`
## Task Objective
Analyze test failures and generate fix task JSON for iteration ${N}
## Strategy
${selectedStrategy} - ${strategyDescription}
## MANDATORY FIRST STEPS
1. Read test results: ${session.test_results_path}
2. Read test output: ${session.test_output_path}
3. Read iteration state: ${session.iteration_state_path}
## Session Paths
- Workflow Dir: ${session.workflow_dir}
- Test Results: ${session.test_results_path}
- Task Output Dir: ${session.task_dir}
- Analysis Output: ${session.process_dir}/iteration-${N}-analysis.md
## Context Metadata
- Session ID: ${sessionId}
- Current Iteration: ${N}
- Max Iterations: 10
- Current Pass Rate: ${passRate}%
- Selected Strategy: ${selectedStrategy}
- Stuck Tests: ${stuckTests}
## CLI Configuration
- Tool Priority: gemini → qwen → codex
- Template: 01-diagnose-bug-root-cause.txt
- Timeout: 2400000ms
## Expected Deliverables
1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
- Must include: fix_strategy.test_execution.affected_tests[]
- Must include: fix_strategy.confidence_score
2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt
4. Return task ID to orchestrator
## Strategy-Specific Requirements
${strategyRequirements[selectedStrategy]}
## Success Criteria
- Valid IMPL-fix-${N}.json with all required fields
- Concrete fix strategy with modification points (file:function:lines)
- Affected tests list for progressive testing
- Root cause analysis (not just symptoms)
`
)
Strategy-Specific Requirements:
const strategyRequirements = {
conservative: "Single targeted fix, high confidence required",
aggressive: "Batch fix similar failures, pattern-based approach",
exploratory: "Generate 3 alternative approaches with different root cause hypotheses",
surgical: "Minimal changes, focus on rollback safety"
};
@test-fix-agent Context Loading
File Loading Sequence:
- Task JSON (
task_json_path) - Iteration state (
iteration_state_path) - Test results (
test_results_path) - Test output (
test_output_path) - Analysis report (
analysis_path) - Fix history (
fix_history_path)
Paths Provided by Orchestrator:
{
"task_json_path": ".workflow/active/WFS-test-{session}/.task/IMPL-fix-N.json",
"iteration_state_path": ".workflow/active/WFS-test-{session}/.process/iteration-state.json",
"test_results_path": ".workflow/active/WFS-test-{session}/.process/test-results.json",
"test_output_path": ".workflow/active/WFS-test-{session}/.process/test-output.log",
"analysis_path": ".workflow/active/WFS-test-{session}/.process/iteration-N-analysis.md",
"workflow_dir": ".workflow/active/WFS-test-{session}/",
"session_id": "WFS-test-{session}",
"current_iteration": N,
"max_iterations": 10
}
Data Structures
Session File Structure
.workflow/active/WFS-test-{session}/
├── workflow-session.json
├── IMPL_PLAN.md, TODO_LIST.md
├── .task/
│ ├── IMPL-{001,002}.json # Initial tasks
│ ├── IMPL-fix-{N}.json # Generated fix tasks
│ └── IMPL-fix-{N}-parallel.json # Parallel exploration tasks
├── .process/
│ ├── iteration-state.json # Current iteration + strategy
│ ├── test-results.json # Latest test results
│ ├── test-output.log # Full test output
│ ├── fix-history.json # All fix attempts
│ ├── iteration-{N}-analysis.md # CLI analysis
│ └── iteration-{N}-cli-output.txt # Raw CLI output
└── .summaries/iteration-summaries/
Iteration State JSON
{
"session_id": "WFS-test-user-auth",
"current_task": "IMPL-002",
"current_iteration": 3,
"max_iterations": 10,
"selected_strategy": "aggressive",
"iterations": [
{
"iteration": 1,
"pass_rate": 70,
"strategy": "conservative",
"stuck_tests": [],
"regression_detected": false
},
{
"iteration": 2,
"pass_rate": 82,
"strategy": "conservative",
"stuck_tests": [],
"regression_detected": false
},
{
"iteration": 3,
"pass_rate": 89,
"strategy": "aggressive",
"stuck_tests": ["test_auth_edge_case"],
"regression_detected": false
}
],
"stuck_tests": ["test_auth_edge_case"],
"next_action": "execute_fix_task"
}
Task JSON Template (Fix Iteration)
{
"id": "IMPL-fix-3",
"title": "Fix test failures - Iteration 3 (aggressive)",
"status": "pending",
"meta": {
"type": "test-fix-iteration",
"agent": "@test-fix-agent",
"iteration": 3,
"strategy": "aggressive",
"max_iterations": 10
},
"context": {
"requirements": ["Fix identified test failures", "Address root causes"],
"failure_context": {
"failed_tests": ["test1", "test2"],
"error_messages": ["error1", "error2"],
"pass_rate": 89,
"stuck_tests": ["test_auth_edge_case"]
},
"fix_strategy": {
"approach": "Batch fix authentication validation across multiple endpoints",
"modification_points": ["src/auth/middleware.ts:validateToken:45-60"],
"confidence_score": 0.85,
"test_execution": {
"mode": "affected_only",
"affected_tests": ["tests/auth/*.test.ts"],
"fallback_to_full": true
}
}
}
}
Error Handling
| Scenario | Action |
|---|---|
| Test execution error | Log error, retry with error context |
| CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
| Agent execution error | Save state, retry with simplified context |
| Max iterations reached | Generate failure report, mark blocked |
| Regression detected | Rollback last fix, switch to surgical strategy |
| Stuck tests detected | Switch to exploratory strategy (parallel) |
CLI Fallback Chain:
- Try Gemini (primary)
- If HTTP 429 or error: Try Qwen
- If Qwen fails: Try Codex
- If all fail: Mark as degraded, use basic pattern matching
TodoWrite Coordination
Structure:
TodoWrite({
todos: [
{
content: "Execute IMPL-001: Generate tests [code-developer]",
status: "completed",
activeForm: "Executing test generation"
},
{
content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]",
status: "in_progress",
activeForm: "Running test-fix iteration cycle"
},
{
content: " → Iteration 1: Initial test (pass: 70%, conservative)",
status: "completed",
activeForm: "Running initial tests"
},
{
content: " → Iteration 2: Fix validation (pass: 82%, conservative)",
status: "completed",
activeForm: "Fixing validation issues"
},
{
content: " → Iteration 3: Batch fix auth (pass: 89%, aggressive)",
status: "in_progress",
activeForm: "Fixing authentication issues"
}
]
});
Update Rules:
- Add iteration item with strategy and pass rate
- Mark completed after each iteration
- Update parent task when all complete
Usage
# Execute test-fix workflow (discovers active session)
/workflow:test-cycle-execute
# Resume interrupted session
/workflow:test-cycle-execute --resume-session="WFS-test-user-auth"
# Custom iteration limit
/workflow:test-cycle-execute --max-iterations=15
Best Practices
- Iteration Limits: Default 10, adjust based on complexity
- Commit Between Iterations: Enables easier rollback
- Monitor Iteration Logs: Review CLI analysis quality
- Trust Strategy Engine: Let orchestrator select optimal approach
- Leverage Progressive Testing: 70-90% faster iterations
- Watch for Stuck Tests: Parallel exploration auto-triggers at iteration 3