--- name: test-cycle-execute description: Execute test-fix workflow with dynamic task generation and iterative fix cycles until test pass rate >= 95% or max iterations reached. Uses @cli-planning-agent for failure analysis and task generation. argument-hint: "[--resume-session=\"session-id\"] [--max-iterations=N]" allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*), Task(*) --- # Workflow Test-Cycle-Execute Command ## Architecture & Philosophy **Core Concept**: Dynamic test-fix orchestrator with adaptive task generation based on runtime analysis. **Quality Gate**: Iterates until test pass rate >= 95% (with criticality assessment) or 100% (full pass). **Key Differences from Standard Execute**: - **Standard**: Pre-defined task queue → Sequential execution → Complete - **Test-Cycle**: Initial tasks → Test → Analyze → Generate fix tasks → Execute → Re-test → Repeat **Orchestrator Boundary (CRITICAL)**: - This command is the **ONLY place** where test failures are handled - All failure analysis delegated to **@cli-planning-agent** (Gemini/Qwen/Codex) - All fixes applied by **@test-fix-agent** - Orchestrator manages: pass rate calculation, criticality assessment, iteration loop, strategy selection **Agent Coordination**: - **Orchestrator**: Loop control, strategy selection, pass rate calculation, threshold decisions - **@cli-planning-agent**: CLI analysis (Gemini/Qwen/Codex), root cause extraction, fix task generation - **@test-fix-agent**: Test execution, code fixes, result reporting **Resume Mode**: `--resume-session` flag skips discovery and continues from interruption point. ## Enhanced Features ### 1. Intelligent Strategy Engine **Strategy Types**: - **Conservative** (default): Single fix per iteration, full validation - **Aggressive**: Batch fix similar failures (when pass rate >80% + high pattern similarity) - **Exploratory**: Parallel multi-strategy attempt (when stuck on same failures 3+ times) - **Surgical**: Minimal changes, rollback on regression **Strategy Selection Logic**: ```javascript // In orchestrator, before invoking @cli-planning-agent function selectStrategy(context) { const { passRate, iteration, stuckTests, regressionDetected } = context; if (iteration <= 2) return "conservative"; if (passRate > 80 && context.failurePattern.similarity > 0.7) return "aggressive"; if (stuckTests > 0 && iteration >= 3) return "exploratory"; if (regressionDetected) return "surgical"; return "conservative"; } ``` **Integration**: Orchestrator passes selected strategy to @cli-planning-agent in prompt. ### 2. Progressive Testing **Concept**: Run affected tests during iterations, full suite on final validation. **Affected Test Detection**: - @cli-planning-agent analyzes modified files from fix strategy - Maps to test files via imports and integration patterns - Returns `affected_tests[]` in fix task JSON **Execution**: - **Iteration N**: `npm test -- ${affected_tests.join(' ')}` (fast feedback) - **Final validation**: `npm test` (comprehensive check) **Benefits**: 70-90% iteration speed improvement. **Task JSON Integration**: ```json { "context": { "fix_strategy": { "test_execution": { "mode": "affected_only", "affected_tests": ["tests/auth/client.test.ts"], "fallback_to_full": true } } } } ``` ### 3. Parallel Strategy Exploration **Trigger**: Iteration >= 3 AND stuck tests detected (same test failed 3+ consecutive times) **Workflow**: ``` Stuck State Detected → Switch to "exploratory" strategy: ├── @cli-planning-agent generates 3 alternative fix approaches ├── Orchestrator creates 3 task JSONs: IMPL-fix-Na, IMPL-fix-Nb, IMPL-fix-Nc ├── Execute in parallel via git worktree: │ ├── worktree-a/ → Apply fix-Na → Test → Score: 92% │ ├── worktree-b/ → Apply fix-Nb → Test → Score: 88% │ └── worktree-c/ → Apply fix-Nc → Test → Score: 95% ✓ └── Select best result, apply to main branch ``` **Implementation**: ```bash # Orchestrator creates worktrees git worktree add ../worktree-a git worktree add ../worktree-b git worktree add ../worktree-c # Launch 3 @test-fix-agent instances in parallel Task(subagent_type="test-fix-agent", prompt="...", model="haiku") x3 # Collect results, select winner by pass_rate ``` **Task JSON**: ```json { "id": "IMPL-fix-3-parallel", "meta": { "type": "parallel-exploration", "strategies": ["IMPL-fix-3a", "IMPL-fix-3b", "IMPL-fix-3c"], "selection_criteria": "highest_pass_rate" } } ``` ## Core Responsibilities **Orchestrator**: - Session discovery and task queue management - Pass rate calculation: `(passed / total) * 100` from test-results.json - Criticality assessment from test-results.json - Strategy selection (conservative/aggressive/exploratory/surgical) - Stuck test detection (same test failed 3+ times) - Regression detection (pass rate dropped >10%) - Iteration control (max 10 iterations, default) - CLI tool selection (Gemini → Qwen → Codex fallback chain) - Failure analysis delegation to @cli-planning-agent - TodoWrite progress tracking - Session auto-complete when pass rate >= 95% **@cli-planning-agent**: - Execute CLI analysis (Gemini/Qwen/Codex) with bug diagnosis template - Parse CLI output for root causes and fix strategy - Generate IMPL-fix-N.json with structured task definition - Detect affected tests for progressive testing - Save analysis reports and CLI output **@test-fix-agent**: - Execute tests and save results to test-results.json - Apply fixes from task JSON fix_strategy - Assign criticality to test failures - Update task status ## Execution Flow ### Phase 1: Discovery & Initialization 1. Detect test-fix session from `workflow_type: "test_session"` 2. Load workflow-session.json, IMPL_PLAN.md, .task/*.json 3. Initialize TodoWrite with initial tasks 4. Setup iteration context (current=0, max=10, strategy="conservative") **Resume Mode**: Load iteration-state.json, continue from `next_action`. ### Phase 2: Task Execution Loop ```javascript For each task in queue: 1. Load task JSON and context 2. Determine task type (test-gen, test-fix, fix-iteration, parallel-exploration) 3. Execute via appropriate agent 4. Collect results from test-results.json 5. Calculate pass rate: (passed / total) * 100 6. Assess criticality from test-results.json 7. Detect regression: if passRate < previousPassRate - 10% → REGRESSION 8. Detect stuck tests: if same test failed 3+ consecutive iterations → STUCK 9. Make threshold decision: IF passRate === 100%: → SUCCESS: Mark complete, continue to next task ELSE IF passRate >= 95%: → CHECK criticality: - All "low" → PARTIAL SUCCESS (auto-approve with note) - Any "high"/"medium" → Enter fix loop (step 10) ELSE IF passRate < 95%: → FAILED: Enter fix loop (step 10) 10. Fix Loop (if needed): a. Select strategy: conservative/aggressive/exploratory/surgical b. If strategy === "exploratory": - Invoke @cli-planning-agent for 3 alternative approaches - Create parallel tasks (IMPL-fix-Na, Nb, Nc) - Execute in parallel via git worktree - Select best result (highest pass_rate) - Apply to main branch c. Else: - Invoke @cli-planning-agent with selected strategy - Generate single IMPL-fix-N.json - Insert at queue front d. Continue loop 11. Check max iterations (abort if exceeded) ``` ### Phase 3: Iteration Cycle (Per Fix Task) **Structure**: ``` Iteration N: ├── 1. Test Execution (via @test-fix-agent) │ ├── Mode: affected_only or full_suite (from task JSON) │ ├── Save results: test-results.json, test-output.log │ └── Report to orchestrator ├── 2. Pass Rate Validation (orchestrator) │ ├── Calculate: (passed / total) * 100 │ ├── Assess criticality │ ├── Detect regression/stuck │ └── Select strategy ├── 3. Failure Analysis (via @cli-planning-agent) │ ├── Input: failure context + selected strategy │ ├── CLI tool: Gemini (fallback: Qwen → Codex) │ ├── Output: IMPL-fix-N.json, iteration-N-analysis.md, cli-output.txt │ └── Return task ID to orchestrator └── 4. Fix Execution (via @test-fix-agent) ├── Load fix_strategy from task JSON ├── Apply changes (surgical fixes) └── Return to step 1 for re-test ``` ### Phase 4: Completion **Success Conditions**: - All tasks completed - Pass rate === 100% (full success) **Partial Success Conditions**: - All tasks completed - Pass rate >= 95% and < 100% - All failures are "low" criticality - **Action**: Auto-approve with review note **Completion Steps**: 1. Final validation: Run full test suite 2. Calculate final pass rate 3. Assess status (full/partial/failure) 4. Update session state 5. Generate summary with pass rate metrics 6. Call `/workflow:session:complete` **Failure Conditions**: - Max iterations (10) reached without 95% pass rate - Pass rate < 95% after max iterations - Unrecoverable errors **Failure Handling**: 1. Document final state with pass rate 2. Generate failure report (remaining failures, iteration history, recommendations) 3. Mark tasks as blocked 4. Return control to user ## Agent Coordination Details ### @cli-planning-agent Invocation **CLI Tool Selection** (fallback chain): 1. **Gemini** (primary): `gemini-2.5-pro` 2. **Qwen** (fallback): `coder-model` 3. **Codex** (fallback): `gpt-5.1-codex` **Template**: `~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt` **Timeout**: 40min (2400000ms) **Prompt Structure**: ```javascript Task( subagent_type="cli-planning-agent", description=`Analyze test failures (iteration ${N}) - ${strategy} strategy`, prompt=` ## Task Objective Analyze test failures and generate fix task JSON for iteration ${N} ## Strategy ${selectedStrategy} - ${strategyDescription} ## MANDATORY FIRST STEPS 1. Read test results: ${session.test_results_path} 2. Read test output: ${session.test_output_path} 3. Read iteration state: ${session.iteration_state_path} ## Session Paths - Workflow Dir: ${session.workflow_dir} - Test Results: ${session.test_results_path} - Task Output Dir: ${session.task_dir} - Analysis Output: ${session.process_dir}/iteration-${N}-analysis.md ## Context Metadata - Session ID: ${sessionId} - Current Iteration: ${N} - Max Iterations: 10 - Current Pass Rate: ${passRate}% - Selected Strategy: ${selectedStrategy} - Stuck Tests: ${stuckTests} ## CLI Configuration - Tool Priority: gemini → qwen → codex - Template: 01-diagnose-bug-root-cause.txt - Timeout: 2400000ms ## Expected Deliverables 1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json - Must include: fix_strategy.test_execution.affected_tests[] - Must include: fix_strategy.confidence_score 2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md 3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt 4. Return task ID to orchestrator ## Strategy-Specific Requirements ${strategyRequirements[selectedStrategy]} ## Success Criteria - Valid IMPL-fix-${N}.json with all required fields - Concrete fix strategy with modification points (file:function:lines) - Affected tests list for progressive testing - Root cause analysis (not just symptoms) ` ) ``` **Strategy-Specific Requirements**: ```javascript const strategyRequirements = { conservative: "Single targeted fix, high confidence required", aggressive: "Batch fix similar failures, pattern-based approach", exploratory: "Generate 3 alternative approaches with different root cause hypotheses", surgical: "Minimal changes, focus on rollback safety" }; ``` ### @test-fix-agent Context Loading **File Loading Sequence**: 1. Task JSON (`task_json_path`) 2. Iteration state (`iteration_state_path`) 3. Test results (`test_results_path`) 4. Test output (`test_output_path`) 5. Analysis report (`analysis_path`) 6. Fix history (`fix_history_path`) **Paths Provided by Orchestrator**: ```javascript { "task_json_path": ".workflow/active/WFS-test-{session}/.task/IMPL-fix-N.json", "iteration_state_path": ".workflow/active/WFS-test-{session}/.process/iteration-state.json", "test_results_path": ".workflow/active/WFS-test-{session}/.process/test-results.json", "test_output_path": ".workflow/active/WFS-test-{session}/.process/test-output.log", "analysis_path": ".workflow/active/WFS-test-{session}/.process/iteration-N-analysis.md", "workflow_dir": ".workflow/active/WFS-test-{session}/", "session_id": "WFS-test-{session}", "current_iteration": N, "max_iterations": 10 } ``` ## Data Structures ### Session File Structure ``` .workflow/active/WFS-test-{session}/ ├── workflow-session.json ├── IMPL_PLAN.md, TODO_LIST.md ├── .task/ │ ├── IMPL-{001,002}.json # Initial tasks │ ├── IMPL-fix-{N}.json # Generated fix tasks │ └── IMPL-fix-{N}-parallel.json # Parallel exploration tasks ├── .process/ │ ├── iteration-state.json # Current iteration + strategy │ ├── test-results.json # Latest test results │ ├── test-output.log # Full test output │ ├── fix-history.json # All fix attempts │ ├── iteration-{N}-analysis.md # CLI analysis │ └── iteration-{N}-cli-output.txt # Raw CLI output └── .summaries/iteration-summaries/ ``` ### Iteration State JSON ```json { "session_id": "WFS-test-user-auth", "current_task": "IMPL-002", "current_iteration": 3, "max_iterations": 10, "selected_strategy": "aggressive", "iterations": [ { "iteration": 1, "pass_rate": 70, "strategy": "conservative", "stuck_tests": [], "regression_detected": false }, { "iteration": 2, "pass_rate": 82, "strategy": "conservative", "stuck_tests": [], "regression_detected": false }, { "iteration": 3, "pass_rate": 89, "strategy": "aggressive", "stuck_tests": ["test_auth_edge_case"], "regression_detected": false } ], "stuck_tests": ["test_auth_edge_case"], "next_action": "execute_fix_task" } ``` ### Task JSON Template (Fix Iteration) ```json { "id": "IMPL-fix-3", "title": "Fix test failures - Iteration 3 (aggressive)", "status": "pending", "meta": { "type": "test-fix-iteration", "agent": "@test-fix-agent", "iteration": 3, "strategy": "aggressive", "max_iterations": 10 }, "context": { "requirements": ["Fix identified test failures", "Address root causes"], "failure_context": { "failed_tests": ["test1", "test2"], "error_messages": ["error1", "error2"], "pass_rate": 89, "stuck_tests": ["test_auth_edge_case"] }, "fix_strategy": { "approach": "Batch fix authentication validation across multiple endpoints", "modification_points": ["src/auth/middleware.ts:validateToken:45-60"], "confidence_score": 0.85, "test_execution": { "mode": "affected_only", "affected_tests": ["tests/auth/*.test.ts"], "fallback_to_full": true } } } } ``` ## Error Handling | Scenario | Action | |----------|--------| | Test execution error | Log error, retry with error context | | CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual | | Agent execution error | Save state, retry with simplified context | | Max iterations reached | Generate failure report, mark blocked | | Regression detected | Rollback last fix, switch to surgical strategy | | Stuck tests detected | Switch to exploratory strategy (parallel) | **CLI Fallback Chain**: 1. Try Gemini (primary) 2. If HTTP 429 or error: Try Qwen 3. If Qwen fails: Try Codex 4. If all fail: Mark as degraded, use basic pattern matching ## TodoWrite Coordination **Structure**: ```javascript TodoWrite({ todos: [ { content: "Execute IMPL-001: Generate tests [code-developer]", status: "completed", activeForm: "Executing test generation" }, { content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]", status: "in_progress", activeForm: "Running test-fix iteration cycle" }, { content: " → Iteration 1: Initial test (pass: 70%, conservative)", status: "completed", activeForm: "Running initial tests" }, { content: " → Iteration 2: Fix validation (pass: 82%, conservative)", status: "completed", activeForm: "Fixing validation issues" }, { content: " → Iteration 3: Batch fix auth (pass: 89%, aggressive)", status: "in_progress", activeForm: "Fixing authentication issues" } ] }); ``` **Update Rules**: 1. Add iteration item with strategy and pass rate 2. Mark completed after each iteration 3. Update parent task when all complete ## Usage ```bash # Execute test-fix workflow (discovers active session) /workflow:test-cycle-execute # Resume interrupted session /workflow:test-cycle-execute --resume-session="WFS-test-user-auth" # Custom iteration limit /workflow:test-cycle-execute --max-iterations=15 ``` ## Best Practices 1. **Iteration Limits**: Default 10, adjust based on complexity 2. **Commit Between Iterations**: Enables easier rollback 3. **Monitor Iteration Logs**: Review CLI analysis quality 4. **Trust Strategy Engine**: Let orchestrator select optimal approach 5. **Leverage Progressive Testing**: 70-90% faster iterations 6. **Watch for Stuck Tests**: Parallel exploration auto-triggers at iteration 3