# Phase 2: Test-Cycle Execution

Dynamic test-fix execution with **adaptive task generation** based on runtime analysis. Fix cycles iterate until the test pass rate reaches >= 95% or the maximum number of iterations is hit. All agent interactions use the spawn_agent → wait → close_agent lifecycle.

**vs Standard Execute**:

- **Standard**: Pre-defined tasks → Execute sequentially → Done
- **Test-Cycle**: Initial tasks → **Test → Analyze failures → Generate fix tasks → Fix → Re-test** → Repeat until pass

## Agent Roles

| Agent | Responsibility |
|-------|---------------|
| **Orchestrator** | Loop control, strategy selection, pass rate calculation, threshold decisions |
| **@cli-planning-agent** | CLI analysis (Gemini/Qwen/Codex), root cause extraction, task generation, affected test detection |
| **@test-fix-agent** | Test execution, code fixes, criticality assignment, result reporting |

## Core Responsibilities

### Orchestrator

- Session discovery, task queue management
- Pass rate calculation: `(passed / total) * 100` from test-results.json
- Criticality assessment (high/medium/low)
- Strategy selection based on context
- **Runtime Calculations** (from iteration-state.json):
  - Current iteration: `iterations.length + 1`
  - Stuck tests: tests appearing in `failed_tests` for 3+ consecutive iterations
  - Regression: compare consecutive `pass_rate` values (>10% drop)
  - Max iterations: read from `task.meta.max_iterations`
- Iteration control (default max: 10)
- CLI tool fallback chain: Gemini → Qwen → Codex
- Progress tracking
- Session auto-complete (pass rate >= 95%)
- **Explicit Lifecycle**: always call close_agent after wait completes

### @cli-planning-agent

- Execute CLI analysis with the bug diagnosis template
- Parse output for root causes and fix strategy
- Generate the IMPL-fix-N.json task definition
- Detect affected tests for progressive testing
- Save: analysis.md, cli-output.txt

### @test-fix-agent

- Execute tests, save results to test-results.json
- Apply fixes from task.context.fix_strategy
- Assign criticality to failures
- Update task status

## Intelligent Strategy Engine

**Auto-selects the optimal strategy based on iteration context:**

| Strategy | Trigger | Behavior |
|----------|---------|----------|
| **Conservative** | Iteration 1-2 (default) | Single targeted fix, full validation |
| **Aggressive** | Pass rate >80% + similar failures | Batch fix related issues |
| **Surgical** | Regression detected (pass rate drops >10%) | Minimal changes, rollback focus |

**Selection Logic** (in orchestrator; the regression check runs first so a detected regression always forces the surgical strategy):

```javascript
if (regressionDetected) return "surgical";
if (iteration <= 2) return "conservative";
if (passRate > 80 && failurePattern.similarity > 0.7) return "aggressive";
return "conservative";
```

**Integration**: The selected strategy is passed to @cli-planning-agent in the prompt for tailored analysis.

## Progressive Testing

**Runs affected tests during iterations; the full suite runs only for final validation.**

**How It Works**:

1. @cli-planning-agent analyzes fix_strategy.modification_points
2. Maps modified files to test files (via imports + integration patterns)
3. Returns `affected_tests[]` in the task JSON
4. @test-fix-agent runs: `npm test -- ${affected_tests.join(' ')}`
5. Final validation: `npm test` (full suite)

**Benefits**: 70-90% faster iterations, instant feedback on fix effectiveness.

## Agent Invocation Template

**@cli-planning-agent** (failure analysis):

```javascript
// Spawn agent for failure analysis
const analysisAgentId = spawn_agent({
  message: `
## TASK ASSIGNMENT

### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/cli-planning-agent.md (MUST read first)
2. Read: {projectRoot}/.workflow/project-tech.json
3. Read: {projectRoot}/.workflow/project-guidelines.json

---

## Task Objective
Analyze test failures and generate fix task JSON for iteration ${N}

## Strategy
${selectedStrategy} - ${strategyDescription}

## MANDATORY FIRST STEPS
1. Read test results: ${session.test_results_path}
2. Read test output: ${session.test_output_path}
3. Read iteration state: ${session.iteration_state_path}

## Context Metadata (Orchestrator-Calculated)
- Session ID: ${sessionId} (from file path)
- Current Iteration: ${N} (= iterations.length + 1)
- Max Iterations: ${maxIterations} (from task.meta.max_iterations)
- Current Pass Rate: ${passRate}%
- Selected Strategy: ${selectedStrategy} (from iteration-state.json)
- Stuck Tests: ${stuckTests} (calculated from iterations[].failed_tests history)

## CLI Configuration
- Tool Priority: Gemini → Qwen → Codex
- Template: 01-diagnose-bug-root-cause.txt
- Timeout: 2400000ms

## Expected Deliverables
1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
   - Must include: fix_strategy.test_execution.affected_tests[]
   - Must include: fix_strategy.confidence_score
2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt

## Strategy-Specific Requirements
- Conservative: Single targeted fix, high confidence required
- Aggressive: Batch fix similar failures, pattern-based approach
- Surgical: Minimal changes, focus on rollback safety

## Success Criteria
- Concrete fix strategy with modification points (file:function:lines)
- Affected tests list for progressive testing
- Root cause analysis (not just symptoms)
`
});

// Wait for analysis completion
const analysisResult = wait({
  ids: [analysisAgentId],
  timeout_ms: 2400000 // 40 minutes (CLI analysis timeout)
});

// Clean up
close_agent({ id: analysisAgentId });
```

**@test-fix-agent** (execution; the task-type configuration objects are declared before the spawn call because the template literal references them when the message is built):

```javascript
// Task Type Configurations (must be defined before use in the template below)
const taskTypeObjective = {
  "test-gen": "Generate comprehensive tests based on requirements",
  "test-fix": "Execute test suite and report results with criticality assessment",
  "test-fix-iteration": "Apply fixes from strategy and validate with tests"
};

const taskTypeSpecificReads = {
  "test-gen": `Read test context: ${session.test_context_path}`,
  "test-fix": `Read previous results (if exists): ${session.test_results_path}`,
  "test-fix-iteration": `Read fix strategy: ${session.analysis_path}, fix history: ${session.fix_history_path}`
};

const taskTypeGuidance = {
  "test-gen": `
- Review task.context.requirements for test scenarios
- Analyze codebase to understand implementation
- Generate tests covering: happy paths, edge cases, error handling
- Follow existing test patterns and framework conventions
`,
  "test-fix": `
- Run test command from task.context or project config
- Capture: pass/fail counts, error messages, stack traces
- Assess criticality for each failure:
  * high: core functionality broken, security issues
  * medium: feature degradation, data integrity issues
  * low: edge cases, flaky tests, env-specific issues
- Save structured results to test-results.json
`,
  "test-fix-iteration": `
- Load fix_strategy from task.context.fix_strategy
- Identify modification_points: ${task.context.fix_strategy.modification_points}
- Apply surgical fixes (minimal changes)
- Test execution mode: ${task.context.fix_strategy.test_execution.mode}
  * affected_only: Run ${task.context.fix_strategy.test_execution.affected_tests}
  * full_suite: Run complete test suite
- If failures persist: document in test-results.json, DO NOT analyze (the orchestrator handles analysis)
`
};

const taskTypeDeliverables = {
  "test-gen": "- Test files in target directories\n - Test coverage report\n - Summary in .summaries/",
  "test-fix": "- test-results.json (pass_rate, criticality, failures)\n - test-output.log (full test output)\n - Summary in .summaries/",
  "test-fix-iteration": "- Modified source files\n - test-results.json (updated pass_rate)\n - test-output.log\n - Summary in .summaries/"
};

const taskTypeSuccessCriteria = {
  "test-gen": "All test files created, executable without errors, coverage documented",
  "test-fix": "Test results saved with accurate pass_rate and criticality, all failures documented",
  "test-fix-iteration": "Fixes applied per strategy, tests executed, results reported (pass/fail to orchestrator)"
};

// Spawn agent for test execution/fixing
const fixAgentId = spawn_agent({
  message: `
## TASK ASSIGNMENT

### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/test-fix-agent.md (MUST read first)
2. Read: {projectRoot}/.workflow/project-tech.json
3. Read: {projectRoot}/.workflow/project-guidelines.json

---

## Task Objective
${taskTypeObjective[task.meta.type]}

## MANDATORY FIRST STEPS
1. Read task JSON: ${session.task_json_path}
2. Read iteration state: ${session.iteration_state_path}
3. ${taskTypeSpecificReads[task.meta.type]}

## CRITICAL: Syntax Check Priority
**Before any code modification or test execution:**
- Run the project syntax checker (TypeScript: tsc --noEmit, ESLint, etc.)
- Verify zero syntax errors before proceeding
- If syntax errors are found: fix them immediately before other work
- Syntax validation is a MANDATORY gate - no exceptions

## Session Paths
- Workflow Dir: ${session.workflow_dir}
- Task JSON: ${session.task_json_path}
- Test Results Output: ${session.test_results_path}
- Test Output Log: ${session.test_output_path}
- Iteration State: ${session.iteration_state_path}

## Task Type: ${task.meta.type}
${taskTypeGuidance[task.meta.type]}

## Expected Deliverables
${taskTypeDeliverables[task.meta.type]}

## Success Criteria
- ${taskTypeSuccessCriteria[task.meta.type]}
- Update task status in the task JSON
- Save all outputs to the specified paths
- Report completion to the orchestrator
`
});

// Wait for execution completion
const fixResult = wait({
  ids: [fixAgentId],
  timeout_ms: 600000 // 10 minutes
});

// Clean up
close_agent({ id: fixAgentId });
```

## CLI Tool Configuration

**Fallback Chain**: Gemini → Qwen → Codex
**Template**: `~/.ccw/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt`
**Timeout**: 40min (2400000ms)

**Tool Details**:

1. **Gemini** (primary): `gemini-2.5-pro`
2. **Qwen** (fallback): `coder-model`
3. **Codex** (fallback): `gpt-5.1-codex`

**When to Fall Back**: HTTP 429, timeout, or degraded analysis quality

## Session File Structure

```
{projectRoot}/.workflow/active/WFS-test-{session}/
├── workflow-session.json            # Session metadata
├── IMPL_PLAN.md, TODO_LIST.md
├── .task/
│   ├── IMPL-{001,002}.json          # Initial tasks
│   └── IMPL-fix-{N}.json            # Generated fix tasks
├── .process/
│   ├── iteration-state.json         # Current iteration + strategy + stuck tests
│   ├── test-results.json            # Latest results (pass_rate, criticality)
│   ├── test-output.log              # Full test output
│   ├── fix-history.json             # All fix attempts
│   ├── iteration-{N}-analysis.md    # CLI analysis report
│   └── iteration-{N}-cli-output.txt
└── .summaries/iteration-summaries/
```

## Iteration State JSON

**Purpose**: Persisted state machine for the iteration loop - enables resume and historical analysis.

```json
{
  "current_task": "IMPL-002",
  "selected_strategy": "aggressive",
  "next_action": "execute_fix_task",
  "iterations": [
    { "iteration": 1, "pass_rate": 70, "strategy": "conservative", "failed_tests": ["test_auth_flow", "test_user_permissions"] },
    { "iteration": 2, "pass_rate": 82, "strategy": "conservative", "failed_tests": ["test_user_permissions", "test_token_expiry"] },
    { "iteration": 3, "pass_rate": 89, "strategy": "aggressive", "failed_tests": ["test_auth_edge_case"] }
  ]
}
```

**Field Descriptions**:

- `current_task`: pointer to the active task (essential for resume)
- `selected_strategy`: current iteration strategy (runtime state)
- `next_action`: state machine next step (`execute_fix_task` | `retest` | `complete`)
- `iterations[]`: historical log of all iterations (source of truth for trends)

## Completion Conditions

**Full Success**:
- All tasks completed
- Pass rate === 100%
- Action: auto-complete the session

**Partial Success**:
- All tasks completed
- Pass rate >= 95% and < 100%
- All failures are "low" criticality
- Action: auto-approve with a review note

**Failure**:
- Max iterations (10) reached with pass rate still < 95%
- Action: generate a failure report, mark the session blocked, return to the user

## Error Handling

| Scenario | Action |
|----------|--------|
| Test execution error | Log, retry with error context |
| CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
| Agent execution error | Save state, close_agent, retry with simplified context |
| Max iterations reached | Generate failure report, mark blocked |
| Regression detected | Rollback last fix, switch to surgical strategy |
| Stuck tests detected | Continue with alternative strategy, document in failure report |

**CLI Fallback Triggers** (Gemini → Qwen → Codex → manual):

Fallback is triggered when any of these conditions occur:

1. **Invalid Output**:
   - CLI tool fails to generate valid `IMPL-fix-N.json` (JSON parse error)
   - Missing required fields: `fix_strategy.modification_points` or `fix_strategy.affected_tests`
2. **Low Confidence**:
   - `fix_strategy.confidence_score < 0.4` (indicates uncertain analysis)
3. **Technical Failures**:
   - HTTP 429 (rate limit) or 5xx errors
   - Timeout (exceeds 2400000ms / 40min)
   - Connection errors
4. **Quality Degradation**:
   - Analysis report < 100 words (too brief, likely incomplete)
   - No concrete modification points provided (only general suggestions)
   - Same root cause identified 3+ consecutive times (stuck analysis)

**Fallback Sequence**:

- Try the primary tool (Gemini)
- If a trigger is detected → try the fallback (Qwen)
- If a trigger is detected again → try the final fallback (Codex)
- If all fail → mark as degraded, use basic pattern matching from fix-history.json, notify the user

**Lifecycle Error Handling** (note `agentId` is declared outside the `try` block so the `catch` clause can reference it):

```javascript
let agentId;
try {
  agentId = spawn_agent({ message: "..." });
  const result = wait({ ids: [agentId], timeout_ms: 2400000 });
  // ... process result ...
  close_agent({ id: agentId });
} catch (error) {
  if (agentId) close_agent({ id: agentId });
  // Save state for resume capability
  throw error;
}
```

## Progress Tracking Pattern

```javascript
[
  { content: "Execute IMPL-001: Generate tests [code-developer]", status: "completed" },
  { content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]", status: "in_progress" },
  { content: " → Iteration 1: Initial test (pass: 70%, conservative)", status: "completed" },
  { content: " → Iteration 2: Fix validation (pass: 82%, conservative)", status: "completed" },
  { content: " → Iteration 3: Batch fix auth (pass: 89%, aggressive)", status: "in_progress" }
]
```

**Update Rules**:

- Add an iteration item recording: strategy, pass rate
- Mark completed after each iteration
- Update the parent task when all iterations complete

## Commit Strategy

**Automatic Commits** (orchestrator-managed):

The orchestrator automatically creates git commits at key checkpoints to enable safe rollback:

1. **After a successful iteration** (pass rate increased):

   ```bash
   git add .
   git commit -m "test-cycle: iteration ${N} - ${strategy} strategy (pass: ${oldRate}% → ${newRate}%)"
   ```

2. **Before rollback** (regression detected; `--no-commit` stages the revert so it can be committed with a custom message):

   ```bash
   # Current state preserved, then:
   git revert --no-commit HEAD
   git commit -m "test-cycle: rollback iteration ${N} - regression detected (pass: ${newRate}% < ${oldRate}%)"
   ```

**Commit Content**:

- Modified source files from the fix application
- Updated test-results.json and iteration-state.json
- Excludes: temporary files, logs

**Benefits**:

- **Rollback Safety**: each iteration is a revert point
- **Progress Tracking**: git history shows the iteration evolution
- **Audit Trail**: clear record of which strategy/iteration caused issues
- **Resume Capability**: can resume from any checkpoint

**Note**: Final session completion creates an additional commit with the full summary.

## Post-Completion Expansion

After completion, ask the user whether to expand the results into follow-up issues (test/enhance/refactor/doc). For each selected dimension, create a new issue titled `"{summary} - {dimension}"`.

## Best Practices

1. **Default Settings Work**: 10 iterations are sufficient for most cases
2. **Automatic Commits**: the orchestrator commits after each successful iteration - no manual intervention needed
3. **Trust the Strategy Engine**: auto-selection is based on proven heuristics
4. **Monitor Logs**: check `.process/iteration-N-analysis.md` for CLI analysis insights
5. **Progressive Testing**: saves 70-90% of iteration time automatically
6. **Always Close Agents**: ensure close_agent is called after every wait completes, including on error paths
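
The orchestrator's runtime calculations (pass rate, current iteration, stuck tests, regression detection under Core Responsibilities) can be sketched as pure functions over iteration-state.json. A minimal sketch: the `state` shape follows the Iteration State JSON example, but the function names are illustrative, not part of any workflow API.

```javascript
// Illustrative helpers for the orchestrator's runtime calculations.
// `state` follows the iteration-state.json shape shown above.

function passRate(passed, total) {
  // Pass rate calculation: (passed / total) * 100
  return total === 0 ? 0 : (passed / total) * 100;
}

function currentIteration(state) {
  // Current iteration = iterations.length + 1
  return state.iterations.length + 1;
}

function stuckTests(state) {
  // Tests appearing in failed_tests for 3+ consecutive iterations
  const last3 = state.iterations.slice(-3);
  if (last3.length < 3) return [];
  return last3[0].failed_tests.filter((t) =>
    last3.every((it) => it.failed_tests.includes(t))
  );
}

function regressionDetected(state) {
  // >10% drop between the two most recent pass_rate values
  const n = state.iterations.length;
  if (n < 2) return false;
  return state.iterations[n - 2].pass_rate - state.iterations[n - 1].pass_rate > 10;
}
```

These feed directly into the strategy-selection logic: `regressionDetected` triggers the surgical strategy, and a non-empty `stuckTests` list is documented in the failure report.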
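
The file-to-test mapping in Progressive Testing can be sketched with simple import scanning. Hedged heavily: the import regex and the idea of matching on module basenames are illustrative assumptions, not the actual heuristics @cli-planning-agent uses (which also cover integration patterns).

```javascript
// Sketch of mapping modified source files to affected test files.
// `testSources` maps test file path -> its source text. The regex and
// basename matching are illustrative assumptions, not the real heuristics.

function affectedTests(modifiedFiles, testSources) {
  const modifiedBasenames = modifiedFiles.map((f) =>
    f.split("/").pop().replace(/\.[jt]s$/, "")
  );
  return Object.keys(testSources).filter((testFile) => {
    const src = testSources[testFile];
    // A test is affected if it imports any modified module
    return modifiedBasenames.some((name) =>
      new RegExp(`from\\s+['"].*${name}['"]`).test(src)
    );
  });
}
```

The resulting list is what lands in `fix_strategy.test_execution.affected_tests[]` and is joined into the `npm test -- ...` invocation.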
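
The CLI fallback sequence can be sketched as a loop over the tool chain. A hedged sketch: `runTool` stands in for the real CLI invocation, and `fallbackTriggered` checks only a subset of the documented triggers; the control flow is the point, not the trigger details.

```javascript
// Sketch of the Gemini -> Qwen -> Codex fallback chain. A thrown error
// (HTTP 429 / 5xx / timeout) or a detected trigger moves to the next tool.

const TOOL_CHAIN = ["gemini", "qwen", "codex"];

function fallbackTriggered(result) {
  // Subset of the documented triggers (invalid output, low confidence, brevity)
  if (!result || !result.fixStrategy) return true;
  if (result.fixStrategy.confidence_score < 0.4) return true;
  if ((result.analysisWordCount || 0) < 100) return true;
  return false;
}

function analyzeWithFallback(runTool) {
  for (const tool of TOOL_CHAIN) {
    try {
      const result = runTool(tool);
      if (!fallbackTriggered(result)) return { tool, result };
    } catch (e) {
      // Technical failure: fall through to the next tool in the chain
    }
  }
  // All tools failed: degrade to pattern matching from fix-history.json
  return { tool: "manual", result: null };
}
```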
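
Between iterations, the orchestrator appends a record to `iterations[]` and sets `next_action`. A sketch of that state transition, using the 95% threshold and default max of 10 from the Completion Conditions; the function name is illustrative.

```javascript
// Sketch: record an iteration result and choose the state machine's next step.
// `state` follows iteration-state.json; `result` comes from test-results.json.

function recordIteration(state, result, maxIterations = 10) {
  state.iterations.push({
    iteration: state.iterations.length + 1,
    pass_rate: result.pass_rate,
    strategy: state.selected_strategy,
    failed_tests: result.failed_tests
  });
  if (result.pass_rate >= 95 || state.iterations.length >= maxIterations) {
    state.next_action = "complete"; // success path or max-iterations exit
  } else {
    state.next_action = "execute_fix_task"; // loop continues
  }
  return state;
}
```

Persisting this after every iteration is what makes Resume possible: on restart, the orchestrator reads `next_action` and `current_task` instead of recomputing the loop position.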
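
The Completion Conditions reduce to a small decision function. A sketch, assuming `failures` is the list of failure entries (each with a `criticality` field) from test-results.json; the return strings mirror the three documented outcomes plus the continue case.

```javascript
// Sketch of the orchestrator's completion decision against the documented
// thresholds: 100% = full success, >=95% all-low = partial success,
// max iterations without 95% = failure.

function completionAction(passRate, failures, iteration, maxIterations = 10) {
  if (passRate === 100) return "auto_complete"; // Full success
  const allLow = failures.every((f) => f.criticality === "low");
  if (passRate >= 95 && allLow) return "auto_approve_with_review_note"; // Partial success
  if (iteration >= maxIterations) return "mark_blocked"; // Failure
  return "continue_iterating";
}
```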