diff --git a/.claude/commands/workflow/test-cycle-execute.md b/.claude/commands/workflow/test-cycle-execute.md index 8631d001..31740748 100644 --- a/.claude/commands/workflow/test-cycle-execute.md +++ b/.claude/commands/workflow/test-cycle-execute.md @@ -160,7 +160,26 @@ Task(subagent_type="test-fix-agent", model="haiku", ...) x3 # Collect results, select winner ``` -**Benefits**: Escapes stuck situations, explores solution space efficiently. +**Operational Lifecycle**: +1. **Setup**: Create 3 isolated worktrees (`../worktree-{a,b,c}`) +2. **Execute**: Run 3 agents in parallel, each in its own worktree +3. **Collect**: Gather test results and pass rates from all branches +4. **Select**: Choose branch with highest pass_rate (ties: prefer higher confidence_score) +5. **Apply**: Merge winning branch changes to main branch +6. **Cleanup**: Remove all worktrees: `git worktree remove ../worktree-{a,b,c} --force` +7. **Commit**: Create checkpoint with winning strategy details + +**Resource Considerations**: +- Parallel execution requires ~3x CPU/memory of single iteration +- Timeout per branch: 10min (aborts if exceeded) +- If any branch setup fails: Fall back to sequential exploratory mode (try strategies one by one) + +**Error Handling**: +- **Worktree creation fails**: Skip parallel mode, use sequential exploration +- **All branches fail to improve**: Document in failure report, increment iteration normally +- **Partial success** (1-2 branches succeed): Still select best result, proceed + +**Benefits**: Escapes stuck situations, explores solution space efficiently, automatic cleanup. ## Core Responsibilities @@ -462,6 +481,33 @@ const taskTypeSuccessCriteria = { | Regression detected | Rollback last fix, switch to surgical strategy | | Stuck tests detected | Switch to exploratory strategy (parallel) | +**CLI Fallback Triggers** (Gemini → Qwen → Codex → manual): + +Fallback is triggered when any of these conditions occur: + +1. **Invalid Output**: + - CLI tool fails to generate valid `IMPL-fix-N.json` (JSON parse error) + - Missing required fields: `fix_strategy.modification_points` or `fix_strategy.affected_tests` + +2. **Low Confidence**: + - `fix_strategy.confidence_score < 0.4` (indicates uncertain analysis) + +3. **Technical Failures**: + - HTTP 429 (rate limit) or 5xx errors + - Timeout (exceeds 2400000ms / 40min) + - Connection errors + +4. **Quality Degradation**: + - Analysis report < 100 words (too brief, likely incomplete) + - No concrete modification points provided (only general suggestions) + - Same root cause identified 3+ consecutive times (stuck analysis) + +**Fallback Sequence**: +- Try primary tool (Gemini) +- If trigger detected → Try fallback (Qwen) +- If trigger detected again → Try final fallback (Codex) +- If all fail → Mark as degraded, use basic pattern matching from fix-history.json, notify user + ### TodoWrite Structure ```javascript @@ -501,11 +547,49 @@ TodoWrite({ - Mark completed after each iteration - Update parent task when all complete +## Commit Strategy + +**Automatic Commits** (orchestrator-managed): + +The orchestrator automatically creates git commits at key checkpoints to enable safe rollback: + +1. **After Successful Iteration** (pass rate increased): + ```bash + git add . + git commit -m "test-cycle: iteration ${N} - ${strategy} strategy (pass: ${oldRate}% → ${newRate}%)" + ``` + +2. **After Parallel Exploration** (winning strategy applied): + ```bash + git add . + git commit -m "test-cycle: iteration ${N} - exploratory (selected strategy ${winner}, pass: ${rate}%)" + ``` + +3. **Before Rollback** (regression detected): + ```bash + # Current state preserved, then: + git revert HEAD + git commit -m "test-cycle: rollback iteration ${N} - regression detected (pass: ${newRate}% < ${oldRate}%)" + ``` + +**Commit Content**: +- Modified source files from fix application +- Updated test-results.json, iteration-state.json +- Excludes: temporary files, logs, worktrees + +**Benefits**: +- **Rollback Safety**: Each iteration is a revert point +- **Progress Tracking**: Git history shows iteration evolution +- **Audit Trail**: Clear record of which strategy/iteration caused issues +- **Resume Capability**: Can resume from any checkpoint + +**Note**: Final session completion creates additional commit with full summary. + ## Best Practices 1. **Default Settings Work**: 10 iterations sufficient for most cases -2. **Commit Between Iterations**: Enables rollback if needed +2. **Automatic Commits**: Orchestrator commits after each successful iteration - no manual intervention needed 3. **Trust Strategy Engine**: Auto-selection based on proven heuristics -4. **Monitor Logs**: Check `.process/iteration-N-analysis.md` for insights -5. **Progressive Testing**: Saves 70-90% iteration time -6. **Parallel Exploration**: Auto-triggers at iteration 3 if stuck +4. **Monitor Logs**: Check `.process/iteration-N-analysis.md` for CLI analysis insights +5. **Progressive Testing**: Saves 70-90% iteration time automatically +6. **Parallel Exploration**: Auto-triggers at iteration 3 if stuck (no configuration needed)