From 2e8fe1e77aa2b15573de091d3f96cc7b983d6dc5 Mon Sep 17 00:00:00 2001
From: catlog22 <catlog22@github.com>
Date: Mon, 24 Nov 2025 23:16:22 +0800
Subject: [PATCH] refactor: address Gemini review recommendations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Based on comprehensive Gemini analysis (Overall: Excellent), implemented 3 improvements:

1. Clarified CLI Fallback Triggers
   - Added 4 concrete trigger conditions (invalid output, low confidence, technical failures, quality degradation)
   - Defined confidence_score threshold: < 0.4
   - Specified fallback sequence with final degraded mode

2. Detailed Parallel Exploration Operations
   - Added "Operational Lifecycle" (7-step process: setup → execute → collect → select → apply → cleanup → commit)
   - Resource considerations: 3x CPU/memory, 10min timeout per branch
   - Error handling: worktree failure fallback, partial success handling

3. Explicit Commit Strategy
   - New "Commit Strategy" section with automatic commits at 3 checkpoints
   - After successful iteration (pass rate increased)
   - After parallel exploration (winning strategy)
   - Before rollback (regression detected)
   - Updated Best Practices #2: "Automatic Commits" (no manual intervention)

Changes:
- Lines: 515 → 596 (+81 for clarity, still -25% from original 791)
- Reduced ambiguity in critical automation points
- Enhanced operational robustness documentation

Gemini Assessment: "Exceptional technical documentation", "Robust design", "Clear and comprehensive"
---
 .../commands/workflow/test-cycle-execute.md   | 94 ++++++++++++++++++-
 1 file changed, 89 insertions(+), 5 deletions(-)

diff --git a/.claude/commands/workflow/test-cycle-execute.md b/.claude/commands/workflow/test-cycle-execute.md
index 8631d001..31740748 100644
--- a/.claude/commands/workflow/test-cycle-execute.md
+++ b/.claude/commands/workflow/test-cycle-execute.md
@@ -160,7 +160,26 @@ Task(subagent_type="test-fix-agent", model="haiku", ...) x3
 # Collect results, select winner
 ```
 
-**Benefits**: Escapes stuck situations, explores solution space efficiently.
+**Operational Lifecycle**:
+1. **Setup**: Create 3 isolated worktrees (`../worktree-{a,b,c}`)
+2. **Execute**: Run 3 agents in parallel, each in its own worktree
+3. **Collect**: Gather test results and pass rates from all branches
+4. **Select**: Choose branch with highest pass_rate (ties: prefer higher confidence_score)
+5. **Apply**: Merge winning branch changes to main branch
+6. **Cleanup**: Remove all worktrees: `git worktree remove ../worktree-{a,b,c} --force`
+7. **Commit**: Create checkpoint with winning strategy details
+
+**Resource Considerations**:
+- Parallel execution requires ~3x CPU/memory of single iteration
+- Timeout per branch: 10min (aborts if exceeded)
+- If any branch setup fails: Fall back to sequential exploratory mode (try strategies one by one)
+
+**Error Handling**:
+- **Worktree creation fails**: Skip parallel mode, use sequential exploration
+- **All branches fail to improve**: Document in failure report, increment iteration normally
+- **Partial success** (1-2 branches succeed): Still select best result, proceed
+
+**Benefits**: Escapes stuck situations, explores solution space efficiently, automatic cleanup.
 
 ## Core Responsibilities
 
@@ -462,6 +481,33 @@ const taskTypeSuccessCriteria = {
 | Regression detected | Rollback last fix, switch to surgical strategy |
 | Stuck tests detected | Switch to exploratory strategy (parallel) |
 
+**CLI Fallback Triggers** (Gemini → Qwen → Codex → manual):
+
+Fallback is triggered when any of these conditions occur:
+
+1. **Invalid Output**:
+   - CLI tool fails to generate valid `IMPL-fix-N.json` (JSON parse error)
+   - Missing required fields: `fix_strategy.modification_points` or `fix_strategy.affected_tests`
+
+2. **Low Confidence**:
+   - `fix_strategy.confidence_score < 0.4` (indicates uncertain analysis)
+
+3. **Technical Failures**:
+   - HTTP 429 (rate limit) or 5xx errors
+   - Timeout (exceeds 2400000ms / 40min)
+   - Connection errors
+
+4. **Quality Degradation**:
+   - Analysis report < 100 words (too brief, likely incomplete)
+   - No concrete modification points provided (only general suggestions)
+   - Same root cause identified 3+ consecutive times (stuck analysis)
+
+**Fallback Sequence**:
+- Try primary tool (Gemini)
+- If trigger detected → Try fallback (Qwen)
+- If trigger detected again → Try final fallback (Codex)
+- If all fail → Mark as degraded, use basic pattern matching from fix-history.json, notify user
+
 ### TodoWrite Structure
 
 ```javascript
@@ -501,11 +547,49 @@ TodoWrite({
 - Mark completed after each iteration
 - Update parent task when all complete
 
+## Commit Strategy
+
+**Automatic Commits** (orchestrator-managed):
+
+The orchestrator automatically creates git commits at key checkpoints to enable safe rollback:
+
+1. **After Successful Iteration** (pass rate increased):
+   ```bash
+   git add .
+   git commit -m "test-cycle: iteration ${N} - ${strategy} strategy (pass: ${oldRate}% → ${newRate}%)"
+   ```
+
+2. **After Parallel Exploration** (winning strategy applied):
+   ```bash
+   git add .
+   git commit -m "test-cycle: iteration ${N} - exploratory (selected strategy ${winner}, pass: ${rate}%)"
+   ```
+
+3. **Before Rollback** (regression detected):
+   ```bash
+   # Current state preserved, then:
+   git revert HEAD
+   git commit -m "test-cycle: rollback iteration ${N} - regression detected (pass: ${newRate}% < ${oldRate}%)"
+   ```
+
+**Commit Content**:
+- Modified source files from fix application
+- Updated test-results.json, iteration-state.json
+- Excludes: temporary files, logs, worktrees
+
+**Benefits**:
+- **Rollback Safety**: Each iteration is a revert point
+- **Progress Tracking**: Git history shows iteration evolution
+- **Audit Trail**: Clear record of which strategy/iteration caused issues
+- **Resume Capability**: Can resume from any checkpoint
+
+**Note**: Final session completion creates additional commit with full summary.
+
 ## Best Practices
 
 1. **Default Settings Work**: 10 iterations sufficient for most cases
-2. **Commit Between Iterations**: Enables rollback if needed
+2. **Automatic Commits**: Orchestrator commits after each successful iteration - no manual intervention needed
 3. **Trust Strategy Engine**: Auto-selection based on proven heuristics
-4. **Monitor Logs**: Check `.process/iteration-N-analysis.md` for insights
-5. **Progressive Testing**: Saves 70-90% iteration time
-6. **Parallel Exploration**: Auto-triggers at iteration 3 if stuck
+4. **Monitor Logs**: Check `.process/iteration-N-analysis.md` for CLI analysis insights
+5. **Progressive Testing**: Saves 70-90% iteration time automatically
+6. **Parallel Exploration**: Auto-triggers at iteration 3 if stuck (no configuration needed)