Files
Claude-Code-Workflow/.claude/commands/workflow/test-cycle-execute.md
catlog22 97b2247896 feat: enhance test-cycle-execute with intelligent iteration strategies
Optimizations:
- Reduced file size: 791 → 540 lines (-32%)
- Merged redundant sections (Overview + Philosophy + Responsibility Matrix)
- Simplified JSON examples and execution flow descriptions

Enhancements:
1. Intelligent Strategy Engine
   - 4 strategies: conservative/aggressive/exploratory/surgical
   - Auto-selection based on pass rate, stuck tests, regression

2. Progressive Testing
   - Run affected tests during iterations (70-90% faster)
   - Full suite on final validation

3. Parallel Strategy Exploration
   - Trigger on stuck tests (3+ consecutive failures)
   - Git worktree-based parallel execution
   - Select best result by pass rate

Changes:
- Max iterations: 5 → 10
- CLI tools: Gemini/Qwen → Gemini/Qwen/Codex (fallback chain)
- Add strategy metadata to iteration-state.json
- Add test_execution config to fix task JSON

Impact:
- Faster iterations via progressive testing
- Better stuck situation handling via parallel exploration
- Adaptive strategy selection for complex scenarios
2025-11-24 23:02:46 +08:00

540 lines
18 KiB
Markdown

---
name: test-cycle-execute
description: Execute test-fix workflow with dynamic task generation and iterative fix cycles until test pass rate >= 95% or max iterations reached. Uses @cli-planning-agent for failure analysis and task generation.
argument-hint: "[--resume-session=\"session-id\"] [--max-iterations=N]"
allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*), Task(*)
---
# Workflow Test-Cycle-Execute Command
## Architecture & Philosophy
**Core Concept**: Dynamic test-fix orchestrator with adaptive task generation based on runtime analysis.
**Quality Gate**: Iterates until test pass rate >= 95% (with criticality assessment) or 100% (full pass).
**Key Differences from Standard Execute**:
- **Standard**: Pre-defined task queue → Sequential execution → Complete
- **Test-Cycle**: Initial tasks → Test → Analyze → Generate fix tasks → Execute → Re-test → Repeat
**Orchestrator Boundary (CRITICAL)**:
- This command is the **ONLY place** where test failures are handled
- All failure analysis delegated to **@cli-planning-agent** (Gemini/Qwen/Codex)
- All fixes applied by **@test-fix-agent**
- Orchestrator manages: pass rate calculation, criticality assessment, iteration loop, strategy selection
**Agent Coordination**:
- **Orchestrator**: Loop control, strategy selection, pass rate calculation, threshold decisions
- **@cli-planning-agent**: CLI analysis (Gemini/Qwen/Codex), root cause extraction, fix task generation
- **@test-fix-agent**: Test execution, code fixes, result reporting
**Resume Mode**: `--resume-session` flag skips discovery and continues from interruption point.
## Enhanced Features
### 1. Intelligent Strategy Engine
**Strategy Types**:
- **Conservative** (default): Single fix per iteration, full validation
- **Aggressive**: Batch fix similar failures (when pass rate >80% + high pattern similarity)
- **Exploratory**: Parallel multi-strategy attempt (when stuck on same failures 3+ times)
- **Surgical**: Minimal changes, rollback on regression
**Strategy Selection Logic**:
```javascript
// In orchestrator, before invoking @cli-planning-agent
function selectStrategy(context) {
const { passRate, iteration, stuckTests, regressionDetected } = context;
if (iteration <= 2) return "conservative";
if (passRate > 80 && context.failurePattern.similarity > 0.7) return "aggressive";
if (stuckTests > 0 && iteration >= 3) return "exploratory";
if (regressionDetected) return "surgical";
return "conservative";
}
```
**Integration**: Orchestrator passes selected strategy to @cli-planning-agent in prompt.
### 2. Progressive Testing
**Concept**: Run affected tests during iterations, full suite on final validation.
**Affected Test Detection**:
- @cli-planning-agent analyzes modified files from fix strategy
- Maps to test files via imports and integration patterns
- Returns `affected_tests[]` in fix task JSON
**Execution**:
- **Iteration N**: `npm test -- ${affected_tests.join(' ')}` (fast feedback)
- **Final validation**: `npm test` (comprehensive check)
**Benefits**: 70-90% iteration speed improvement.
**Task JSON Integration**:
```json
{
"context": {
"fix_strategy": {
"test_execution": {
"mode": "affected_only",
"affected_tests": ["tests/auth/client.test.ts"],
"fallback_to_full": true
}
}
}
}
```
### 3. Parallel Strategy Exploration
**Trigger**: Iteration >= 3 AND stuck tests detected (same test failed 3+ consecutive times)
**Workflow**:
```
Stuck State Detected → Switch to "exploratory" strategy:
├── @cli-planning-agent generates 3 alternative fix approaches
├── Orchestrator creates 3 task JSONs: IMPL-fix-Na, IMPL-fix-Nb, IMPL-fix-Nc
├── Execute in parallel via git worktree:
│ ├── worktree-a/ → Apply fix-Na → Test → Score: 92%
│ ├── worktree-b/ → Apply fix-Nb → Test → Score: 88%
│ └── worktree-c/ → Apply fix-Nc → Test → Score: 95% ✓
└── Select best result, apply to main branch
```
**Implementation**:
```bash
# Orchestrator creates worktrees
git worktree add ../worktree-a
git worktree add ../worktree-b
git worktree add ../worktree-c
# Launch 3 @test-fix-agent instances in parallel
Task(subagent_type="test-fix-agent", prompt="...", model="haiku") x3
# Collect results, select winner by pass_rate
```
**Task JSON**:
```json
{
"id": "IMPL-fix-3-parallel",
"meta": {
"type": "parallel-exploration",
"strategies": ["IMPL-fix-3a", "IMPL-fix-3b", "IMPL-fix-3c"],
"selection_criteria": "highest_pass_rate"
}
}
```
## Core Responsibilities
**Orchestrator**:
- Session discovery and task queue management
- Pass rate calculation: `(passed / total) * 100` from test-results.json
- Criticality assessment from test-results.json
- Strategy selection (conservative/aggressive/exploratory/surgical)
- Stuck test detection (same test failed 3+ times)
- Regression detection (pass rate dropped >10%)
- Iteration control (max 10 iterations, default)
- CLI tool selection (Gemini → Qwen → Codex fallback chain)
- Failure analysis delegation to @cli-planning-agent
- TodoWrite progress tracking
- Session auto-complete when pass rate >= 95%
**@cli-planning-agent**:
- Execute CLI analysis (Gemini/Qwen/Codex) with bug diagnosis template
- Parse CLI output for root causes and fix strategy
- Generate IMPL-fix-N.json with structured task definition
- Detect affected tests for progressive testing
- Save analysis reports and CLI output
**@test-fix-agent**:
- Execute tests and save results to test-results.json
- Apply fixes from task JSON fix_strategy
- Assign criticality to test failures
- Update task status
## Execution Flow
### Phase 1: Discovery & Initialization
1. Detect test-fix session from `workflow_type: "test_session"`
2. Load workflow-session.json, IMPL_PLAN.md, .task/*.json
3. Initialize TodoWrite with initial tasks
4. Setup iteration context (current=0, max=10, strategy="conservative")
**Resume Mode**: Load iteration-state.json, continue from `next_action`.
### Phase 2: Task Execution Loop
```javascript
For each task in queue:
1. Load task JSON and context
2. Determine task type (test-gen, test-fix, fix-iteration, parallel-exploration)
3. Execute via appropriate agent
4. Collect results from test-results.json
5. Calculate pass rate: (passed / total) * 100
6. Assess criticality from test-results.json
7. Detect regression: if passRate < previousPassRate - 10% REGRESSION
8. Detect stuck tests: if same test failed 3+ consecutive iterations STUCK
9. Make threshold decision:
IF passRate === 100%:
SUCCESS: Mark complete, continue to next task
ELSE IF passRate >= 95%:
CHECK criticality:
- All "low" PARTIAL SUCCESS (auto-approve with note)
- Any "high"/"medium" Enter fix loop (step 10)
ELSE IF passRate < 95%:
FAILED: Enter fix loop (step 10)
10. Fix Loop (if needed):
a. Select strategy: conservative/aggressive/exploratory/surgical
b. If strategy === "exploratory":
- Invoke @cli-planning-agent for 3 alternative approaches
- Create parallel tasks (IMPL-fix-Na, Nb, Nc)
- Execute in parallel via git worktree
- Select best result (highest pass_rate)
- Apply to main branch
c. Else:
- Invoke @cli-planning-agent with selected strategy
- Generate single IMPL-fix-N.json
- Insert at queue front
d. Continue loop
11. Check max iterations (abort if exceeded)
```
### Phase 3: Iteration Cycle (Per Fix Task)
**Structure**:
```
Iteration N:
├── 1. Test Execution (via @test-fix-agent)
│ ├── Mode: affected_only or full_suite (from task JSON)
│ ├── Save results: test-results.json, test-output.log
│ └── Report to orchestrator
├── 2. Pass Rate Validation (orchestrator)
│ ├── Calculate: (passed / total) * 100
│ ├── Assess criticality
│ ├── Detect regression/stuck
│ └── Select strategy
├── 3. Failure Analysis (via @cli-planning-agent)
│ ├── Input: failure context + selected strategy
│ ├── CLI tool: Gemini (fallback: Qwen → Codex)
│ ├── Output: IMPL-fix-N.json, iteration-N-analysis.md, cli-output.txt
│ └── Return task ID to orchestrator
└── 4. Fix Execution (via @test-fix-agent)
├── Load fix_strategy from task JSON
├── Apply changes (surgical fixes)
└── Return to step 1 for re-test
```
### Phase 4: Completion
**Success Conditions**:
- All tasks completed
- Pass rate === 100% (full success)
**Partial Success Conditions**:
- All tasks completed
- Pass rate >= 95% and < 100%
- All failures are "low" criticality
- **Action**: Auto-approve with review note
**Completion Steps**:
1. Final validation: Run full test suite
2. Calculate final pass rate
3. Assess status (full/partial/failure)
4. Update session state
5. Generate summary with pass rate metrics
6. Call `/workflow:session:complete`
**Failure Conditions**:
- Max iterations (10) reached without 95% pass rate
- Pass rate < 95% after max iterations
- Unrecoverable errors
**Failure Handling**:
1. Document final state with pass rate
2. Generate failure report (remaining failures, iteration history, recommendations)
3. Mark tasks as blocked
4. Return control to user
## Agent Coordination Details
### @cli-planning-agent Invocation
**CLI Tool Selection** (fallback chain):
1. **Gemini** (primary): `gemini-2.5-pro`
2. **Qwen** (fallback): `coder-model`
3. **Codex** (fallback): `gpt-5.1-codex`
**Template**: `~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt`
**Timeout**: 40min (2400000ms)
**Prompt Structure**:
```javascript
Task(
subagent_type="cli-planning-agent",
description=`Analyze test failures (iteration ${N}) - ${strategy} strategy`,
prompt=`
## Task Objective
Analyze test failures and generate fix task JSON for iteration ${N}
## Strategy
${selectedStrategy} - ${strategyDescription}
## MANDATORY FIRST STEPS
1. Read test results: ${session.test_results_path}
2. Read test output: ${session.test_output_path}
3. Read iteration state: ${session.iteration_state_path}
## Session Paths
- Workflow Dir: ${session.workflow_dir}
- Test Results: ${session.test_results_path}
- Task Output Dir: ${session.task_dir}
- Analysis Output: ${session.process_dir}/iteration-${N}-analysis.md
## Context Metadata
- Session ID: ${sessionId}
- Current Iteration: ${N}
- Max Iterations: 10
- Current Pass Rate: ${passRate}%
- Selected Strategy: ${selectedStrategy}
- Stuck Tests: ${stuckTests}
## CLI Configuration
- Tool Priority: gemini → qwen → codex
- Template: 01-diagnose-bug-root-cause.txt
- Timeout: 2400000ms
## Expected Deliverables
1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
- Must include: fix_strategy.test_execution.affected_tests[]
- Must include: fix_strategy.confidence_score
2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt
4. Return task ID to orchestrator
## Strategy-Specific Requirements
${strategyRequirements[selectedStrategy]}
## Success Criteria
- Valid IMPL-fix-${N}.json with all required fields
- Concrete fix strategy with modification points (file:function:lines)
- Affected tests list for progressive testing
- Root cause analysis (not just symptoms)
`
)
```
**Strategy-Specific Requirements**:
```javascript
const strategyRequirements = {
conservative: "Single targeted fix, high confidence required",
aggressive: "Batch fix similar failures, pattern-based approach",
exploratory: "Generate 3 alternative approaches with different root cause hypotheses",
surgical: "Minimal changes, focus on rollback safety"
};
```
### @test-fix-agent Context Loading
**File Loading Sequence**:
1. Task JSON (`task_json_path`)
2. Iteration state (`iteration_state_path`)
3. Test results (`test_results_path`)
4. Test output (`test_output_path`)
5. Analysis report (`analysis_path`)
6. Fix history (`fix_history_path`)
**Paths Provided by Orchestrator**:
```javascript
{
"task_json_path": ".workflow/active/WFS-test-{session}/.task/IMPL-fix-N.json",
"iteration_state_path": ".workflow/active/WFS-test-{session}/.process/iteration-state.json",
"test_results_path": ".workflow/active/WFS-test-{session}/.process/test-results.json",
"test_output_path": ".workflow/active/WFS-test-{session}/.process/test-output.log",
"analysis_path": ".workflow/active/WFS-test-{session}/.process/iteration-N-analysis.md",
"workflow_dir": ".workflow/active/WFS-test-{session}/",
"session_id": "WFS-test-{session}",
"current_iteration": N,
"max_iterations": 10
}
```
## Data Structures
### Session File Structure
```
.workflow/active/WFS-test-{session}/
├── workflow-session.json
├── IMPL_PLAN.md, TODO_LIST.md
├── .task/
│ ├── IMPL-{001,002}.json # Initial tasks
│ ├── IMPL-fix-{N}.json # Generated fix tasks
│ └── IMPL-fix-{N}-parallel.json # Parallel exploration tasks
├── .process/
│ ├── iteration-state.json # Current iteration + strategy
│ ├── test-results.json # Latest test results
│ ├── test-output.log # Full test output
│ ├── fix-history.json # All fix attempts
│ ├── iteration-{N}-analysis.md # CLI analysis
│ └── iteration-{N}-cli-output.txt # Raw CLI output
└── .summaries/iteration-summaries/
```
### Iteration State JSON
```json
{
"session_id": "WFS-test-user-auth",
"current_task": "IMPL-002",
"current_iteration": 3,
"max_iterations": 10,
"selected_strategy": "aggressive",
"iterations": [
{
"iteration": 1,
"pass_rate": 70,
"strategy": "conservative",
"stuck_tests": [],
"regression_detected": false
},
{
"iteration": 2,
"pass_rate": 82,
"strategy": "conservative",
"stuck_tests": [],
"regression_detected": false
},
{
"iteration": 3,
"pass_rate": 89,
"strategy": "aggressive",
"stuck_tests": ["test_auth_edge_case"],
"regression_detected": false
}
],
"stuck_tests": ["test_auth_edge_case"],
"next_action": "execute_fix_task"
}
```
### Task JSON Template (Fix Iteration)
```json
{
"id": "IMPL-fix-3",
"title": "Fix test failures - Iteration 3 (aggressive)",
"status": "pending",
"meta": {
"type": "test-fix-iteration",
"agent": "@test-fix-agent",
"iteration": 3,
"strategy": "aggressive",
"max_iterations": 10
},
"context": {
"requirements": ["Fix identified test failures", "Address root causes"],
"failure_context": {
"failed_tests": ["test1", "test2"],
"error_messages": ["error1", "error2"],
"pass_rate": 89,
"stuck_tests": ["test_auth_edge_case"]
},
"fix_strategy": {
"approach": "Batch fix authentication validation across multiple endpoints",
"modification_points": ["src/auth/middleware.ts:validateToken:45-60"],
"confidence_score": 0.85,
"test_execution": {
"mode": "affected_only",
"affected_tests": ["tests/auth/*.test.ts"],
"fallback_to_full": true
}
}
}
}
```
## Error Handling
| Scenario | Action |
|----------|--------|
| Test execution error | Log error, retry with error context |
| CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
| Agent execution error | Save state, retry with simplified context |
| Max iterations reached | Generate failure report, mark blocked |
| Regression detected | Rollback last fix, switch to surgical strategy |
| Stuck tests detected | Switch to exploratory strategy (parallel) |
**CLI Fallback Chain**:
1. Try Gemini (primary)
2. If HTTP 429 or error: Try Qwen
3. If Qwen fails: Try Codex
4. If all fail: Mark as degraded, use basic pattern matching
## TodoWrite Coordination
**Structure**:
```javascript
TodoWrite({
todos: [
{
content: "Execute IMPL-001: Generate tests [code-developer]",
status: "completed",
activeForm: "Executing test generation"
},
{
content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]",
status: "in_progress",
activeForm: "Running test-fix iteration cycle"
},
{
content: " → Iteration 1: Initial test (pass: 70%, conservative)",
status: "completed",
activeForm: "Running initial tests"
},
{
content: " → Iteration 2: Fix validation (pass: 82%, conservative)",
status: "completed",
activeForm: "Fixing validation issues"
},
{
content: " → Iteration 3: Batch fix auth (pass: 89%, aggressive)",
status: "in_progress",
activeForm: "Fixing authentication issues"
}
]
});
```
**Update Rules**:
1. Add iteration item with strategy and pass rate
2. Mark completed after each iteration
3. Update parent task when all complete
## Usage
```bash
# Execute test-fix workflow (discovers active session)
/workflow:test-cycle-execute
# Resume interrupted session
/workflow:test-cycle-execute --resume-session="WFS-test-user-auth"
# Custom iteration limit
/workflow:test-cycle-execute --max-iterations=15
```
## Best Practices
1. **Iteration Limits**: Default 10, adjust based on complexity
2. **Commit Between Iterations**: Enables easier rollback
3. **Monitor Iteration Logs**: Review CLI analysis quality
4. **Trust Strategy Engine**: Let orchestrator select optimal approach
5. **Leverage Progressive Testing**: 70-90% faster iterations
6. **Watch for Stuck Tests**: Parallel exploration auto-triggers at iteration 3