mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-02-13 02:41:50 +08:00
Optimizations: - Reduced file size: 791 → 540 lines (-32%) - Merged redundant sections (Overview + Philosophy + Responsibility Matrix) - Simplified JSON examples and execution flow descriptions Enhancements: 1. Intelligent Strategy Engine - 4 strategies: conservative/aggressive/exploratory/surgical - Auto-selection based on pass rate, stuck tests, regression 2. Progressive Testing - Run affected tests during iterations (70-90% faster) - Full suite on final validation 3. Parallel Strategy Exploration - Trigger on stuck tests (3+ consecutive failures) - Git worktree-based parallel execution - Select best result by pass rate Changes: - Max iterations: 5 → 10 - CLI tools: Gemini/Qwen → Gemini/Qwen/Codex (fallback chain) - Add strategy metadata to iteration-state.json - Add test_execution config to fix task JSON Impact: - Faster iterations via progressive testing - Better stuck situation handling via parallel exploration - Adaptive strategy selection for complex scenarios
540 lines
18 KiB
Markdown
540 lines
18 KiB
Markdown
---
|
|
name: test-cycle-execute
|
|
description: Execute test-fix workflow with dynamic task generation and iterative fix cycles until test pass rate >= 95% or max iterations reached. Uses @cli-planning-agent for failure analysis and task generation.
|
|
argument-hint: "[--resume-session=\"session-id\"] [--max-iterations=N]"
|
|
allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*), Task(*)
|
|
---
|
|
|
|
# Workflow Test-Cycle-Execute Command
|
|
|
|
## Architecture & Philosophy
|
|
|
|
**Core Concept**: Dynamic test-fix orchestrator with adaptive task generation based on runtime analysis.
|
|
|
|
**Quality Gate**: Iterates until test pass rate >= 95% (with criticality assessment) or 100% (full pass).
|
|
|
|
**Key Differences from Standard Execute**:
|
|
- **Standard**: Pre-defined task queue → Sequential execution → Complete
|
|
- **Test-Cycle**: Initial tasks → Test → Analyze → Generate fix tasks → Execute → Re-test → Repeat
|
|
|
|
**Orchestrator Boundary (CRITICAL)**:
|
|
- This command is the **ONLY place** where test failures are handled
|
|
- All failure analysis delegated to **@cli-planning-agent** (Gemini/Qwen/Codex)
|
|
- All fixes applied by **@test-fix-agent**
|
|
- Orchestrator manages: pass rate calculation, criticality assessment, iteration loop, strategy selection
|
|
|
|
**Agent Coordination**:
|
|
- **Orchestrator**: Loop control, strategy selection, pass rate calculation, threshold decisions
|
|
- **@cli-planning-agent**: CLI analysis (Gemini/Qwen/Codex), root cause extraction, fix task generation
|
|
- **@test-fix-agent**: Test execution, code fixes, result reporting
|
|
|
|
**Resume Mode**: `--resume-session` flag skips discovery and continues from interruption point.
|
|
|
|
## Enhanced Features
|
|
|
|
### 1. Intelligent Strategy Engine
|
|
|
|
**Strategy Types**:
|
|
- **Conservative** (default): Single fix per iteration, full validation
|
|
- **Aggressive**: Batch fix similar failures (when pass rate >80% + high pattern similarity)
|
|
- **Exploratory**: Parallel multi-strategy attempt (when stuck on same failures 3+ times)
|
|
- **Surgical**: Minimal changes, rollback on regression
|
|
|
|
**Strategy Selection Logic**:
|
|
```javascript
|
|
// In orchestrator, before invoking @cli-planning-agent
|
|
function selectStrategy(context) {
|
|
const { passRate, iteration, stuckTests, regressionDetected } = context;
|
|
|
|
if (iteration <= 2) return "conservative";
|
|
if (passRate > 80 && context.failurePattern.similarity > 0.7) return "aggressive";
|
|
if (stuckTests > 0 && iteration >= 3) return "exploratory";
|
|
if (regressionDetected) return "surgical";
|
|
return "conservative";
|
|
}
|
|
```
|
|
|
|
**Integration**: Orchestrator passes selected strategy to @cli-planning-agent in prompt.
|
|
|
|
### 2. Progressive Testing
|
|
|
|
**Concept**: Run affected tests during iterations, full suite on final validation.
|
|
|
|
**Affected Test Detection**:
|
|
- @cli-planning-agent analyzes modified files from fix strategy
|
|
- Maps to test files via imports and integration patterns
|
|
- Returns `affected_tests[]` in fix task JSON
|
|
|
|
**Execution**:
|
|
- **Iteration N**: `npm test -- ${affected_tests.join(' ')}` (fast feedback)
|
|
- **Final validation**: `npm test` (comprehensive check)
|
|
|
|
**Benefits**: 70-90% iteration speed improvement.
|
|
|
|
**Task JSON Integration**:
|
|
```json
|
|
{
|
|
"context": {
|
|
"fix_strategy": {
|
|
"test_execution": {
|
|
"mode": "affected_only",
|
|
"affected_tests": ["tests/auth/client.test.ts"],
|
|
"fallback_to_full": true
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### 3. Parallel Strategy Exploration
|
|
|
|
**Trigger**: Iteration >= 3 AND stuck tests detected (same test failed 3+ consecutive times)
|
|
|
|
**Workflow**:
|
|
```
|
|
Stuck State Detected → Switch to "exploratory" strategy:
|
|
├── @cli-planning-agent generates 3 alternative fix approaches
|
|
├── Orchestrator creates 3 task JSONs: IMPL-fix-Na, IMPL-fix-Nb, IMPL-fix-Nc
|
|
├── Execute in parallel via git worktree:
|
|
│ ├── worktree-a/ → Apply fix-Na → Test → Score: 92%
|
|
│ ├── worktree-b/ → Apply fix-Nb → Test → Score: 88%
|
|
│ └── worktree-c/ → Apply fix-Nc → Test → Score: 95% ✓
|
|
└── Select best result, apply to main branch
|
|
```
|
|
|
|
**Implementation**:
|
|
```bash
|
|
# Orchestrator creates worktrees
|
|
git worktree add ../worktree-a
|
|
git worktree add ../worktree-b
|
|
git worktree add ../worktree-c
|
|
|
|
# Launch 3 @test-fix-agent instances in parallel
|
|
Task(subagent_type="test-fix-agent", prompt="...", model="haiku") x3
|
|
|
|
# Collect results, select winner by pass_rate
|
|
```
|
|
|
|
**Task JSON**:
|
|
```json
|
|
{
|
|
"id": "IMPL-fix-3-parallel",
|
|
"meta": {
|
|
"type": "parallel-exploration",
|
|
"strategies": ["IMPL-fix-3a", "IMPL-fix-3b", "IMPL-fix-3c"],
|
|
"selection_criteria": "highest_pass_rate"
|
|
}
|
|
}
|
|
```
|
|
|
|
## Core Responsibilities
|
|
|
|
**Orchestrator**:
|
|
- Session discovery and task queue management
|
|
- Pass rate calculation: `(passed / total) * 100` from test-results.json
|
|
- Criticality assessment from test-results.json
|
|
- Strategy selection (conservative/aggressive/exploratory/surgical)
|
|
- Stuck test detection (same test failed 3+ times)
|
|
- Regression detection (pass rate dropped >10%)
|
|
- Iteration control (max 10 iterations, default)
|
|
- CLI tool selection (Gemini → Qwen → Codex fallback chain)
|
|
- Failure analysis delegation to @cli-planning-agent
|
|
- TodoWrite progress tracking
|
|
- Session auto-complete when pass rate >= 95%
|
|
|
|
**@cli-planning-agent**:
|
|
- Execute CLI analysis (Gemini/Qwen/Codex) with bug diagnosis template
|
|
- Parse CLI output for root causes and fix strategy
|
|
- Generate IMPL-fix-N.json with structured task definition
|
|
- Detect affected tests for progressive testing
|
|
- Save analysis reports and CLI output
|
|
|
|
**@test-fix-agent**:
|
|
- Execute tests and save results to test-results.json
|
|
- Apply fixes from task JSON fix_strategy
|
|
- Assign criticality to test failures
|
|
- Update task status
|
|
|
|
## Execution Flow
|
|
|
|
### Phase 1: Discovery & Initialization
|
|
1. Detect test-fix session from `workflow_type: "test_session"`
|
|
2. Load workflow-session.json, IMPL_PLAN.md, .task/*.json
|
|
3. Initialize TodoWrite with initial tasks
|
|
4. Setup iteration context (current=0, max=10, strategy="conservative")
|
|
|
|
**Resume Mode**: Load iteration-state.json, continue from `next_action`.
|
|
|
|
### Phase 2: Task Execution Loop
|
|
|
|
```javascript
|
|
For each task in queue:
|
|
1. Load task JSON and context
|
|
2. Determine task type (test-gen, test-fix, fix-iteration, parallel-exploration)
|
|
3. Execute via appropriate agent
|
|
4. Collect results from test-results.json
|
|
5. Calculate pass rate: (passed / total) * 100
|
|
6. Assess criticality from test-results.json
|
|
7. Detect regression: if passRate < previousPassRate - 10% → REGRESSION
|
|
8. Detect stuck tests: if same test failed 3+ consecutive iterations → STUCK
|
|
|
|
9. Make threshold decision:
|
|
IF passRate === 100%:
|
|
→ SUCCESS: Mark complete, continue to next task
|
|
|
|
ELSE IF passRate >= 95%:
|
|
→ CHECK criticality:
|
|
- All "low" → PARTIAL SUCCESS (auto-approve with note)
|
|
- Any "high"/"medium" → Enter fix loop (step 10)
|
|
|
|
ELSE IF passRate < 95%:
|
|
→ FAILED: Enter fix loop (step 10)
|
|
|
|
10. Fix Loop (if needed):
|
|
a. Select strategy: conservative/aggressive/exploratory/surgical
|
|
b. If strategy === "exploratory":
|
|
- Invoke @cli-planning-agent for 3 alternative approaches
|
|
- Create parallel tasks (IMPL-fix-Na, Nb, Nc)
|
|
- Execute in parallel via git worktree
|
|
- Select best result (highest pass_rate)
|
|
- Apply to main branch
|
|
c. Else:
|
|
- Invoke @cli-planning-agent with selected strategy
|
|
- Generate single IMPL-fix-N.json
|
|
- Insert at queue front
|
|
d. Continue loop
|
|
|
|
11. Check max iterations (abort if exceeded)
|
|
```
|
|
|
|
### Phase 3: Iteration Cycle (Per Fix Task)
|
|
|
|
**Structure**:
|
|
```
|
|
Iteration N:
|
|
├── 1. Test Execution (via @test-fix-agent)
|
|
│ ├── Mode: affected_only or full_suite (from task JSON)
|
|
│ ├── Save results: test-results.json, test-output.log
|
|
│ └── Report to orchestrator
|
|
├── 2. Pass Rate Validation (orchestrator)
|
|
│ ├── Calculate: (passed / total) * 100
|
|
│ ├── Assess criticality
|
|
│ ├── Detect regression/stuck
|
|
│ └── Select strategy
|
|
├── 3. Failure Analysis (via @cli-planning-agent)
|
|
│ ├── Input: failure context + selected strategy
|
|
│ ├── CLI tool: Gemini (fallback: Qwen → Codex)
|
|
│ ├── Output: IMPL-fix-N.json, iteration-N-analysis.md, cli-output.txt
|
|
│ └── Return task ID to orchestrator
|
|
└── 4. Fix Execution (via @test-fix-agent)
|
|
├── Load fix_strategy from task JSON
|
|
├── Apply changes (surgical fixes)
|
|
└── Return to step 1 for re-test
|
|
```
|
|
|
|
### Phase 4: Completion
|
|
|
|
**Success Conditions**:
|
|
- All tasks completed
|
|
- Pass rate === 100% (full success)
|
|
|
|
**Partial Success Conditions**:
|
|
- All tasks completed
|
|
- Pass rate >= 95% and < 100%
|
|
- All failures are "low" criticality
|
|
- **Action**: Auto-approve with review note
|
|
|
|
**Completion Steps**:
|
|
1. Final validation: Run full test suite
|
|
2. Calculate final pass rate
|
|
3. Assess status (full/partial/failure)
|
|
4. Update session state
|
|
5. Generate summary with pass rate metrics
|
|
6. Call `/workflow:session:complete`
|
|
|
|
**Failure Conditions**:
|
|
- Max iterations (10) reached without 95% pass rate
|
|
- Pass rate < 95% after max iterations
|
|
- Unrecoverable errors
|
|
|
|
**Failure Handling**:
|
|
1. Document final state with pass rate
|
|
2. Generate failure report (remaining failures, iteration history, recommendations)
|
|
3. Mark tasks as blocked
|
|
4. Return control to user
|
|
|
|
## Agent Coordination Details
|
|
|
|
### @cli-planning-agent Invocation
|
|
|
|
**CLI Tool Selection** (fallback chain):
|
|
1. **Gemini** (primary): `gemini-2.5-pro`
|
|
2. **Qwen** (fallback): `coder-model`
|
|
3. **Codex** (fallback): `gpt-5.1-codex`
|
|
|
|
**Template**: `~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt`
|
|
|
|
**Timeout**: 40min (2400000ms)
|
|
|
|
**Prompt Structure**:
|
|
```javascript
|
|
Task(
|
|
subagent_type="cli-planning-agent",
|
|
description=`Analyze test failures (iteration ${N}) - ${strategy} strategy`,
|
|
prompt=`
|
|
## Task Objective
|
|
Analyze test failures and generate fix task JSON for iteration ${N}
|
|
|
|
## Strategy
|
|
${selectedStrategy} - ${strategyDescription}
|
|
|
|
## MANDATORY FIRST STEPS
|
|
1. Read test results: ${session.test_results_path}
|
|
2. Read test output: ${session.test_output_path}
|
|
3. Read iteration state: ${session.iteration_state_path}
|
|
|
|
## Session Paths
|
|
- Workflow Dir: ${session.workflow_dir}
|
|
- Test Results: ${session.test_results_path}
|
|
- Task Output Dir: ${session.task_dir}
|
|
- Analysis Output: ${session.process_dir}/iteration-${N}-analysis.md
|
|
|
|
## Context Metadata
|
|
- Session ID: ${sessionId}
|
|
- Current Iteration: ${N}
|
|
- Max Iterations: 10
|
|
- Current Pass Rate: ${passRate}%
|
|
- Selected Strategy: ${selectedStrategy}
|
|
- Stuck Tests: ${stuckTests}
|
|
|
|
## CLI Configuration
|
|
- Tool Priority: gemini → qwen → codex
|
|
- Template: 01-diagnose-bug-root-cause.txt
|
|
- Timeout: 2400000ms
|
|
|
|
## Expected Deliverables
|
|
1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
|
|
- Must include: fix_strategy.test_execution.affected_tests[]
|
|
- Must include: fix_strategy.confidence_score
|
|
2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
|
|
3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt
|
|
4. Return task ID to orchestrator
|
|
|
|
## Strategy-Specific Requirements
|
|
${strategyRequirements[selectedStrategy]}
|
|
|
|
## Success Criteria
|
|
- Valid IMPL-fix-${N}.json with all required fields
|
|
- Concrete fix strategy with modification points (file:function:lines)
|
|
- Affected tests list for progressive testing
|
|
- Root cause analysis (not just symptoms)
|
|
`
|
|
)
|
|
```
|
|
|
|
**Strategy-Specific Requirements**:
|
|
```javascript
|
|
const strategyRequirements = {
|
|
conservative: "Single targeted fix, high confidence required",
|
|
aggressive: "Batch fix similar failures, pattern-based approach",
|
|
exploratory: "Generate 3 alternative approaches with different root cause hypotheses",
|
|
surgical: "Minimal changes, focus on rollback safety"
|
|
};
|
|
```
|
|
|
|
### @test-fix-agent Context Loading
|
|
|
|
**File Loading Sequence**:
|
|
1. Task JSON (`task_json_path`)
|
|
2. Iteration state (`iteration_state_path`)
|
|
3. Test results (`test_results_path`)
|
|
4. Test output (`test_output_path`)
|
|
5. Analysis report (`analysis_path`)
|
|
6. Fix history (`fix_history_path`)
|
|
|
|
**Paths Provided by Orchestrator**:
|
|
```javascript
|
|
{
|
|
"task_json_path": ".workflow/active/WFS-test-{session}/.task/IMPL-fix-N.json",
|
|
"iteration_state_path": ".workflow/active/WFS-test-{session}/.process/iteration-state.json",
|
|
"test_results_path": ".workflow/active/WFS-test-{session}/.process/test-results.json",
|
|
"test_output_path": ".workflow/active/WFS-test-{session}/.process/test-output.log",
|
|
"analysis_path": ".workflow/active/WFS-test-{session}/.process/iteration-N-analysis.md",
|
|
"workflow_dir": ".workflow/active/WFS-test-{session}/",
|
|
"session_id": "WFS-test-{session}",
|
|
"current_iteration": N,
|
|
"max_iterations": 10
|
|
}
|
|
```
|
|
|
|
## Data Structures
|
|
|
|
### Session File Structure
|
|
```
|
|
.workflow/active/WFS-test-{session}/
|
|
├── workflow-session.json
|
|
├── IMPL_PLAN.md, TODO_LIST.md
|
|
├── .task/
|
|
│ ├── IMPL-{001,002}.json # Initial tasks
|
|
│ ├── IMPL-fix-{N}.json # Generated fix tasks
|
|
│ └── IMPL-fix-{N}-parallel.json # Parallel exploration tasks
|
|
├── .process/
|
|
│ ├── iteration-state.json # Current iteration + strategy
|
|
│ ├── test-results.json # Latest test results
|
|
│ ├── test-output.log # Full test output
|
|
│ ├── fix-history.json # All fix attempts
|
|
│ ├── iteration-{N}-analysis.md # CLI analysis
|
|
│ └── iteration-{N}-cli-output.txt # Raw CLI output
|
|
└── .summaries/iteration-summaries/
|
|
```
|
|
|
|
### Iteration State JSON
|
|
```json
|
|
{
|
|
"session_id": "WFS-test-user-auth",
|
|
"current_task": "IMPL-002",
|
|
"current_iteration": 3,
|
|
"max_iterations": 10,
|
|
"selected_strategy": "aggressive",
|
|
"iterations": [
|
|
{
|
|
"iteration": 1,
|
|
"pass_rate": 70,
|
|
"strategy": "conservative",
|
|
"stuck_tests": [],
|
|
"regression_detected": false
|
|
},
|
|
{
|
|
"iteration": 2,
|
|
"pass_rate": 82,
|
|
"strategy": "conservative",
|
|
"stuck_tests": [],
|
|
"regression_detected": false
|
|
},
|
|
{
|
|
"iteration": 3,
|
|
"pass_rate": 89,
|
|
"strategy": "aggressive",
|
|
"stuck_tests": ["test_auth_edge_case"],
|
|
"regression_detected": false
|
|
}
|
|
],
|
|
"stuck_tests": ["test_auth_edge_case"],
|
|
"next_action": "execute_fix_task"
|
|
}
|
|
```
|
|
|
|
### Task JSON Template (Fix Iteration)
|
|
```json
|
|
{
|
|
"id": "IMPL-fix-3",
|
|
"title": "Fix test failures - Iteration 3 (aggressive)",
|
|
"status": "pending",
|
|
"meta": {
|
|
"type": "test-fix-iteration",
|
|
"agent": "@test-fix-agent",
|
|
"iteration": 3,
|
|
"strategy": "aggressive",
|
|
"max_iterations": 10
|
|
},
|
|
"context": {
|
|
"requirements": ["Fix identified test failures", "Address root causes"],
|
|
"failure_context": {
|
|
"failed_tests": ["test1", "test2"],
|
|
"error_messages": ["error1", "error2"],
|
|
"pass_rate": 89,
|
|
"stuck_tests": ["test_auth_edge_case"]
|
|
},
|
|
"fix_strategy": {
|
|
"approach": "Batch fix authentication validation across multiple endpoints",
|
|
"modification_points": ["src/auth/middleware.ts:validateToken:45-60"],
|
|
"confidence_score": 0.85,
|
|
"test_execution": {
|
|
"mode": "affected_only",
|
|
"affected_tests": ["tests/auth/*.test.ts"],
|
|
"fallback_to_full": true
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
## Error Handling
|
|
|
|
| Scenario | Action |
|
|
|----------|--------|
|
|
| Test execution error | Log error, retry with error context |
|
|
| CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
|
|
| Agent execution error | Save state, retry with simplified context |
|
|
| Max iterations reached | Generate failure report, mark blocked |
|
|
| Regression detected | Rollback last fix, switch to surgical strategy |
|
|
| Stuck tests detected | Switch to exploratory strategy (parallel) |
|
|
|
|
**CLI Fallback Chain**:
|
|
1. Try Gemini (primary)
|
|
2. If HTTP 429 or error: Try Qwen
|
|
3. If Qwen fails: Try Codex
|
|
4. If all fail: Mark as degraded, use basic pattern matching
|
|
|
|
## TodoWrite Coordination
|
|
|
|
**Structure**:
|
|
```javascript
|
|
TodoWrite({
|
|
todos: [
|
|
{
|
|
content: "Execute IMPL-001: Generate tests [code-developer]",
|
|
status: "completed",
|
|
activeForm: "Executing test generation"
|
|
},
|
|
{
|
|
content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]",
|
|
status: "in_progress",
|
|
activeForm: "Running test-fix iteration cycle"
|
|
},
|
|
{
|
|
content: " → Iteration 1: Initial test (pass: 70%, conservative)",
|
|
status: "completed",
|
|
activeForm: "Running initial tests"
|
|
},
|
|
{
|
|
content: " → Iteration 2: Fix validation (pass: 82%, conservative)",
|
|
status: "completed",
|
|
activeForm: "Fixing validation issues"
|
|
},
|
|
{
|
|
content: " → Iteration 3: Batch fix auth (pass: 89%, aggressive)",
|
|
status: "in_progress",
|
|
activeForm: "Fixing authentication issues"
|
|
}
|
|
]
|
|
});
|
|
```
|
|
|
|
**Update Rules**:
|
|
1. Add iteration item with strategy and pass rate
|
|
2. Mark completed after each iteration
|
|
3. Update parent task when all complete
|
|
|
|
## Usage
|
|
|
|
```bash
|
|
# Execute test-fix workflow (discovers active session)
|
|
/workflow:test-cycle-execute
|
|
|
|
# Resume interrupted session
|
|
/workflow:test-cycle-execute --resume-session="WFS-test-user-auth"
|
|
|
|
# Custom iteration limit
|
|
/workflow:test-cycle-execute --max-iterations=15
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
1. **Iteration Limits**: Default 10, adjust based on complexity
|
|
2. **Commit Between Iterations**: Enables easier rollback
|
|
3. **Monitor Iteration Logs**: Review CLI analysis quality
|
|
4. **Trust Strategy Engine**: Let orchestrator select optimal approach
|
|
5. **Leverage Progressive Testing**: 70-90% faster iterations
|
|
6. **Watch for Stuck Tests**: Parallel exploration auto-triggers at iteration 3
|