feat: Implement dynamic test-fix execution phase with adaptive task generation

- Added Phase 2: Test-Cycle Execution documentation outlining the process for dynamic test-fix execution, including agent roles, core responsibilities, intelligent strategy engine, and progressive testing.
- Introduced new PowerShell scripts for analyzing TypeScript errors, focusing on error categorization and reporting.
- Created end-to-end tests for the Help Page, ensuring content visibility, documentation navigation, internationalization support, and accessibility compliance.
2026-02-12 02:37:45 +08:00 · 2026-02-07 17:01:30 +08:00
parent 4ce4419ea6
commit ba5f4eba84
70 changed files with 7288 additions and 488 deletions
--- a/.codex/skills/workflow-test-fix-cycle/SKILL.md
+++ b/.codex/skills/workflow-test-fix-cycle/SKILL.md
@@ -0,0 +1,392 @@
+---
+name: workflow-test-fix-cycle
+description: 'End-to-end test-fix workflow: generate test sessions with progressive layers (L0-L3), then execute iterative fix cycles until pass rate >= 95%. Combines test-fix-gen and test-cycle-execute into a unified pipeline. Triggers on "workflow:test-fix-cycle".'
+allowed-tools: spawn_agent, wait, send_input, close_agent, AskUserQuestion, Read, Write, Edit, Bash, Glob, Grep
+---
+
+# Workflow Test-Fix Cycle
+
+End-to-end test-fix workflow pipeline: generate test sessions with progressive layers (L0-L3), AI code validation, and task generation (Phase 1), then execute iterative fix cycles with adaptive strategy engine until pass rate >= 95% (Phase 2).
+
+## Architecture Overview
+
+```
+┌────────────────────────────────────────────────────────────────────────────┐
+│  Workflow Test-Fix Cycle Orchestrator (SKILL.md)                            │
+│  → Full pipeline: Test generation + Iterative execution                     │
+│  → Phase dispatch: Read phase docs, execute, pass context                   │
+└───────────────┬────────────────────────────────────────────────────────────┘
+                │
+   ┌────────────┴────────────────────────┐
+   ↓                                     ↓
+┌─────────────────────────┐   ┌─────────────────────────────┐
+│ Phase 1: Test-Fix Gen   │   │ Phase 2: Test-Cycle Execute  │
+│ phases/01-test-fix-gen  │   │ phases/02-test-cycle-execute │
+│ 5 sub-phases:           │   │ 3 stages:                    │
+│ ① Create Session        │   │ ① Discovery                  │
+│ ② Gather Context        │   │ ② Main Loop (iterate)        │
+│ ③ Test Analysis (Gemini)│   │ ③ Completion                 │
+│ ④ Generate Tasks        │   │                              │
+│ ⑤ Summary               │   │ Agents (via spawn_agent):    │
+│                         │   │ @cli-planning-agent           │
+│ Agents (via spawn_agent)│   │ @test-fix-agent               │
+│ @test-context-search    │   │                              │
+│ @context-search         │   │ Strategy: conservative →      │
+│ @cli-execution          │   │ aggressive → surgical          │
+│ @action-planning        │   │                              │
+└────────┬────────────────┘   └────────────┬──────────────────┘
+         ↓                                 ↓
+   IMPL-001..002.json              Pass Rate >= 95%
+   TEST_ANALYSIS_RESULTS.md        Auto-complete session
+
+Task Pipeline:
+┌──────────────┐    ┌─────────────────┐    ┌─────────────────┐    ┌──────────────┐
+│  IMPL-001    │───→│  IMPL-001.3     │───→│  IMPL-001.5     │───→│  IMPL-002    │
+│  Test Gen    │    │  Code Validate  │    │  Quality Gate   │    │  Test & Fix  │
+│  L1-L3       │    │  L0 + AI Issues │    │  Coverage 80%+  │    │  Max 10 iter │
+│@code-developer│   │ @test-fix-agent │    │ @test-fix-agent │    │@test-fix-agent│
+└──────────────┘    └─────────────────┘    └─────────────────┘    └──────────────┘
+                                                                        │
+                                                              Fix Loop: │
+                                                    ┌──────────────────┘
+                                                    ↓
+                                              ┌──────────┐
+                                              │ @cli-plan│───→ IMPL-fix-N.json
+                                              │  agent   │
+                                              ├──────────┤
+                                              │@test-fix │───→ Apply & re-test
+                                              │  agent   │
+                                              └──────────┘
+```
+
+## Key Design Principles
+
+1. **Two-Phase Pipeline**: Generation (Phase 1) creates session + tasks, Execution (Phase 2) runs iterative fix cycles
+2. **Pure Orchestrator**: Dispatch to phase docs, parse outputs, pass context between phases
+3. **Auto-Continue**: Full pipeline runs autonomously once triggered
+4. **Subagent Lifecycle**: Explicit lifecycle management with spawn_agent → wait → close_agent
+5. **Progressive Test Layers**: L0 (Static) → L1 (Unit) → L2 (Integration) → L3 (E2E)
+6. **AI Code Issue Detection**: Validates against common AI-generated code problems
+7. **Intelligent Strategy Engine**: conservative → aggressive → surgical based on iteration context
+8. **CLI Fallback Chain**: Gemini → Qwen → Codex for analysis resilience
+9. **Progressive Testing**: Affected tests during iterations, full suite for final validation
+10. **Role Path Loading**: Subagent roles loaded via path reference in MANDATORY FIRST STEPS
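Principle 8 (the CLI fallback chain) can be sketched as a loop that tries each analysis tool in order and returns the first success. This is only an illustrative sketch, not the skill's implementation: `runWithFallback` and the `runTool` callback are hypothetical names, and how each CLI is actually invoked is left to the caller.

```javascript
// Hypothetical helper: try each CLI in order until one succeeds.
// `runTool(name)` is an assumed callback that throws when the tool fails.
function runWithFallback(tools, runTool) {
  const errors = [];
  for (const tool of tools) {
    try {
      return { tool, output: runTool(tool), errors };
    } catch (err) {
      // Record the failure and move on to the next tool in the chain.
      errors.push({ tool, message: String(err.message || err) });
    }
  }
  // Every tool failed: the caller has to fall back to manual analysis.
  return { tool: null, output: null, errors };
}

// Usage: simulate Gemini failing and Qwen succeeding.
const fallbackResult = runWithFallback(["gemini", "qwen", "codex"], (tool) => {
  if (tool === "gemini") throw new Error("rate limited");
  return `analysis from ${tool}`;
});
console.log(fallbackResult.tool); // prints "qwen"
```

Returning the collected errors alongside the result keeps a record of why earlier tools were skipped, which could feed the failure report.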
+
+## Auto Mode
+
+This workflow is fully autonomous - Phase 1 generates test session and tasks, Phase 2 executes iterative fix cycles, all without user intervention until pass rate >= 95% or max iterations reached.
+
+## Subagent API Reference
+
+### spawn_agent
+Create a new subagent with task assignment.
+
+```javascript
+const agentId = spawn_agent({
+  message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/{agent-type}.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+## TASK CONTEXT
+${taskContext}
+
+## DELIVERABLES
+${deliverables}
+`
+})
+```
+
+### wait
+Get results from subagent (only way to retrieve results).
+
+```javascript
+const result = wait({
+  ids: [agentId],
+  timeout_ms: 600000  // 10 minutes
+})
+
+if (result.timed_out) {
+  // Handle timeout - can continue waiting or send_input to prompt completion
+}
+```
+
+### send_input
+Continue interaction with active subagent (for clarification or follow-up).
+
+```javascript
+send_input({
+  id: agentId,
+  message: `
+## CLARIFICATION ANSWERS
+${answers}
+
+## NEXT STEP
+Continue with plan generation.
+`
+})
+```
+
+### close_agent
+Clean up subagent resources (irreversible).
+
+```javascript
+close_agent({ id: agentId })
+```
+
+## Usage
+
+```
+workflow-test-fix-cycle <input> [options]
+
+# Input (Phase 1 - Test Generation)
+source-session-id    WFS-* session ID (Session Mode - test validation for completed implementation)
+feature description  Text description of what to test (Prompt Mode)
+/path/to/file.md     Path to requirements file (Prompt Mode)
+
+# Options (Phase 2 - Cycle Execution)
+--max-iterations=N   Custom iteration limit (default: 10)
+
+# Examples
+workflow-test-fix-cycle WFS-user-auth-v2                                              # Session Mode
+workflow-test-fix-cycle "Test the user authentication API endpoints in src/auth/api.ts" # Prompt Mode - text
+workflow-test-fix-cycle ./docs/api-requirements.md                                     # Prompt Mode - file
+workflow-test-fix-cycle "Test user registration" --max-iterations=15                    # With custom iterations
+
+# Resume (Phase 2 only - session already created)
+workflow-test-fix-cycle --resume-session="WFS-test-user-auth"                          # Resume interrupted session
+```
+
+**Quality Gate**: Test pass rate of 100%, or >= 95% when the remaining failures are low criticality (criticality-aware)
+**Max Iterations**: 10 (default, adjustable)
+**CLI Tools**: Gemini → Qwen → Codex (fallback chain)
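The usage forms above imply three input modes plus one numeric option. A minimal sketch of how an orchestrator might classify them follows; `parseArgs` and the shape of its return value are hypothetical, and only the flags and the `WFS-` prefix come from the usage text.

```javascript
// Hypothetical argument parser for the usage forms listed above.
function parseArgs(argv) {
  const opts = { mode: null, input: null, maxIterations: 10, resumeSession: null };
  for (const arg of argv) {
    if (arg.startsWith("--max-iterations=")) {
      const n = Number.parseInt(arg.slice("--max-iterations=".length), 10);
      // Keep the default of 10 when the value is not a positive integer.
      if (Number.isInteger(n) && n > 0) opts.maxIterations = n;
    } else if (arg.startsWith("--resume-session=")) {
      // Strip optional surrounding double quotes from the session ID.
      opts.resumeSession = arg.slice("--resume-session=".length).replace(/^"|"$/g, "");
      opts.mode = "resume";
    } else if (opts.input === null) {
      opts.input = arg;
    }
  }
  if (opts.mode === null) {
    // WFS-* inputs select Session Mode; anything else is Prompt Mode.
    opts.mode = opts.input !== null && opts.input.startsWith("WFS-") ? "session" : "prompt";
  }
  return opts;
}
```

In this sketch a file path and a free-text description both land in Prompt Mode; telling them apart (for example by checking whether the path exists) is left out.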
+
+## Test Strategy Overview
+
+Progressive Test Layers (L0-L3):
+
+| Layer | Name | Focus |
+|-------|------|-------|
+| **L0** | Static Analysis | Compilation, imports, types, AI code issues |
+| **L1** | Unit Tests | Function/class behavior (happy/negative/edge cases) |
+| **L2** | Integration Tests | Component interactions, API contracts, failure modes |
+| **L3** | E2E Tests | User journeys, critical paths (optional) |
+
+**Key Features**:
+- **AI Code Issue Detection** - Validates against common AI-generated code problems (hallucinated imports, placeholder code, mock leakage, etc.)
+- **Project Type Detection** - Applies appropriate test templates (React, Node API, CLI, Library, etc.)
+- **Quality Gates** - IMPL-001.3 (code validation) and IMPL-001.5 (test quality) ensure high standards
+
+**Detailed specifications**: See the test-task-generate workflow tool for complete L0-L3 requirements and quality thresholds.
+
+## Execution Flow
+
+```
+Input → Detect Mode (session | prompt | resume)
+  │
+  ├─ resume mode → Skip to Phase 2
+  │
+  └─ session/prompt mode → Phase 1
+       │
+Phase 1: Test-Fix Generation (phases/01-test-fix-gen.md)
+  ├─ Sub-phase 1.1: Create Test Session → testSessionId
+  ├─ Sub-phase 1.2: Gather Test Context (spawn_agent) → contextPath
+  ├─ Sub-phase 1.3: Test Generation Analysis (spawn_agent → Gemini) → TEST_ANALYSIS_RESULTS.md
+  ├─ Sub-phase 1.4: Generate Test Tasks (spawn_agent) → IMPL-*.json, IMPL_PLAN.md, TODO_LIST.md
+  └─ Sub-phase 1.5: Phase 1 Summary
+       │
+Phase 2: Test-Cycle Execution (phases/02-test-cycle-execute.md)
+  ├─ Discovery: Load session, tasks, iteration state
+  ├─ Main Loop (for each task):
+  │   ├─ Execute → Test → Calculate pass_rate
+  │   ├─ 100% → SUCCESS: Next task
+  │   ├─ 95-99% + low criticality → PARTIAL SUCCESS: Approve
+  │   └─ <95% → Fix Loop:
+  │       ├─ Select strategy: conservative/aggressive/surgical
+  │       ├─ spawn_agent(@cli-planning-agent) → IMPL-fix-N.json
+  │       ├─ spawn_agent(@test-fix-agent) → Apply fix & re-test
+  │       └─ Re-test → Back to decision
+  └─ Completion: Final validation → Summary → Auto-complete session
+```
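The three Main Loop branches in the flow above can be sketched as one decision function. `decideOutcome` and the `criticality` field are hypothetical. The flow does not say what happens at 95-99% when a failure is not low criticality, so this sketch assumes that case also enters the fix loop.

```javascript
// Hypothetical decision helper mirroring the Main Loop branches above.
// `failures` is an assumed array of { criticality: "low" | "medium" | "high" }.
function decideOutcome(passed, total, failures) {
  if (total === 0) throw new Error("no tests were run");
  const passRate = (passed / total) * 100;
  if (passRate === 100) return { passRate, outcome: "success" };
  const allLow = failures.every((f) => f.criticality === "low");
  // 95-99% is approved only when every remaining failure is low criticality.
  if (passRate >= 95 && allLow) return { passRate, outcome: "partial_success" };
  // Assumption: anything else, including 95-99% with a non-low failure, is fixed.
  return { passRate, outcome: "fix_loop" };
}
```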
+
+## Core Rules
+
+1. **Start Immediately**: First action is progress tracking initialization
+2. **No Preliminary Analysis**: Do not read files before Phase 1
+3. **Parse Every Output**: Extract data from each phase for the next
+4. **Auto-Continue**: After each phase finishes, automatically execute next pending phase
+5. **Phase Loading**: Read phase doc on-demand (`phases/01-*.md`, `phases/02-*.md`)
+6. **Task Attachment Model**: Sub-tasks ATTACH → execute → COLLAPSE
+7. **CRITICAL: DO NOT STOP**: Continuous pipeline until Phase 2 completion
+8. **Phase Transition**: After Phase 1 summary, immediately begin Phase 2
+9. **Explicit Lifecycle**: Always close_agent after wait completes to free resources
+
+## Phase Execution
+
+### Phase 1: Test-Fix Generation
+
+**Read**: `phases/01-test-fix-gen.md`
+
+5 sub-phases that create a test session and generate task JSONs:
+1. Create Test Session → `testSessionId`
+2. Gather Test Context (spawn_agent → wait → close_agent) → `contextPath`
+3. Test Generation Analysis (spawn_agent → wait → close_agent) → `TEST_ANALYSIS_RESULTS.md`
+4. Generate Test Tasks (spawn_agent → wait → close_agent) → `IMPL-001.json`, `IMPL-001.3-validation.json`, `IMPL-001.5-review.json`, `IMPL-002.json`, `IMPL_PLAN.md`, `TODO_LIST.md`
+5. Phase 1 Summary (internal - transitions to Phase 2)
+
+**Agents Used** (via spawn_agent):
+- `test-context-search-agent` (~/.codex/agents/test-context-search-agent.md) - Context gathering (Session Mode)
+- `context-search-agent` (~/.codex/agents/context-search-agent.md) - Context gathering (Prompt Mode)
+- `cli-execution-agent` (~/.codex/agents/cli-execution-agent.md) - Test analysis with Gemini
+- `action-planning-agent` (~/.codex/agents/action-planning-agent.md) - Task JSON generation
+
+### Phase 2: Test-Cycle Execution
+
+**Read**: `phases/02-test-cycle-execute.md`
+
+3-stage iterative execution with adaptive strategy:
+1. Discovery - Load session, tasks, iteration state
+2. Main Loop - Execute tasks → Test → Analyze failures → Fix → Re-test
+3. Completion - Final validation → Summary → Auto-complete session
+
+**Agents Used** (via spawn_agent):
+- `cli-planning-agent` (~/.codex/agents/cli-planning-agent.md) - Failure analysis, root cause extraction, fix task generation
+- `test-fix-agent` (~/.codex/agents/test-fix-agent.md) - Test execution, code fixes, criticality assignment
+
+**Strategy Engine**: conservative (iteration 1-2) → aggressive (pass >80%) → surgical (regression)
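One possible reading of the Strategy Engine line is sketched below. The three conditions come from that line, but their precedence (regression first, then early iterations, then pass rate) and the conservative fallback are assumptions; `selectStrategy` is a hypothetical name.

```javascript
// Hypothetical strategy selector; the precedence of the checks is an assumption.
function selectStrategy({ iteration, passRate, regressionDetected }) {
  if (regressionDetected) return "surgical";   // a regression overrides everything
  if (iteration <= 2) return "conservative";   // iterations 1-2
  if (passRate > 80) return "aggressive";      // later iterations above 80%
  return "conservative";                       // assumed default below 80%
}
```

Under this ordering, iteration 2 at 82% stays conservative and iteration 3 at 89% becomes aggressive.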
+
+## Output Artifacts
+
+### Directory Structure
+
+```
+.workflow/active/WFS-test-[session]/
+├── workflow-session.json              # Session metadata
+├── IMPL_PLAN.md                       # Test generation and execution strategy
+├── TODO_LIST.md                       # Task checklist
+├── .task/
+│   ├── IMPL-001.json                  # Test understanding & generation
+│   ├── IMPL-001.3-validation.json     # Code validation gate
+│   ├── IMPL-001.5-review.json         # Test quality gate
+│   ├── IMPL-002.json                  # Test execution & fix cycle
+│   └── IMPL-fix-{N}.json             # Generated fix tasks (Phase 2)
+├── .process/
+│   ├── [test-]context-package.json    # Context and coverage analysis
+│   ├── TEST_ANALYSIS_RESULTS.md       # Test requirements and strategy (L0-L3)
+│   ├── iteration-state.json           # Current iteration + strategy + stuck tests
+│   ├── test-results.json              # Latest results (pass_rate, criticality)
+│   ├── test-output.log                # Full test output
+│   ├── fix-history.json               # All fix attempts
+│   ├── iteration-{N}-analysis.md      # CLI analysis report
+│   └── iteration-{N}-cli-output.txt
+└── .summaries/iteration-summaries/
+```
+
+## Progress Tracking Pattern
+
+**Phase 1** (Generation):
+```javascript
+[
+  { content: "Phase 1: Test-Fix Generation", status: "in_progress" },
+  { content: "  1.1 Create Test Session", status: "completed" },
+  { content: "  1.2 Gather Test Context", status: "in_progress" },
+  { content: "  1.3 Test Generation Analysis", status: "pending" },
+  { content: "  1.4 Generate Test Tasks", status: "pending" },
+  { content: "  1.5 Phase Summary", status: "pending" },
+  { content: "Phase 2: Test-Cycle Execution", status: "pending" }
+]
+```
+
+**Phase 2** (Execution):
+```javascript
+[
+  { content: "Phase 1: Test-Fix Generation", status: "completed" },
+  { content: "Phase 2: Test-Cycle Execution", status: "in_progress" },
+  { content: "  Execute IMPL-001: Generate tests [code-developer]", status: "completed" },
+  { content: "  Execute IMPL-002: Test & Fix Cycle [ITERATION]", status: "in_progress" },
+  { content: "    → Iteration 1: Initial test (pass: 70%, conservative)", status: "completed" },
+  { content: "    → Iteration 2: Fix validation (pass: 82%, conservative)", status: "completed" },
+  { content: "    → Iteration 3: Batch fix auth (pass: 89%, aggressive)", status: "in_progress" }
+]
+```
+
+**Update Rules**:
+- Phase 1: Attach/collapse sub-phase tasks within Phase 1
+- Phase 2: Add iteration items with strategy and pass rate
+- Mark completed after each phase/iteration
+- Update parent task when all complete
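The Phase 2 rule (add iteration items with strategy and pass rate) can be sketched as a small formatter that produces entries in the shape of the tracking examples. `iterationItem` and its parameter names are hypothetical.

```javascript
// Hypothetical formatter for Phase 2 iteration entries.
function iterationItem({ n, label, passRate, strategy, status }) {
  return {
    content: `    → Iteration ${n}: ${label} (pass: ${passRate}%, ${strategy})`,
    status,
  };
}

const item = iterationItem({
  n: 3,
  label: "Batch fix auth",
  passRate: 89,
  strategy: "aggressive",
  status: "in_progress",
});
console.log(item.content);
```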
+
+## Error Handling
+
+| Phase | Scenario | Action |
+|-------|----------|--------|
+| 1.1 | Source session not found (session mode) | Return error with session ID |
+| 1.1 | No completed IMPL tasks (session mode) | Return error, source incomplete |
+| 1.2 | Context gathering failed | Return error, check source artifacts |
+| 1.2 | Agent timeout | Retry with extended timeout, close_agent, then return error |
+| 1.3 | Gemini analysis failed | Return error, check context package |
+| 1.4 | Task generation failed | Retry once, then return error |
+| 2 | Test execution error | Log, retry with error context |
+| 2 | CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
+| 2 | Agent execution error | Save state, close_agent, retry with simplified context |
+| 2 | Max iterations reached | Generate failure report, mark blocked |
+| 2 | Regression detected | Rollback last fix, switch to surgical strategy |
+| 2 | Stuck tests detected | Continue with alternative strategy, document in failure report |
+
+**Lifecycle Error Handling**:
+```javascript
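+let agentId; // declared outside try so the catch block can see it
+try {
+  agentId = spawn_agent({ message: "..." });
+  const result = wait({ ids: [agentId], timeout_ms: 600000 });
+  // ... process result ...
+  close_agent({ id: agentId });
+} catch (error) {
+  // Close the agent only if spawn_agent succeeded before the failure
+  if (agentId) close_agent({ id: agentId });
+  throw error;
+}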
+```
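The same lifecycle guarantee can be sketched as a reusable wrapper built on `try`/`finally`, so every call site closes its agent exactly once. `withAgent` is a hypothetical helper, and passing the subagent functions in as an `api` object is an assumption made only so the sketch can run against a stub.

```javascript
// Hypothetical wrapper: spawn, wait, and always close the agent.
// `api` is an assumed object exposing spawn_agent, wait, and close_agent.
function withAgent(api, message, timeoutMs) {
  let agentId;
  try {
    agentId = api.spawn_agent({ message });
    return api.wait({ ids: [agentId], timeout_ms: timeoutMs });
  } finally {
    // Runs on success and on error; skipped only if spawn_agent itself failed.
    if (agentId !== undefined) api.close_agent({ id: agentId });
  }
}
```

With `finally`, the cleanup lives in one place instead of being repeated in both the success path and the `catch` block.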
+
+## Coordinator Checklist
+
+**Phase 1 (Generation)**:
+- Detect input type (session ID / description / file path / resume)
+- Initialize progress tracking with 2 top-level phases
+- Read `phases/01-test-fix-gen.md` for detailed sub-phase execution
+- Execute 5 sub-phases with spawn_agent → wait → close_agent lifecycle
+- Verify all Phase 1 outputs (4+ task JSONs, IMPL_PLAN.md, TODO_LIST.md)
+- **Ensure all agents are closed** after each sub-phase completes
+
+**Phase 2 (Execution)**:
+- Read `phases/02-test-cycle-execute.md` for detailed execution logic
+- Load session state and task queue
+- Execute iterative test-fix cycles with spawn_agent → wait → close_agent
+- Track iterations in progress tracking
+- Auto-complete session on success (pass rate >= 95%)
+- **Ensure all agents are closed** after each iteration
+
+**Resume Mode**:
+- If `--resume-session` provided, skip Phase 1
+- Load existing session directly into Phase 2
+
+## Related Skills
+
+**Prerequisite Skills**:
+- `workflow:plan` or `workflow:execute` - Complete implementation (Session Mode)
+- None for Prompt Mode
+
+**Phase 1 Agents** (used by phases/01-test-fix-gen.md via spawn_agent):
+- `test-context-search-agent` (~/.codex/agents/test-context-search-agent.md) - Test coverage analysis (Session Mode)
+- `context-search-agent` (~/.codex/agents/context-search-agent.md) - Codebase analysis (Prompt Mode)
+- `cli-execution-agent` (~/.codex/agents/cli-execution-agent.md) - Test requirements with Gemini
+- `action-planning-agent` (~/.codex/agents/action-planning-agent.md) - Task JSON generation
+
+**Phase 2 Agents** (used by phases/02-test-cycle-execute.md via spawn_agent):
+- `cli-planning-agent` (~/.codex/agents/cli-planning-agent.md) - CLI analysis, root cause extraction, task generation
+- `test-fix-agent` (~/.codex/agents/test-fix-agent.md) - Test execution, code fixes, criticality assignment
+
+**Follow-up**:
+- Session auto-complete on success
+- Issue creation for follow-up work (post-completion expansion)
--- a/.codex/skills/workflow-test-fix-cycle/phases/01-test-fix-gen.md
+++ b/.codex/skills/workflow-test-fix-cycle/phases/01-test-fix-gen.md
@@ -0,0 +1,456 @@
+# Phase 1: Test-Fix Generation
+
+5 sub-phases that create a test workflow session, gather context, analyze test requirements, and generate task JSONs. All agent interactions use spawn_agent → wait → close_agent lifecycle.
+
+## Execution Model
+
+**Auto-Continue Workflow**: All sub-phases run autonomously once triggered. Sub-phases 1.2-1.4 delegate to specialized agents via spawn_agent.
+
+1. **Sub-phase 1.1 executes** → Test session created → Auto-continues
+2. **Sub-phase 1.2 executes** → Context gathering (spawn_agent) → Auto-continues
+3. **Sub-phase 1.3 executes** → Test generation analysis (spawn_agent → Gemini) → Auto-continues
+4. **Sub-phase 1.4 executes** → Task generation (spawn_agent) → Auto-continues
+5. **Sub-phase 1.5 executes** → Phase 1 Summary → Transitions to Phase 2
+
+**Task Attachment Model**:
+- Phase execution **expands workflow** by attaching sub-tasks to current progress tracking
+- When executing a phase, its internal tasks are attached to the orchestrator's tracking
+- Orchestrator **executes these attached tasks** sequentially
+- After completion, attached tasks are **collapsed** back to high-level phase summary
+- This is **task expansion**, not external delegation
+
+**Auto-Continue Mechanism**:
+- Progress tracking monitors current sub-phase status and dynamically manages task attachment/collapse
+- When each sub-phase finishes executing, automatically execute next pending sub-phase
+- All sub-phases run autonomously without user interaction
+- **CONTINUOUS EXECUTION** - Do not stop until all sub-phases complete
+
+## Sub-Phase 1.1: Create Test Session
+
+**Step 1.0: Detect Input Mode**
+
+```
+// Automatic mode detection based on input pattern
+if (input.startsWith("WFS-")) {
+  MODE = "session"
+  // Load source session to preserve original task description
+  Read(".workflow/active/[sourceSessionId]/workflow-session.json")
+} else {
+  MODE = "prompt"
+}
+```
+
+**Step 1.1: Execute** - Create test workflow session
+
+```javascript
+// Session Mode - preserve original task description
+// Read and execute: workflow-plan session start phase
+// with --type test --new "Test validation for [sourceSessionId]: [originalTaskDescription]"
+
+// Prompt Mode - use user's description directly
+// Read and execute: workflow-plan session start phase
+// with --type test --new "Test generation for: [description]"
+```
+
+**Parse Output**:
+- Extract: `SESSION_ID: WFS-test-[slug]` (store as `testSessionId`)
+
+**Validation**:
+- Session Mode: Source session `.workflow/active/[sourceSessionId]/` exists with completed IMPL tasks
+- Both Modes: New test session directory created with metadata
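The Parse Output step can be sketched as a single regular-expression match. This assumes the session-start output carries `SESSION_ID: WFS-test-[slug]` on its own line and that the slug uses only letters, digits, hyphens, and underscores; `extractTestSessionId` is a hypothetical name.

```javascript
// Hypothetical parser for the SESSION_ID line of the session-start output.
function extractTestSessionId(output) {
  // Assumption: the ID sits alone on its line and the slug is [A-Za-z0-9_-]+.
  const match = output.match(/^SESSION_ID:\s*(WFS-test-[A-Za-z0-9_-]+)\s*$/m);
  if (!match) throw new Error("SESSION_ID line not found in session-start output");
  return match[1];
}

const testSessionId = extractTestSessionId(
  "Session created\nSESSION_ID: WFS-test-user-auth\n"
);
console.log(testSessionId); // prints "WFS-test-user-auth"
```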
+
+**Progress Tracking**: Mark sub-phase 1.1 completed, sub-phase 1.2 in_progress
+
+---
+
+## Sub-Phase 1.2: Gather Test Context
+
+**Step 2.1: Execute** - Gather context based on mode via spawn_agent
+
+```javascript
+// Declare once; the two modes are mutually exclusive branches
+let contextAgentId;
+if (MODE === "session") {
+  // Session Mode - gather from source session via test-context-search-agent
+  contextAgentId = spawn_agent({
+    message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/test-context-search-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+## Task Objective
+Gather test context for session [testSessionId]
+
+## Session Paths
+- Session Dir: .workflow/active/[testSessionId]/
+- Output: .workflow/active/[testSessionId]/.process/test-context-package.json
+
+## Expected Deliverables
+- test-context-package.json with coverage analysis and test framework detection
+`
+  });
+} else {
+  // Prompt Mode - gather from codebase via context-search-agent
+  contextAgentId = spawn_agent({
+    message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/context-search-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+## Task Objective
+Gather project context for session [testSessionId]: [task_description]
+
+## Session Paths
+- Session Dir: .workflow/active/[testSessionId]/
+- Output: .workflow/active/[testSessionId]/.process/context-package.json
+
+## Expected Deliverables
+- context-package.json with codebase analysis
+`
+  });
+}
+
+// Same wait/close lifecycle for either mode
+const contextResult = wait({ ids: [contextAgentId], timeout_ms: 600000 });
+close_agent({ id: contextAgentId });
+```
+
+**Input**: `testSessionId` from Sub-Phase 1.1
+
+**Parse Output**:
+- Extract: context package path (store as `contextPath`)
+- Pattern: `.workflow/active/[testSessionId]/.process/[test-]context-package.json`
+
+**Validation**:
+- Context package file exists and is valid JSON
+- Contains coverage analysis (session mode) or codebase analysis (prompt mode)
+- Test framework detected
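The three validation checks can be sketched as a function over the file's text. The field names `coverage_analysis`, `codebase_analysis`, and `test_framework` are assumptions made for illustration, since the context package schema is not shown here; `validateContextPackage` is likewise hypothetical.

```javascript
// Hypothetical validator; all field names below are assumed, not documented.
function validateContextPackage(jsonText, mode) {
  let pkg;
  try {
    pkg = JSON.parse(jsonText);
  } catch (err) {
    return { ok: false, problems: ["not valid JSON"] };
  }
  if (pkg === null || typeof pkg !== "object") {
    return { ok: false, problems: ["not a JSON object"] };
  }
  const problems = [];
  // Session Mode expects coverage analysis, Prompt Mode codebase analysis.
  const analysisKey = mode === "session" ? "coverage_analysis" : "codebase_analysis";
  if (!(analysisKey in pkg)) problems.push(`missing ${analysisKey}`);
  if (!pkg.test_framework) problems.push("test framework not detected");
  return { ok: problems.length === 0, problems };
}
```

Collecting every problem instead of stopping at the first makes the resulting error message more useful when the sub-phase has to return an error.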
+
+**Progress Tracking Update (tasks attached)**:
+```json
+[
+  {"content": "Phase 1: Test-Fix Generation", "status": "in_progress"},
+  {"content": "  1.1 Create Test Session", "status": "completed"},
+  {"content": "  1.2 Gather Test Context", "status": "in_progress"},
+  {"content": "    → Load source/codebase context", "status": "in_progress"},
+  {"content": "    → Analyze test coverage", "status": "pending"},
+  {"content": "    → Generate context package", "status": "pending"},
+  {"content": "  1.3 Test Generation Analysis", "status": "pending"},
+  {"content": "  1.4 Generate Test Tasks", "status": "pending"},
+  {"content": "  1.5 Phase Summary", "status": "pending"},
+  {"content": "Phase 2: Test-Cycle Execution", "status": "pending"}
+]
+```
+
+**Progress Tracking Update (tasks collapsed)**:
+```json
+[
+  {"content": "Phase 1: Test-Fix Generation", "status": "in_progress"},
+  {"content": "  1.1 Create Test Session", "status": "completed"},
+  {"content": "  1.2 Gather Test Context", "status": "completed"},
+  {"content": "  1.3 Test Generation Analysis", "status": "pending"},
+  {"content": "  1.4 Generate Test Tasks", "status": "pending"},
+  {"content": "  1.5 Phase Summary", "status": "pending"},
+  {"content": "Phase 2: Test-Cycle Execution", "status": "pending"}
+]
+```
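The attached and collapsed states above differ only in the three indented sub-task entries and the parent's status. A minimal sketch of that transition follows; the `{ content, status }` item shape comes from the examples, while the `parent` field and both function names are assumptions.

```javascript
// Hypothetical helpers for the attach/collapse transition shown above.
function attachSubTasks(items, parentContent, subTasks) {
  const index = items.findIndex((item) => item.content === parentContent);
  if (index === -1) throw new Error(`unknown task: ${parentContent}`);
  const attached = subTasks.map((content, i) => ({
    content: `    → ${content}`,
    status: i === 0 ? "in_progress" : "pending", // first sub-task starts at once
    parent: parentContent,                        // assumed bookkeeping field
  }));
  // Insert the sub-tasks directly after their parent entry.
  return [...items.slice(0, index + 1), ...attached, ...items.slice(index + 1)];
}

function collapseSubTasks(items, parentContent) {
  return items
    .filter((item) => item.parent !== parentContent) // drop attached sub-tasks
    .map((item) =>
      item.content === parentContent ? { ...item, status: "completed" } : item
    );
}
```

Both functions return new arrays rather than mutating their input, so an earlier tracking state can still be inspected after an update.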
+
+---
+
+## Sub-Phase 1.3: Test Generation Analysis
+
+**Step 3.1: Execute** - Analyze test requirements with Gemini via cli-execution-agent
+
+```javascript
+const analysisAgentId = spawn_agent({
+  message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/cli-execution-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+## Task Objective
+Analyze test requirements for session [testSessionId] using Gemini CLI
+
+## Context
+- Session ID: [testSessionId]
+- Context Package: [contextPath]
+
+## Expected Behavior
+- Use Gemini to analyze coverage gaps
+- Detect project type and apply appropriate test templates
+- Generate multi-layered test requirements (L0-L3)
+- Scan for AI code issues
+- Generate TEST_ANALYSIS_RESULTS.md
+
+## Output Path
+.workflow/active/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md
+
+## Expected Deliverables
+- TEST_ANALYSIS_RESULTS.md with L0-L3 requirements and AI issue scan
+`
+});
+
+const analysisResult = wait({ ids: [analysisAgentId], timeout_ms: 1200000 });
+close_agent({ id: analysisAgentId });
+```
+
+**Input**:
+- `testSessionId` from Sub-Phase 1.1
+- `contextPath` from Sub-Phase 1.2
+
+**Expected Behavior**:
+- Use Gemini to analyze coverage gaps
+- Detect project type and apply appropriate test templates
+- Generate **multi-layered test requirements** (L0-L3)
+- Scan for AI code issues
+- Generate `TEST_ANALYSIS_RESULTS.md`
+
+**Output**: `.workflow/active/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md`
+
+**Validation** - TEST_ANALYSIS_RESULTS.md must include:
+- Project Type Detection (with confidence)
+- Coverage Assessment (current vs target)
+- Test Framework & Conventions
+- Multi-Layered Test Plan (L0-L3)
+- AI Issue Scan Results
+- Test Requirements by File (with layer annotations)
+- Quality Assurance Criteria
+- Success Criteria
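The section checklist above can be sketched as a heading scan. This assumes each required section appears as a Markdown heading whose text contains the listed title, which the source does not guarantee; `REQUIRED_SECTIONS` and `missingSections` are hypothetical names.

```javascript
// Section titles taken from the validation list above.
const REQUIRED_SECTIONS = [
  "Project Type Detection",
  "Coverage Assessment",
  "Test Framework & Conventions",
  "Multi-Layered Test Plan",
  "AI Issue Scan Results",
  "Test Requirements by File",
  "Quality Assurance Criteria",
  "Success Criteria",
];

// Hypothetical check: returns the titles that no heading in the file mentions.
function missingSections(markdown) {
  const headings = markdown
    .split("\n")
    .filter((line) => /^#{1,6}\s/.test(line))
    .map((line) => line.replace(/^#{1,6}\s+/, "").trim().toLowerCase());
  return REQUIRED_SECTIONS.filter(
    (name) => !headings.some((h) => h.includes(name.toLowerCase()))
  );
}
```

An empty result means the file passes this check; it says nothing about whether the section bodies are complete.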
+
+**Note**: Detailed specifications for project types, L0-L3 layers, and AI issue detection are defined in the test-concept-enhanced workflow tool.
+
+---
+
+## Sub-Phase 1.4: Generate Test Tasks
+
+**Step 4.1: Execute** - Generate test planning documents via action-planning-agent
+
+```javascript
+const taskGenAgentId = spawn_agent({
+  message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/action-planning-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+## Task Objective
+Generate test-specific IMPL_PLAN.md and task JSONs for session [testSessionId]
+
+## Context
+- Session ID: [testSessionId]
+- TEST_ANALYSIS_RESULTS.md: .workflow/active/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md
+
+## Expected Output (minimum 4 tasks)
+- IMPL-001.json: Test understanding & generation (@code-developer)
+- IMPL-001.3-validation.json: Code validation gate (@test-fix-agent)
+- IMPL-001.5-review.json: Test quality gate (@test-fix-agent)
+- IMPL-002.json: Test execution & fix cycle (@test-fix-agent)
+- IMPL_PLAN.md: Test generation and execution strategy
+- TODO_LIST.md: Task checklist
+
+## Output Paths
+- Tasks: .workflow/active/[testSessionId]/.task/
+- Plan: .workflow/active/[testSessionId]/IMPL_PLAN.md
+- Todo: .workflow/active/[testSessionId]/TODO_LIST.md
+`
+});
+
+const taskGenResult = wait({ ids: [taskGenAgentId], timeout_ms: 600000 });
+close_agent({ id: taskGenAgentId });
+```
+
+**Input**: `testSessionId` from Sub-Phase 1.1
+
+**Note**: action-planning-agent generates test-specific IMPL_PLAN.md and task JSONs based on TEST_ANALYSIS_RESULTS.md.
+
+**Expected Output** (minimum 4 tasks):
+
+| Task | Type | Agent | Purpose |
+|------|------|-------|---------|
+| IMPL-001 | test-gen | @code-developer | Test understanding & generation (L1-L3) |
+| IMPL-001.3 | code-validation | @test-fix-agent | Code validation gate (L0 + AI issues) |
+| IMPL-001.5 | test-quality-review | @test-fix-agent | Test quality gate |
+| IMPL-002 | test-fix | @test-fix-agent | Test execution & fix cycle |
+
+**Validation**:
+- `.workflow/active/[testSessionId]/.task/IMPL-001.json` exists
+- `.workflow/active/[testSessionId]/.task/IMPL-001.3-validation.json` exists
+- `.workflow/active/[testSessionId]/.task/IMPL-001.5-review.json` exists
+- `.workflow/active/[testSessionId]/.task/IMPL-002.json` exists
+- `.workflow/active/[testSessionId]/IMPL_PLAN.md` exists
+- `.workflow/active/[testSessionId]/TODO_LIST.md` exists
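The file checks above can be sketched as a list of expected paths plus an existence predicate. `expectedPhase1Files` and `missingPhase1Files` are hypothetical names, and the `exists` callback is an assumption that keeps the sketch independent of the file system.

```javascript
// Paths follow the validation list above.
function expectedPhase1Files(testSessionId) {
  const root = `.workflow/active/${testSessionId}`;
  return [
    `${root}/.task/IMPL-001.json`,
    `${root}/.task/IMPL-001.3-validation.json`,
    `${root}/.task/IMPL-001.5-review.json`,
    `${root}/.task/IMPL-002.json`,
    `${root}/IMPL_PLAN.md`,
    `${root}/TODO_LIST.md`,
  ];
}

// Hypothetical check: `exists(path)` is an assumed predicate supplied by the caller.
function missingPhase1Files(testSessionId, exists) {
  return expectedPhase1Files(testSessionId).filter((path) => !exists(path));
}
```

In Node.js, passing `fs.existsSync` as the predicate would run the check against the real session directory.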
+
+**Progress Tracking Update (agent task attached)**:
+```json
+[
+  {"content": "Phase 1: Test-Fix Generation", "status": "in_progress"},
+  {"content": "  1.1 Create Test Session", "status": "completed"},
+  {"content": "  1.2 Gather Test Context", "status": "completed"},
+  {"content": "  1.3 Test Generation Analysis", "status": "completed"},
+  {"content": "  1.4 Generate Test Tasks", "status": "in_progress"},
+  {"content": "  1.5 Phase Summary", "status": "pending"},
+  {"content": "Phase 2: Test-Cycle Execution", "status": "pending"}
+]
+```
+
+---
+
+## Sub-Phase 1.5: Phase 1 Summary
+
+**Internal Summary** (transitions directly to Phase 2):
+```
+Phase 1 Complete - Test-Fix Generation
+
+Input: [original input]
+Mode: [Session|Prompt]
+Test Session: [testSessionId]
+
+Tasks Created:
+- IMPL-001: Test Understanding & Generation (@code-developer)
+- IMPL-001.3: Code Validation Gate - AI Error Detection (@test-fix-agent)
+- IMPL-001.5: Test Quality Gate - Static Analysis & Coverage (@test-fix-agent)
+- IMPL-002: Test Execution & Fix Cycle (@test-fix-agent)
+
+Quality Thresholds:
+- Code Validation: Zero CRITICAL issues, zero compilation errors
+- Minimum Coverage: 80% line, 70% branch
+- Static Analysis: Zero critical anti-patterns
+- Max Fix Iterations: 10 (default, adjustable via --max-iterations)
+
+Artifacts:
+- Test plan: .workflow/active/[testSessionId]/IMPL_PLAN.md
+- Task list: .workflow/active/[testSessionId]/TODO_LIST.md
+- Analysis: .workflow/active/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md
+
+→ Transitioning to Phase 2: Test-Cycle Execution
+```
+
+**Progress Tracking**: Mark Phase 1 completed, Phase 2 in_progress. Immediately begin Phase 2.
+
+## Data Flow
+
+```
+User Input (session ID | description | file path)
+    ↓
+[Detect Mode: session | prompt]
+    ↓
+Sub-Phase 1.1: session:start --type test --new "description"
+    ↓ Output: testSessionId
+    ↓
+Sub-Phase 1.2: test-context-gather | context-gather (via spawn_agent)
+    ↓ Input: testSessionId
+    ↓ Output: contextPath (context-package.json)
+    ↓
+Sub-Phase 1.3: test-concept-enhanced (via spawn_agent → cli-execution-agent)
+    ↓ Input: testSessionId + contextPath
+    ↓ Output: TEST_ANALYSIS_RESULTS.md (L0-L3 requirements + AI issues)
+    ↓
+Sub-Phase 1.4: test-task-generate (via spawn_agent → action-planning-agent)
+    ↓ Input: testSessionId + TEST_ANALYSIS_RESULTS.md
+    ↓ Output: IMPL_PLAN.md, IMPL-*.json (4+), TODO_LIST.md
+    ↓
+Sub-Phase 1.5: Phase 1 Summary
+    ↓
+→ Phase 2: Test-Cycle Execution
+```
+
+## Execution Flow Diagram
+
+```
+Orchestrator triggers Phase 1
+  ↓
+[Input Detection] → MODE: session | prompt
+  ↓
+[Progress Init] Phase 1 sub-phases
+  ↓
+Sub-Phase 1.1: Create Test Session
+  → Read and execute session start phase
+  → testSessionId extracted (e.g., WFS-test-user-auth)
+  ↓
+Sub-Phase 1.2: Gather Test Context (spawn_agent executed)
+  → spawn_agent: context-search-agent
+  → wait → close_agent
+  → ATTACH 3 sub-tasks: ← ATTACHED
+    - → Load codebase context
+    - → Analyze test coverage
+    - → Generate context package
+  → Execute sub-tasks sequentially
+  → COLLAPSE tasks ← COLLAPSED
+  → contextPath extracted
+  ↓
+Sub-Phase 1.3: Test Generation Analysis (spawn_agent executed)
+  → spawn_agent: cli-execution-agent (Gemini)
+  → wait → close_agent
+  → ATTACH 3 sub-tasks: ← ATTACHED
+    - → Analyze coverage gaps with Gemini
+    - → Detect AI code issues (L0.5)
+    - → Generate L0-L3 test requirements
+  → Execute sub-tasks sequentially
+  → COLLAPSE tasks ← COLLAPSED
+  → TEST_ANALYSIS_RESULTS.md created
+  ↓
+Sub-Phase 1.4: Generate Test Tasks (spawn_agent executed)
+  → spawn_agent: action-planning-agent
+  → wait → close_agent
+  → Agent autonomously generates:
+    - IMPL-001.json (test generation)
+    - IMPL-001.3-validation.json (code validation)
+    - IMPL-001.5-review.json (test quality)
+    - IMPL-002.json (test execution)
+    - IMPL_PLAN.md
+    - TODO_LIST.md
+  ↓
+Sub-Phase 1.5: Phase 1 Summary
+  → Internal summary with artifacts
+  → Transition to Phase 2
+```
+
+## Session Metadata
+
+**File**: `workflow-session.json`
+
+| Mode | Fields |
+|------|--------|
+| **Session** | `type: "test"`, `source_session_id: "[sourceId]"` |
+| **Prompt** | `type: "test"` (no source_session_id) |
+
+## Error Handling
+
+| Sub-Phase | Error Condition | Action |
+|-----------|----------------|--------|
+| 1.1 | Source session not found (session mode) | Return error with session ID |
+| 1.1 | No completed IMPL tasks (session mode) | Return error, source incomplete |
+| 1.2 | Context gathering failed | Return error, check source artifacts |
+| 1.2 | Agent timeout | Retry with extended timeout, close_agent, then return error |
+| 1.3 | Gemini analysis failed | Return error, check context package |
+| 1.4 | Task generation failed | Retry once, then return error |
+
+**Lifecycle Error Handling**:
+```javascript
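+let agentId; // declared outside try so the catch block can reference it
+try {
+  agentId = spawn_agent({ message: "..." });
+  const result = wait({ ids: [agentId], timeout_ms: 600000 });
+  // ... process result ...
+  close_agent({ id: agentId });
+} catch (error) {
+  // Close the agent on the error path too, then surface the error
+  if (agentId) close_agent({ id: agentId });
+  throw error;
+}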
+```
--- a/.codex/skills/workflow-test-fix-cycle/phases/02-test-cycle-execute.md
+++ b/.codex/skills/workflow-test-fix-cycle/phases/02-test-cycle-execute.md
@@ -0,0 +1,478 @@
+# Phase 2: Test-Cycle Execution
+
+Dynamic test-fix execution with **adaptive task generation** based on runtime analysis. Iterative fix cycles until test pass rate >= 95% or max iterations reached. All agent interactions use spawn_agent → wait → close_agent lifecycle.
+
+**vs Standard Execute**:
+- **Standard**: Pre-defined tasks → Execute sequentially → Done
+- **Test-Cycle**: Initial tasks → **Test → Analyze failures → Generate fix tasks → Fix → Re-test** → Repeat until pass
+
+## Agent Roles
+
+| Agent | Responsibility |
+|-------|---------------|
+| **Orchestrator** | Loop control, strategy selection, pass rate calculation, threshold decisions |
+| **@cli-planning-agent** | CLI analysis (Gemini/Qwen/Codex), root cause extraction, task generation, affected test detection |
+| **@test-fix-agent** | Test execution, code fixes, criticality assignment, result reporting |
+
+## Core Responsibilities
+
+### Orchestrator
+- Session discovery, task queue management
+- Pass rate calculation: `(passed / total) * 100` from test-results.json
+- Criticality assessment (high/medium/low)
+- Strategy selection based on context
+- **Runtime Calculations** (from iteration-state.json):
+  - Current iteration: `iterations.length + 1`
+  - Stuck tests: Tests appearing in `failed_tests` for 3+ consecutive iterations
+  - Regression: Compare consecutive `pass_rate` values (>10% drop)
+  - Max iterations: Read from `task.meta.max_iterations`
+- Iteration control (default max: 10 iterations)
+- CLI tool fallback chain: Gemini → Qwen → Codex
+- Progress tracking
+- Session auto-complete (pass rate >= 95%)
+- **Explicit Lifecycle**: Always close_agent after wait completes
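+
+The runtime calculations above can be sketched as plain functions over the `iterations[]` array. This is a minimal sketch, not the orchestrator's actual code: the helper names (`passRate`, `stuckTests`, `regressionDetected`) are hypothetical, "3+ consecutive iterations" is read as "the three most recent iterations", and the ">10% drop" is assumed to mean percentage points.
+
+```javascript
+// Sketch only: helper names are hypothetical; field names follow iteration-state.json.
+
+// Pass rate as a percentage; an empty run counts as 0.
+function passRate(passed, total) {
+  return total === 0 ? 0 : (passed / total) * 100;
+}
+
+// Tests that appear in failed_tests for each of the last `window` iterations.
+function stuckTests(iterations, window = 3) {
+  if (iterations.length < window) return [];
+  const recent = iterations.slice(-window).map((it) => new Set(it.failed_tests));
+  return [...recent[0]].filter((name) => recent.every((set) => set.has(name)));
+}
+
+// True when the latest pass_rate is more than `threshold` points below the previous one
+// (assumption: the 10% threshold is measured in percentage points).
+function regressionDetected(iterations, threshold = 10) {
+  if (iterations.length < 2) return false;
+  const [prev, curr] = iterations.slice(-2);
+  return prev.pass_rate - curr.pass_rate > threshold;
+}
+
+const state = {
+  iterations: [
+    { iteration: 1, pass_rate: 70, failed_tests: ["test_auth_flow", "test_user_permissions"] },
+    { iteration: 2, pass_rate: 82, failed_tests: ["test_user_permissions", "test_token_expiry"] },
+    { iteration: 3, pass_rate: 89, failed_tests: ["test_auth_edge_case"] },
+  ],
+};
+
+// Current iteration is derived from the history, never stored separately.
+const currentIteration = state.iterations.length + 1;
+```
+
+Because every value is derived from `iterations[]`, a resumed session can recompute the same numbers without any extra persisted fields.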
+
+### @cli-planning-agent
+- Execute CLI analysis with bug diagnosis template
+- Parse output for root causes and fix strategy
+- Generate IMPL-fix-N.json task definition
+- Detect affected tests for progressive testing
+- Save: analysis.md, cli-output.txt
+
+### @test-fix-agent
+- Execute tests, save results to test-results.json
+- Apply fixes from task.context.fix_strategy
+- Assign criticality to failures
+- Update task status
+
+## Intelligent Strategy Engine
+
+**Auto-selects optimal strategy based on iteration context:**
+
+| Strategy | Trigger | Behavior |
+|----------|---------|----------|
+| **Conservative** | Iteration 1-2 (default) | Single targeted fix, full validation |
+| **Aggressive** | Pass rate >80% + similar failures | Batch fix related issues |
+| **Surgical** | Regression detected (pass rate drops >10%) | Minimal changes, rollback focus |
+
+**Selection Logic** (in orchestrator):
+```javascript
+if (iteration <= 2) return "conservative";
+// Check regression before the aggressive branch so a regressing run never batches fixes
+if (regressionDetected) return "surgical";
+if (passRate > 80 && failurePattern.similarity > 0.7) return "aggressive";
+return "conservative";
+```
+
+**Integration**: Strategy passed to @cli-planning-agent in prompt for tailored analysis.
+
+## Progressive Testing
+
+**Runs affected tests during iterations, full suite only for final validation.**
+
+**How It Works**:
+1. @cli-planning-agent analyzes fix_strategy.modification_points
+2. Maps modified files to test files (via imports + integration patterns)
+3. Returns `affected_tests[]` in task JSON
+4. @test-fix-agent runs: `npm test -- ${affected_tests.join(' ')}`
+5. Final validation: `npm test` (full suite)
+
+
+**Benefits**: Iterations run an estimated 70-90% faster, with quicker feedback on whether a fix worked.
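+
+The file-to-test mapping in steps 2-4 can be sketched as follows. This is an illustration under stated assumptions: `testImports`, `affectedTests`, and `testCommand` are hypothetical names, the import map would in practice come from the agent's own analysis, and falling back to the full suite when nothing matches is an assumption rather than documented behavior.
+
+```javascript
+// Hypothetical input: the source files each test file imports.
+const testImports = {
+  "tests/auth.test.js": ["src/auth.js", "src/token.js"],
+  "tests/user.test.js": ["src/user.js", "src/auth.js"],
+  "tests/billing.test.js": ["src/billing.js"],
+};
+
+// Return the test files that import at least one modified source file.
+function affectedTests(modifiedFiles, importsByTest) {
+  const modified = new Set(modifiedFiles);
+  return Object.keys(importsByTest)
+    .filter((test) => importsByTest[test].some((src) => modified.has(src)))
+    .sort();
+}
+
+// Build the command for one iteration.
+// Assumption: with no affected tests, run the full suite instead of running nothing.
+function testCommand(tests) {
+  return tests.length > 0 ? `npm test -- ${tests.join(" ")}` : "npm test";
+}
+
+const affected = affectedTests(["src/auth.js"], testImports);
+const command = testCommand(affected);
+```
+
+Final validation ignores the mapping and calls `testCommand([])`, which yields the full-suite `npm test`.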
+
+## Agent Invocation Template
+
+**@cli-planning-agent** (failure analysis):
+```javascript
+// Spawn agent for failure analysis
+const analysisAgentId = spawn_agent({
+  message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/cli-planning-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+    ## Task Objective
+    Analyze test failures and generate fix task JSON for iteration ${N}
+
+    ## Strategy
+    ${selectedStrategy} - ${strategyDescription}
+
+    ## MANDATORY FIRST STEPS
+    1. Read test results: ${session.test_results_path}
+    2. Read test output: ${session.test_output_path}
+    3. Read iteration state: ${session.iteration_state_path}
+
+    ## Context Metadata (Orchestrator-Calculated)
+    - Session ID: ${sessionId} (from file path)
+    - Current Iteration: ${N} (= iterations.length + 1)
+    - Max Iterations: ${maxIterations} (from task.meta.max_iterations)
+    - Current Pass Rate: ${passRate}%
+    - Selected Strategy: ${selectedStrategy} (from iteration-state.json)
+    - Stuck Tests: ${stuckTests} (calculated from iterations[].failed_tests history)
+
+    ## CLI Configuration
+    - Tool Priority: gemini → qwen → codex (fallback chain)
+    - Template: 01-diagnose-bug-root-cause.txt
+    - Timeout: 2400000ms
+
+    ## Expected Deliverables
+    1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
+       - Must include: fix_strategy.test_execution.affected_tests[]
+       - Must include: fix_strategy.confidence_score
+    2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
+    3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt
+
+    ## Strategy-Specific Requirements
+    - Conservative: Single targeted fix, high confidence required
+    - Aggressive: Batch fix similar failures, pattern-based approach
+    - Surgical: Minimal changes, focus on rollback safety
+
+    ## Success Criteria
+    - Concrete fix strategy with modification points (file:function:lines)
+    - Affected tests list for progressive testing
+    - Root cause analysis (not just symptoms)
+`
+});
+
+// Wait for analysis completion
+const analysisResult = wait({
+  ids: [analysisAgentId],
+  timeout_ms: 2400000  // 40 minutes (CLI analysis timeout)
+});
+
+// Clean up
+close_agent({ id: analysisAgentId });
+```
+
+**@test-fix-agent** (execution):
+```javascript
+// Spawn agent for test execution/fixing
+const fixAgentId = spawn_agent({
+  message: `
+## TASK ASSIGNMENT
+
+### MANDATORY FIRST STEPS (Agent Execute)
+1. **Read role definition**: ~/.codex/agents/test-fix-agent.md (MUST read first)
+2. Read: .workflow/project-tech.json
+3. Read: .workflow/project-guidelines.json
+
+---
+
+    ## Task Objective
+    ${taskTypeObjective[task.meta.type]}
+
+    ## MANDATORY FIRST STEPS
+    1. Read task JSON: ${session.task_json_path}
+    2. Read iteration state: ${session.iteration_state_path}
+    3. ${taskTypeSpecificReads[task.meta.type]}
+
+    ## CRITICAL: Syntax Check Priority
+    **Before any code modification or test execution:**
+    - Run project syntax checker (TypeScript: tsc --noEmit, ESLint, etc.)
+    - Verify zero syntax errors before proceeding
+    - If syntax errors found: Fix immediately before other work
+    - Syntax validation is MANDATORY gate - no exceptions
+
+    ## Session Paths
+    - Workflow Dir: ${session.workflow_dir}
+    - Task JSON: ${session.task_json_path}
+    - Test Results Output: ${session.test_results_path}
+    - Test Output Log: ${session.test_output_path}
+    - Iteration State: ${session.iteration_state_path}
+
+    ## Task Type: ${task.meta.type}
+    ${taskTypeGuidance[task.meta.type]}
+
+    ## Expected Deliverables
+    ${taskTypeDeliverables[task.meta.type]}
+
+    ## Success Criteria
+    - ${taskTypeSuccessCriteria[task.meta.type]}
+    - Update task status in task JSON
+    - Save all outputs to specified paths
+    - Report completion to orchestrator
+`
+});
+
+// Wait for execution completion
+const fixResult = wait({
+  ids: [fixAgentId],
+  timeout_ms: 600000  // 10 minutes
+});
+
+// Clean up
+close_agent({ id: fixAgentId });
+
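+// Task Type Configurations
+// NOTE: define these before the spawn_agent call above; the message template reads them when it is evaluated.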
+const taskTypeObjective = {
+  "test-gen": "Generate comprehensive tests based on requirements",
+  "test-fix": "Execute test suite and report results with criticality assessment",
+  "test-fix-iteration": "Apply fixes from strategy and validate with tests"
+};
+
+// Template literals (backticks) are required here so the ${...} paths are interpolated
+const taskTypeSpecificReads = {
+  "test-gen": `Read test context: ${session.test_context_path}`,
+  "test-fix": `Read previous results (if exists): ${session.test_results_path}`,
+  "test-fix-iteration": `Read fix strategy: ${session.analysis_path}, fix history: ${session.fix_history_path}`
+};
+
+const taskTypeGuidance = {
+  "test-gen": `
+    - Review task.context.requirements for test scenarios
+    - Analyze codebase to understand implementation
+    - Generate tests covering: happy paths, edge cases, error handling
+    - Follow existing test patterns and framework conventions
+  `,
+  "test-fix": `
+    - Run test command from task.context or project config
+    - Capture: pass/fail counts, error messages, stack traces
+    - Assess criticality for each failure:
+      * high: core functionality broken, security issues
+      * medium: feature degradation, data integrity issues
+      * low: edge cases, flaky tests, env-specific issues
+    - Save structured results to test-results.json
+  `,
+  "test-fix-iteration": `
+    - Load fix_strategy from task.context.fix_strategy
+    - Identify modification_points: ${task.context.fix_strategy.modification_points}
+    - Apply surgical fixes (minimal changes)
+    - Test execution mode: ${task.context.fix_strategy.test_execution.mode}
+      * affected_only: Run ${task.context.fix_strategy.test_execution.affected_tests}
+      * full_suite: Run complete test suite
+    - If failures persist: Document in test-results.json, DO NOT analyze (orchestrator handles)
+  `
+};
+
+const taskTypeDeliverables = {
+  "test-gen": "- Test files in target directories\n    - Test coverage report\n    - Summary in .summaries/",
+  "test-fix": "- test-results.json (pass_rate, criticality, failures)\n    - test-output.log (full test output)\n    - Summary in .summaries/",
+  "test-fix-iteration": "- Modified source files\n    - test-results.json (updated pass_rate)\n    - test-output.log\n    - Summary in .summaries/"
+};
+
+const taskTypeSuccessCriteria = {
+  "test-gen": "All test files created, executable without errors, coverage documented",
+  "test-fix": "Test results saved with accurate pass_rate and criticality, all failures documented",
+  "test-fix-iteration": "Fixes applied per strategy, tests executed, results reported (pass/fail to orchestrator)"
+};
+```
+
+## CLI Tool Configuration
+
+**Fallback Chain**: Gemini → Qwen → Codex
+**Template**: `~/.codex/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt`
+**Timeout**: 40min (2400000ms)
+
+**Tool Details**:
+1. **Gemini** (primary): `gemini-2.5-pro`
+2. **Qwen** (fallback): `coder-model`
+3. **Codex** (fallback): `gpt-5.1-codex`
+
+**When to Fallback**: HTTP 429, timeout, analysis quality degraded
+
+## Session File Structure
+
+```
+.workflow/active/WFS-test-{session}/
+├── workflow-session.json           # Session metadata
+├── IMPL_PLAN.md, TODO_LIST.md
+├── .task/
+│   ├── IMPL-{001,002}.json         # Initial tasks
+│   └── IMPL-fix-{N}.json           # Generated fix tasks
+├── .process/
+│   ├── iteration-state.json        # Current iteration + strategy + stuck tests
+│   ├── test-results.json           # Latest results (pass_rate, criticality)
+│   ├── test-output.log             # Full test output
+│   ├── fix-history.json            # All fix attempts
+│   ├── iteration-{N}-analysis.md   # CLI analysis report
+│   └── iteration-{N}-cli-output.txt
+└── .summaries/iteration-summaries/
+```
+
+## Iteration State JSON
+
+**Purpose**: Persisted state machine for iteration loop - enables Resume and historical analysis.
+
+```json
+{
+  "current_task": "IMPL-002",
+  "selected_strategy": "aggressive",
+  "next_action": "execute_fix_task",
+  "iterations": [
+    {
+      "iteration": 1,
+      "pass_rate": 70,
+      "strategy": "conservative",
+      "failed_tests": ["test_auth_flow", "test_user_permissions"]
+    },
+    {
+      "iteration": 2,
+      "pass_rate": 82,
+      "strategy": "conservative",
+      "failed_tests": ["test_user_permissions", "test_token_expiry"]
+    },
+    {
+      "iteration": 3,
+      "pass_rate": 89,
+      "strategy": "aggressive",
+      "failed_tests": ["test_auth_edge_case"]
+    }
+  ]
+}
+```
+
+**Field Descriptions**:
+- `current_task`: Pointer to active task (essential for Resume)
+- `selected_strategy`: Current iteration strategy (runtime state)
+- `next_action`: State machine next step (`execute_fix_task` | `retest` | `complete`)
+- `iterations[]`: Historical log of all iterations (source of truth for trends)
+
+## Completion Conditions
+
+**Full Success**:
+- All tasks completed
+- Pass rate === 100%
+- Action: Auto-complete session
+
+**Partial Success**:
+- All tasks completed
+- Pass rate >= 95% and < 100%
+- All failures are "low" criticality
+- Action: Auto-approve with review note
+
+**Failure**:
+- Max iterations (10) reached with pass rate still < 95%
+- Action: Generate failure report, mark blocked, return to user
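+
+The three outcomes above can be expressed as one decision function. This is a sketch under assumptions: `classifyOutcome` and its return labels are hypothetical, and the source does not say what happens when the pass rate is >= 95% but some failure is not "low" criticality, so this sketch keeps iterating until the iteration budget runs out.
+
+```javascript
+// Sketch only: function name, argument shape, and return labels are hypothetical.
+function classifyOutcome({ passRate, failures, iteration, maxIterations, allTasksCompleted }) {
+  // Full success: everything done and every test passes.
+  if (allTasksCompleted && passRate === 100) return "full_success";
+
+  // Partial success: at or above threshold and only low-criticality failures remain.
+  const onlyLow = failures.every((f) => f.criticality === "low");
+  if (allTasksCompleted && passRate >= 95 && onlyLow) return "partial_success";
+
+  // Failure: iteration budget exhausted without reaching an acceptable state.
+  if (iteration >= maxIterations) return "failure";
+
+  // Assumption: anything else means another fix iteration is needed.
+  return "continue";
+}
+
+const outcome = classifyOutcome({
+  passRate: 96,
+  failures: [{ test: "test_auth_edge_case", criticality: "low" }],
+  iteration: 4,
+  maxIterations: 10,
+  allTasksCompleted: true,
+});
+```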
+
+## Error Handling
+
+| Scenario | Action |
+|----------|--------|
+| Test execution error | Log, retry with error context |
+| CLI analysis failure | Fallback: Gemini → Qwen → Codex → manual |
+| Agent execution error | Save state, close_agent, retry with simplified context |
+| Max iterations reached | Generate failure report, mark blocked |
+| Regression detected | Rollback last fix, switch to surgical strategy |
+| Stuck tests detected | Continue with alternative strategy, document in failure report |
+
+**CLI Fallback Triggers** (Gemini → Qwen → Codex → manual):
+
+Fallback is triggered when any of these conditions occur:
+
+1. **Invalid Output**:
+   - CLI tool fails to generate valid `IMPL-fix-N.json` (JSON parse error)
+   - Missing required fields: `fix_strategy.modification_points` or `fix_strategy.test_execution.affected_tests`
+
+2. **Low Confidence**:
+   - `fix_strategy.confidence_score < 0.4` (indicates uncertain analysis)
+
+3. **Technical Failures**:
+   - HTTP 429 (rate limit) or 5xx errors
+   - Timeout (exceeds 2400000ms / 40min)
+   - Connection errors
+
+4. **Quality Degradation**:
+   - Analysis report < 100 words (too brief, likely incomplete)
+   - No concrete modification points provided (only general suggestions)
+   - Same root cause identified 3+ consecutive times (stuck analysis)
+
+**Fallback Sequence**:
+- Try primary tool (Gemini)
+- If trigger detected → Try fallback (Qwen)
+- If trigger detected again → Try final fallback (Codex)
+- If all fail → Mark as degraded, use basic pattern matching from fix-history.json, notify user
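+
+The trigger checks and the fallback sequence can be sketched as below. This is a simplified illustration, not the orchestrator's implementation: `fallbackReason`, `analyzeWithFallback`, and the `result` shape (`error`, `taskJson`, `analysisReport`) are hypothetical, `runTool` stands in for the spawn_agent → wait → close_agent round trip, a missing `confidence_score` is treated as low confidence by assumption, and the "same root cause 3+ times" check is omitted because it needs fix history.
+
+```javascript
+const TOOL_CHAIN = ["gemini", "qwen", "codex"];
+
+// Return the name of the trigger that fired, or null when the result is usable.
+function fallbackReason(result) {
+  // Technical failures: HTTP 429/5xx, timeout, connection errors.
+  if (result.error) return "technical_failure";
+
+  // Invalid output: unparseable JSON or missing required fields.
+  let task;
+  try {
+    task = JSON.parse(result.taskJson);
+  } catch {
+    return "invalid_output";
+  }
+  const fix = task?.fix_strategy ?? {};
+  const points = fix.modification_points;
+  const tests = fix.test_execution?.affected_tests;
+  if (!Array.isArray(points) || !Array.isArray(tests)) return "invalid_output";
+
+  // Low confidence (assumption: a missing score counts as low).
+  if (typeof fix.confidence_score !== "number" || fix.confidence_score < 0.4) {
+    return "low_confidence";
+  }
+
+  // Quality degradation: report under 100 words or no concrete modification points.
+  const words = (result.analysisReport ?? "").trim().split(/\s+/).filter(Boolean).length;
+  if (words < 100 || points.length === 0) return "quality_degradation";
+
+  return null;
+}
+
+// Try each tool in order; report degraded mode when every tool trips a trigger.
+function analyzeWithFallback(runTool) {
+  for (const tool of TOOL_CHAIN) {
+    const result = runTool(tool);
+    if (fallbackReason(result) === null) return { tool, result, degraded: false };
+  }
+  return { tool: null, result: null, degraded: true };
+}
+```
+
+A caller that receives `degraded: true` would then fall back to pattern matching against fix-history.json and notify the user, as described in the sequence above.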
+
+**Lifecycle Error Handling**:
+```javascript
+let agentId; // declared outside try so the catch block can reference it
+try {
+  agentId = spawn_agent({ message: "..." });
+  const result = wait({ ids: [agentId], timeout_ms: 2400000 });
+  // ... process result ...
+  close_agent({ id: agentId });
+} catch (error) {
+  // Close the agent on the error path too
+  if (agentId) close_agent({ id: agentId });
+  // Save state for resume capability
+  throw error;
+}
+```
+
+## Progress Tracking Pattern
+
+```javascript
+[
+  {
+    content: "Execute IMPL-001: Generate tests [code-developer]",
+    status: "completed"
+  },
+  {
+    content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]",
+    status: "in_progress"
+  },
+  {
+    content: "  → Iteration 1: Initial test (pass: 70%, conservative)",
+    status: "completed"
+  },
+  {
+    content: "  → Iteration 2: Fix validation (pass: 82%, conservative)",
+    status: "completed"
+  },
+  {
+    content: "  → Iteration 3: Batch fix auth (pass: 89%, aggressive)",
+    status: "in_progress"
+  }
+]
+```
+
+**Update Rules**:
+- Add iteration item with: strategy, pass rate
+- Mark completed after each iteration
+- Update parent task when all complete
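+
+The iteration entries can be derived directly from `iterations[]` in iteration-state.json. This is a minimal sketch: `iterationItems` is a hypothetical helper, and the free-text label shown in the example above (such as "Batch fix auth") is left out because the state file does not carry it.
+
+```javascript
+// Sample history using the fields from iteration-state.json.
+const history = [
+  { iteration: 1, pass_rate: 70, strategy: "conservative" },
+  { iteration: 2, pass_rate: 82, strategy: "conservative" },
+  { iteration: 3, pass_rate: 89, strategy: "aggressive" },
+];
+
+// Build one progress item per iteration; iterations before the active one are completed.
+function iterationItems(iterations, activeIteration) {
+  return iterations.map((it) => ({
+    content: `  → Iteration ${it.iteration} (pass: ${it.pass_rate}%, ${it.strategy})`,
+    status: it.iteration < activeIteration ? "completed" : "in_progress",
+  }));
+}
+
+const items = iterationItems(history, 3);
+```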
+
+## Commit Strategy
+
+**Automatic Commits** (orchestrator-managed):
+
+The orchestrator automatically creates git commits at key checkpoints to enable safe rollback:
+
+1. **After Successful Iteration** (pass rate increased):
+   ```bash
+   git add .
+   git commit -m "test-cycle: iteration ${N} - ${strategy} strategy (pass: ${oldRate}% → ${newRate}%)"
+   ```
+
+2. **On Rollback** (regression detected):
+   ```bash
+   # Undo the last iteration's commit in the working tree without committing yet,
+   # then record the rollback with an explanatory message:
+   git revert --no-commit HEAD
+   git commit -m "test-cycle: rollback iteration ${N} - regression detected (pass: ${newRate}% < ${oldRate}%)"
+   ```
+
+**Commit Content**:
+- Modified source files from fix application
+- Updated test-results.json, iteration-state.json
+- Excludes: temporary files, logs
+
+**Benefits**:
+- **Rollback Safety**: Each iteration is a revert point
+- **Progress Tracking**: Git history shows iteration evolution
+- **Audit Trail**: Clear record of which strategy/iteration caused issues
+- **Resume Capability**: Can resume from any checkpoint
+
+**Note**: Final session completion creates additional commit with full summary.
+
+## Post-Completion Expansion
+
+After completion, ask the user whether to expand the results into issues (test/enhance/refactor/doc). For each selected dimension, create a new issue titled `"{summary} - {dimension}"`.
+
+## Best Practices
+
+1. **Default Settings Work**: 10 iterations sufficient for most cases
+2. **Automatic Commits**: Orchestrator commits after each successful iteration - no manual intervention needed
+3. **Trust Strategy Engine**: Auto-selection based on proven heuristics
+4. **Monitor Logs**: Check `.process/iteration-N-analysis.md` for CLI analysis insights
+5. **Progressive Testing**: Saves 70-90% iteration time automatically
+6. **Always Close Agents**: Ensure close_agent is called after every wait completes, including error paths