feat: migrate test-task-generate to agent-driven architecture

- Refactor test-task-generate.md to use action-planning-agent
- Add two-phase execution flow (Discovery → Agent Execution)
- Integrate Memory-First principle and MCP tool enhancements
- Support both agent-mode (default) and cli-execute-mode
- Add test-specific context package with TEST_ANALYSIS_RESULTS.md
- Align with task-generate-agent.md architecture
- Remove 556 lines of redundant old content (Phase 1-4 old structure)
- Update test-gen.md and test-fix-gen.md to reflect agent-driven changes

Changes include:
- Core Philosophy with agent-driven principles
- Agent Context Package structure for test sessions
- Discovery Actions for test context loading
- Agent Invocation with test-specific requirements
- Test Task Structure Reference
- Updated Integration & Usage sections
- Enhanced Related Commands documentation

Now test task generation uses autonomous agent planning with MCP enhancements for better test coverage analysis and generation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
refactor: update tdd-plan to use --cli-execute parameter
2026-02-05 01:50:27 +08:00 · 2025-11-08 17:23:55 +08:00 · 2025-11-08 17:17:09 +08:00 · 2025-11-08 17:11:17 +08:00 · 2025-11-08 17:03:18 +08:00
7 changed files with 683 additions and 958 deletions
--- a/.claude/commands/workflow/plan.md
+++ b/.claude/commands/workflow/plan.md
@@ -1,7 +1,7 @@
 ---
 name: plan
-description: 5-phase planning workflow with Gemini analysis and action-planning-agent task generation, outputs IMPL_PLAN.md and task JSONs with optional CLI auto-execution
-argument-hint: "[--agent] [--cli-execute] \"text description\"|file.md"
+description: 5-phase planning workflow with action-planning-agent task generation, outputs IMPL_PLAN.md and task JSONs with optional CLI auto-execution
+argument-hint: "[--cli-execute] \"text description\"|file.md"
 allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*)
 ---

@@ -20,7 +20,7 @@ This workflow runs **fully autonomously** once triggered. Phase 3 (conflict reso
 2. **Phase 1 executes** → Session discovery → Auto-continues
 3. **Phase 2 executes** → Context gathering → Auto-continues
 4. **Phase 3 executes** (optional, if conflict_risk ≥ medium) → Conflict resolution → Auto-continues
-5. **Phase 4 executes** (task-generate-agent if --agent) → Task generation → Reports final summary
+5. **Phase 4 executes** → Task generation (task-generate-agent) → Reports final summary

 **Auto-Continue Mechanism**:
 - TodoList tracks current phase status
@@ -28,11 +28,6 @@ This workflow runs **fully autonomously** once triggered. Phase 3 (conflict reso
 - All phases run autonomously without user interaction (clarification handled in brainstorm phase)
 - Progress updates shown at each phase for visibility

-**Execution Modes**:
-- **Manual Mode** (default): Use `/workflow:tools:task-generate`
-- **Agent Mode** (`--agent`): Use `/workflow:tools:task-generate-agent`
-- **CLI Execute Mode** (`--cli-execute`): Generate tasks with Codex execution commands
-
 ## Core Rules

 1. **Start Immediately**: First action is TodoWrite initialization, second action is Phase 1 command execution
@@ -159,24 +154,18 @@ CONTEXT: Existing user database schema, REST API endpoints
 - Task generation translates high-level role analyses into concrete, actionable work items
 - **Intent priority**: Current user prompt > role analysis.md files > guidance-specification.md

-**Command Selection**:
-- Manual: `SlashCommand(command="/workflow:tools:task-generate --session [sessionId]")`
-- Agent: `SlashCommand(command="/workflow:tools:task-generate-agent --session [sessionId]")`
-- CLI Execute: Add `--cli-execute` flag to either command
-
-**Flag Combination**:
-- `--cli-execute` alone: Manual task generation with CLI execution
-- `--agent --cli-execute`: Agent task generation with CLI execution
-
-**Command Examples**:
+**Command**:
 ```bash
-# Manual with CLI execution
-/workflow:tools:task-generate --session WFS-auth --cli-execute
+# Default (agent mode)
+SlashCommand(command="/workflow:tools:task-generate-agent --session [sessionId]")

-# Agent with CLI execution
-/workflow:tools:task-generate-agent --session WFS-auth --cli-execute
+# With CLI execution
+SlashCommand(command="/workflow:tools:task-generate-agent --session [sessionId] --cli-execute")
 ```

+**Flag**:
+- `--cli-execute`: Generate tasks with Codex execution commands
+
 **Input**: `sessionId` from Phase 1

 **Validation**:
@@ -296,7 +285,7 @@ Phase 3: conflict-resolution [AUTO-TRIGGERED if conflict_risk ≥ medium]
    ↓ Output: Modified brainstorm artifacts (NO report file)
    ↓ Skip if conflict_risk is none/low → proceed directly to Phase 4
    ↓
-Phase 4: task-generate[--agent] --session sessionId
+Phase 4: task-generate-agent --session sessionId [--cli-execute]
    ↓ Input: sessionId + resolved brainstorm artifacts + session memory
    ↓ Output: IMPL_PLAN.md, task JSONs, TODO_LIST.md
    ↓
@@ -333,9 +322,8 @@ Return summary to user
 - **If conflict_risk ≥ medium**: Launch Phase 3 conflict-resolution with sessionId and contextPath
 - Wait for Phase 3 completion (if executed), verify CONFLICT_RESOLUTION.md created
 - **If conflict_risk is none/low**: Skip Phase 3, proceed directly to Phase 4
-- **Build Phase 4 command** based on flags:
-  - Base command: `/workflow:tools:task-generate` (or `-agent` if `--agent` flag)
-  - Add `--session [sessionId]`
+- **Build Phase 4 command**:
+  - Base command: `/workflow:tools:task-generate-agent --session [sessionId]`
  - Add `--cli-execute` if flag present
 - Pass session ID to Phase 4 command
 - Verify all Phase 4 outputs
@@ -380,8 +368,7 @@ CONSTRAINTS: [Limitations or boundaries]
 - `/workflow:tools:context-gather` - Phase 2: Gather project context and analyze codebase
 - `/workflow:tools:conflict-resolution` - Phase 3: Detect and resolve conflicts (auto-triggered if conflict_risk ≥ medium)
 - `/compact` - Phase 3: Memory optimization (if context approaching limits)
-- `/workflow:tools:task-generate` - Phase 4: Generate task JSON files with manual approach
-- `/workflow:tools:task-generate-agent` - Phase 4: Generate task JSON files with agent-driven approach (when `--agent` flag used)
+- `/workflow:tools:task-generate-agent` - Phase 4: Generate task JSON files with agent-driven approach

 **Follow-up Commands**:
 - `/workflow:action-plan-verify` - Recommended: Verify plan quality and catch issues before execution
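
The Phase 4 command construction that plan.md settles on in the hunks above (a fixed `task-generate-agent` base plus an optional `--cli-execute`) could be sketched roughly as follows. `buildPhase4Command` and its `options` argument are hypothetical names used only for illustration; only the command path and the two flags come from the diff.

```javascript
// Hypothetical helper: assembles the Phase 4 slash command string.
// The command path, --session, and --cli-execute are taken from the diff;
// the function name and options shape are assumptions for this sketch.
function buildPhase4Command(sessionId, options = {}) {
  if (!sessionId) {
    throw new Error("sessionId from Phase 1 is required");
  }
  const parts = ["/workflow:tools:task-generate-agent", "--session", sessionId];
  if (options.cliExecute) {
    parts.push("--cli-execute");
  }
  return parts.join(" ");
}

console.log(buildPhase4Command("WFS-auth"));
console.log(buildPhase4Command("WFS-auth", { cliExecute: true }));
```

With the `--agent` flag gone there is a single base command, so the only branch left is whether to append `--cli-execute`.
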
--- a/.claude/commands/workflow/tdd-plan.md
+++ b/.claude/commands/workflow/tdd-plan.md
@@ -1,7 +1,7 @@
 ---
 name: tdd-plan
 description: TDD workflow planning with Red-Green-Refactor task chain generation, test-first development structure, and cycle tracking
-argument-hint: "[--agent] \"feature description\"|file.md"
+argument-hint: "[--cli-execute] \"feature description\"|file.md"
 allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*)
 ---

@@ -12,8 +12,8 @@ allowed-tools: SlashCommand(*), TodoWrite(*), Read(*), Bash(*)
 **This command is a pure orchestrator**: Execute 5 slash commands in sequence, parse outputs, pass context, and ensure complete TDD workflow creation.

 **Execution Modes**:
-- **Manual Mode** (default): Use `/workflow:tools:task-generate-tdd`
-- **Agent Mode** (`--agent`): Use `/workflow:tools:task-generate-tdd --agent`
+- **Agent Mode** (default): Use `/workflow:tools:task-generate-tdd` (autonomous agent-driven)
+- **CLI Mode** (`--cli-execute`): Use `/workflow:tools:task-generate-tdd --cli-execute` (Gemini/Qwen)

 ## Core Rules

@@ -129,8 +129,8 @@ TEST_FOCUS: [Test scenarios]

 ### Phase 5: TDD Task Generation
 **Command**:
-- Manual: `/workflow:tools:task-generate-tdd --session [sessionId]`
-- Agent: `/workflow:tools:task-generate-tdd --session [sessionId] --agent`
+- Agent Mode (default): `/workflow:tools:task-generate-tdd --session [sessionId]`
+- CLI Mode (`--cli-execute`): `/workflow:tools:task-generate-tdd --session [sessionId] --cli-execute`

 **Parse**: Extract feature count, task count (not chain count - tasks now contain internal TDD cycles)

@@ -373,8 +373,8 @@ Supports action-planning-agent for more autonomous TDD planning with:
 - `/workflow:tools:test-context-gather` - Phase 3: Analyze existing test patterns and coverage
 - `/workflow:tools:conflict-resolution` - Phase 4: Detect and resolve conflicts (auto-triggered if conflict_risk ≥ medium)
 - `/compact` - Phase 4: Memory optimization (if context approaching limits)
-- `/workflow:tools:task-generate-tdd` - Phase 5: Generate TDD task chains with Red-Green-Refactor cycles
-- `/workflow:tools:task-generate-tdd --agent` - Phase 5: Generate TDD tasks with agent-driven approach (when `--agent` flag used)
+- `/workflow:tools:task-generate-tdd` - Phase 5: Generate TDD tasks with agent-driven approach (default, autonomous)
+- `/workflow:tools:task-generate-tdd --cli-execute` - Phase 5: Generate TDD tasks with CLI tools (Gemini/Qwen, when `--cli-execute` flag used)

 **Follow-up Commands**:
 - `/workflow:action-plan-verify` - Recommended: Verify TDD plan quality and structure before execution
--- a/.claude/commands/workflow/test-fix-gen.md
+++ b/.claude/commands/workflow/test-fix-gen.md
@@ -473,9 +473,9 @@ WFS-test-[session]/
 - `/workflow:tools:test-context-gather` - Phase 2 (Session Mode): Gather source session context
 - `/workflow:tools:context-gather` - Phase 2 (Prompt Mode): Analyze codebase directly
 - `/workflow:tools:test-concept-enhanced` - Phase 3: Generate test requirements using Gemini
-- `/workflow:tools:test-task-generate` - Phase 4: Generate test task JSONs with fix cycle specification
-- `/workflow:tools:test-task-generate --use-codex` - Phase 4: With automated Codex fixes (when `--use-codex` flag used)
-- `/workflow:tools:test-task-generate --cli-execute` - Phase 4: With CLI execution mode (when `--cli-execute` flag used)
+- `/workflow:tools:test-task-generate` - Phase 4: Generate test task JSONs using action-planning-agent (autonomous, default)
+- `/workflow:tools:test-task-generate --use-codex` - Phase 4: With automated Codex fixes for IMPL-002 (when `--use-codex` flag used)
+- `/workflow:tools:test-task-generate --cli-execute` - Phase 4: With CLI execution mode for IMPL-001 test generation (when `--cli-execute` flag used)

 **Follow-up Commands**:
 - `/workflow:status` - Review generated test tasks
--- a/.claude/commands/workflow/test-gen.md
+++ b/.claude/commands/workflow/test-gen.md
@@ -337,9 +337,9 @@ See `/workflow:tools:test-task-generate` for complete JSON schemas.
 - `/workflow:session:start` - Phase 1: Create independent test workflow session
 - `/workflow:tools:test-context-gather` - Phase 2: Analyze test coverage and gather source session context
 - `/workflow:tools:test-concept-enhanced` - Phase 3: Generate test requirements and strategy using Gemini
-- `/workflow:tools:test-task-generate` - Phase 4: Generate test generation and execution task JSONs
-- `/workflow:tools:test-task-generate --use-codex` - Phase 4: With automated Codex fixes (when `--use-codex` flag used)
-- `/workflow:tools:test-task-generate --cli-execute` - Phase 4: With CLI execution mode (when `--cli-execute` flag used)
+- `/workflow:tools:test-task-generate` - Phase 4: Generate test task JSONs using action-planning-agent (autonomous, default)
+- `/workflow:tools:test-task-generate --use-codex` - Phase 4: With automated Codex fixes for IMPL-002 (when `--use-codex` flag used)
+- `/workflow:tools:test-task-generate --cli-execute` - Phase 4: With CLI execution mode for IMPL-001 test generation (when `--cli-execute` flag used)

 **Follow-up Commands**:
 - `/workflow:status` - Review generated test tasks
--- a/.claude/commands/workflow/tools/task-generate-agent.md
+++ b/.claude/commands/workflow/tools/task-generate-agent.md
@@ -163,168 +163,50 @@ If conflict_risk was medium/high, modifications have been applied to:

 ## Phase 2: Document Generation Task

-### Task Decomposition Standards
-**Core Principle**: Task Merging Over Decomposition
-- **Merge Rule**: Execute together when possible
-- **Decompose Only When**:
-  - Excessive workload (>2500 lines or >6 files)
-  - Different tech stacks or domains
-  - Sequential dependency blocking
-  - Parallel execution needed
+**Agent Configuration Reference**: All task generation rules, quantification requirements, quality standards, and execution details are defined in action-planning-agent.

-**Task Limits**:
-- **Maximum 10 tasks** (hard limit)
-- **Function-based**: Complete units (logic + UI + tests + config)
-- **Hierarchy**: Flat (≤5) | Two-level (6-10) | Re-scope (>10)
+Refer to: @.claude/agents/action-planning-agent.md for:
+- Task Decomposition Standards
+- Quantification Requirements (MANDATORY)
+- 5-Field Task JSON Schema
+- IMPL_PLAN.md Structure
+- TODO_LIST.md Format
+- Execution Flow & Quality Validation

-### Quantification Requirements (MANDATORY)
-
-**Purpose**: Eliminate ambiguity by enforcing explicit counts and enumerations in all task specifications.
-
-**Core Rules**:
-1. **Extract Counts from Analysis**: Search for HOW MANY items and list them explicitly
-2. **Enforce Explicit Lists**: Every deliverable uses format `{count} {type}: [{explicit_list}]`
-3. **Make Acceptance Measurable**: Include verification commands (e.g., `ls ... | wc -l = N`)
-4. **Quantify Modification Points**: Specify exact targets (files, functions with line numbers)
-5. **Avoid Vague Language**: Replace "complete", "comprehensive", "reorganize" with quantified statements
-
-**Standard Formats**:
-- **Requirements**: `"Implement N items: [item1, item2, ...]"` or `"Modify N files: [file1:func:lines, ...]"`
-- **Acceptance**: `"N items exist: verify by [command]"` or `"Coverage >= X%: verify by [test command]"`
-- **Modification Points**: `"Create N files: [list]"` or `"Modify N functions: [func() in file lines X-Y]"`
-
-**Validation Checklist**:
-- [ ] Every requirement contains explicit count or enumerated list
-- [ ] Every acceptance criterion is measurable with verification command
-- [ ] Every modification_point specifies exact targets (files/functions/lines)
-- [ ] No vague language ("complete", "comprehensive", "reorganize" without counts)
-- [ ] Each implementation step has its own acceptance criteria
-
-### Required Outputs
+### Required Outputs Summary

 #### 1. Task JSON Files (.task/IMPL-*.json)
-
-**Location**: `.workflow/{session-id}/.task/`
-**Template Path**: Provided by command (agent-mode or cli-mode template)
-
-**Key Responsibilities**:
-- Read template from provided path: `Read({template_path})`
-- Replace placeholder variables with session-specific paths
-- Include MCP tool integration in `pre_analysis` steps
-- Map artifacts based on task domain (UI → ui-designer, Backend → system-architect)
-- Apply quantification requirements to all task fields
-- Ensure all tasks follow template structure exactly
-
-**Template Selection** (Pre-selected by command):
-- **Agent Mode**: `~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt`
-- **CLI Mode**: `~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt`
-
-**Note**: Agent does NOT choose template - it's pre-selected based on `--cli-execute` flag and provided in context
+- **Location**: `.workflow/{session-id}/.task/`
+- **Template**: Read from `{template_path}` (pre-selected by command based on `--cli-execute` flag)
+- **Schema**: 5-field structure (id, title, status, meta, context, flow_control) with artifacts integration
+- **Details**: See action-planning-agent.md § Task JSON Generation

 #### 2. IMPL_PLAN.md
-**Location**: .workflow/{session-id}/IMPL_PLAN.md
-
-**IMPL_PLAN Template**:
-\`\`\`
-$(cat ~/.claude/workflows/cli-templates/prompts/workflow/impl-plan-template.txt)
-\`\`\`
-
-**Important**:
-- Use the template above for IMPL_PLAN.md generation
-- Replace all {placeholder} variables with actual session-specific values
-- Populate CCW Workflow Context based on actual phase progression
-- Extract content from role analyses and context-package.json
-- List all detected brainstorming artifacts with correct paths (role analyses, guidance-specification.md)
-- Include conflict resolution status if CONFLICT_RESOLUTION.md exists
+- **Location**: `.workflow/{session-id}/IMPL_PLAN.md`
+- **Template**: `~/.claude/workflows/cli-templates/prompts/workflow/impl-plan-template.txt`
+- **Details**: See action-planning-agent.md § Implementation Plan Creation

 #### 3. TODO_LIST.md
-**Location**: .workflow/{session-id}/TODO_LIST.md
-**Structure**:
-\`\`\`markdown
-# Tasks: {Session Topic}
+- **Location**: `.workflow/{session-id}/TODO_LIST.md`
+- **Format**: Hierarchical task list with status indicators (▸, [ ], [x]) and JSON links
+- **Details**: See action-planning-agent.md § TODO List Generation

-## Task Progress
-▸ **IMPL-001**: [Main Task Group] → [📋](./.task/IMPL-001.json)
-  - [ ] **IMPL-001.1**: [Subtask] → [📋](./.task/IMPL-001.1.json)
-  - [ ] **IMPL-001.2**: [Subtask] → [📋](./.task/IMPL-001.2.json)
+### Agent Execution Summary

-- [ ] **IMPL-002**: [Simple Task] → [📋](./.task/IMPL-002.json)
+**Key Steps** (Detailed instructions in action-planning-agent.md):
+1. Load task JSON template from provided path
+2. Extract and decompose tasks with quantification
+3. Generate task JSON files enforcing quantification requirements
+4. Create IMPL_PLAN.md using template
+5. Generate TODO_LIST.md matching task JSONs
+6. Update session state

-## Status Legend
-- \`▸\` = Container task (has subtasks)
-- \`- [ ]\` = Pending leaf task
-- \`- [x]\` = Completed leaf task
-\`\`\`
-
-### Execution Instructions for Agent
-
-**Agent Task**: Generate task JSON files, IMPL_PLAN.md, and TODO_LIST.md based on analysis results
-
-**Note**: The correct task JSON template path has been pre-selected by the command based on the `--cli-execute` flag and is provided in the context as `{template_path}`.
-
-**Step 1: Load Task JSON Template**
-- Read template from the provided path: `Read({template_path})`
-- This template is already the correct one based on execution mode
-
-**Step 2: Extract and Decompose Tasks (WITH QUANTIFICATION)**
-- Parse role analysis.md files for requirements, design specs, and task recommendations
-- **CRITICAL: Apply Quantification Extraction Process**:
-  - Scan for counts: numbers + nouns (e.g., "5 files", "17 commands", "3 features")
-  - Build explicit lists for each deliverable (no "..." unless list >20 items)
-  - Flag vague language ("complete", "comprehensive", "reorganize") for replacement
-  - Extract verification methods for each deliverable
-- Review synthesis enhancements and clarifications in role analyses
-- Apply conflict resolution strategies (if CONFLICT_RESOLUTION.md exists)
-- Apply task merging rules (merge when possible, decompose only when necessary)
-- Map artifacts to tasks based on domain (UI → ui-designer, Backend → system-architect, Data → data-architect)
-- Ensure task count ≤10
-
-**Step 3: Generate Task JSON Files (ENFORCE QUANTIFICATION)**
-- Use the template structure from Step 1
-- Create .task/IMPL-*.json files with proper structure
-- **MANDATORY: Apply Quantification Formats**:
-  - Every requirement: \`{count} {type}: [{explicit_list}]\`
-  - Every acceptance: Measurable with verification command
-  - Every modification_point: Exact targets (files/functions/lines)
-  - NO vague language in any field
-- Replace all {placeholder} variables with actual session paths
-- Embed artifacts array with brainstorming outputs
-- Include MCP tool integration in pre_analysis steps
-- **Validation**: Run checklist from Quantification Requirements section before writing files
-
-**Step 4: Create IMPL_PLAN.md**
-- Use IMPL_PLAN template
-- Populate all sections with session-specific content
-- List artifacts with priorities and usage guidelines
-- Document execution strategy and dependencies
-
-**Step 5: Generate TODO_LIST.md**
-- Create task progress checklist matching generated JSONs
-- Use proper status indicators (▸, [ ], [x])
-- Link to task JSON files
-
-**Step 6: Update Session State**
-- Update workflow-session.json with task count and artifact inventory
-- Mark session ready for execution
-
-### MCP Enhancement (Optional)
-
-**Code Analysis**: Use `find`, `rg` for file discovery and pattern search
-**External Research**: Use `mcp__exa__get_code_context_exa` for best practices and API examples
-
-### Quality Validation
-
-Before completion, verify:
-- [ ] All task JSON files created in .task/ directory
-- [ ] Each task JSON has 5 required fields
-- [ ] Artifact references correctly mapped
-- [ ] Flow control includes artifact loading steps
-- [ ] MCP tool integration added where appropriate
-- [ ] IMPL_PLAN.md follows required structure
-- [ ] TODO_LIST.md matches task JSONs
-- [ ] Dependency graph is acyclic
-- [ ] Task count within limits (≤10)
-- [ ] Session state updated
+**Quality Gates** (Full checklist in action-planning-agent.md):
+- ✓ Quantification requirements enforced (explicit counts, measurable acceptance, exact targets)
+- ✓ Task count ≤10 (hard limit)
+- ✓ Artifact references mapped correctly
+- ✓ MCP tool integration added
+- ✓ Documents follow template structure

 ## Output

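
Two of the quality gates summarized in the hunk above (the task-count limit and the presence of the top-level task fields) lend themselves to a mechanical check. The sketch below is only an illustration under stated assumptions: `checkQualityGates` is a hypothetical name, the field names are copied from the "Schema" bullet in the diff, the `"pending"` status value is made up for the sample, and the authoritative checklist lives in action-planning-agent.md, which is not shown here.

```javascript
// Top-level field names as listed in the "Schema" bullet of the diff.
const REQUIRED_FIELDS = ["id", "title", "status", "meta", "context", "flow_control"];
const MAX_TASKS = 10; // "Task count ≤10 (hard limit)"

// Hypothetical checker: returns a list of human-readable problems,
// or an empty array when both gates pass.
function checkQualityGates(tasks) {
  const problems = [];
  if (tasks.length > MAX_TASKS) {
    problems.push(`task count ${tasks.length} exceeds limit ${MAX_TASKS}`);
  }
  for (const task of tasks) {
    for (const field of REQUIRED_FIELDS) {
      if (!(field in task)) {
        problems.push(`${task.id ?? "<unknown>"}: missing field "${field}"`);
      }
    }
  }
  return problems;
}

// Sample data is illustrative only; the status value is an assumption.
const sample = [
  { id: "IMPL-001", title: "Example", status: "pending", meta: {}, context: {}, flow_control: {} },
  { id: "IMPL-002", title: "Incomplete" },
];
console.log(checkQualityGates(sample));
```

The remaining gates (quantification, artifact mapping, MCP integration, template conformance) depend on content that the referenced agent document defines, so they are left out of this sketch.
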
--- a/.claude/commands/workflow/tools/task-generate-tdd.md
+++ b/.claude/commands/workflow/tools/task-generate-tdd.md
@@ -1,14 +1,28 @@
 ---
 name: task-generate-tdd
-description: Generate TDD task chains with Red-Green-Refactor dependencies, test-first structure, and cycle validation
-argument-hint: "--session WFS-session-id [--agent]"
-allowed-tools: Read(*), Write(*), Bash(gemini:*), TodoWrite(*)
+description: Autonomous TDD task generation using action-planning-agent with Red-Green-Refactor cycles, test-first structure, and cycle validation
+argument-hint: "--session WFS-session-id [--cli-execute]"
+examples:
+  - /workflow:tools:task-generate-tdd --session WFS-auth
+  - /workflow:tools:task-generate-tdd --session WFS-auth --cli-execute
 ---

-# TDD Task Generation Command
+# Autonomous TDD Task Generation Command

 ## Overview
-Generate TDD-specific tasks from analysis results with complete Red-Green-Refactor cycles contained within each task.
+Autonomous TDD task JSON and IMPL_PLAN.md generation using action-planning-agent with two-phase execution: discovery and document generation. Supports both agent-driven execution (default) and CLI tool execution modes. Generates complete Red-Green-Refactor cycles contained within each task.
+
+## Core Philosophy
+- **Agent-Driven**: Delegate execution to action-planning-agent for autonomous operation
+- **Two-Phase Flow**: Discovery (context gathering) → Output (document generation)
+- **Memory-First**: Reuse loaded documents from conversation memory
+- **MCP-Enhanced**: Use MCP tools for advanced code analysis and research
+- **Pre-Selected Templates**: Command selects correct TDD template based on `--cli-execute` flag **before** invoking agent
+- **Agent Simplicity**: Agent receives pre-selected template and focuses only on content generation
+- **Path Clarity**: All `focus_paths` prefer absolute paths (e.g., `D:\\project\\src\\module`), or clear relative paths from project root (e.g., `./src/module`)
+- **TDD-First**: Every feature starts with a failing test (Red phase)
+- **Feature-Complete Tasks**: Each task contains complete Red-Green-Refactor cycle
+- **Quantification-Enforced**: All test cases, coverage requirements, and implementation scope MUST include explicit counts and enumerations

 ## Task Strategy & Philosophy

@@ -44,22 +58,220 @@ Generate TDD-specific tasks from analysis results with complete Red-Green-Refact
 - **Current approach**: 1 feature = 1 task (IMPL-N with internal Red-Green-Refactor phases)
 - **Complex features**: 1 container (IMPL-N) + subtasks (IMPL-N.M) when necessary

-### Core Principles
-- **TDD-First**: Every feature starts with a failing test (Red phase)
-- **Feature-Complete Tasks**: Each task contains complete Red-Green-Refactor cycle
-- **Phase-Explicit**: Internal phases clearly marked in flow_control.implementation_approach
-- **Task Merging**: Prefer single task per feature over decomposition
-- **Path Clarity**: All `focus_paths` prefer absolute paths (e.g., `D:\\project\\src\\module`), or clear relative paths from project root (e.g., `./src/module`)
-- **Artifact-Aware**: Integrates brainstorming outputs
-- **Memory-First**: Reuse loaded documents from memory
-- **Context-Aware**: Analyzes existing codebase and test patterns
-- **Iterative Green Phase**: Auto-diagnose and fix test failures with Gemini + optional Codex
-- **Safety-First**: Auto-revert on max iterations to prevent broken state
-- **Quantification-Enforced**: All test cases, coverage requirements, and implementation scope MUST include explicit counts and enumerations (e.g., "15 test cases: [test1, test2, ...]" not "comprehensive tests")
+## Execution Lifecycle

-## Quantification Requirements for TDD (MANDATORY)
+### Phase 1: Discovery & Context Loading
+**⚡ Memory-First Rule**: Skip file loading if documents already in conversation memory

-**Purpose**: Eliminate ambiguity by enforcing explicit test case counts, coverage metrics, and implementation scope.
+**Agent Context Package**:
+```javascript
+{
+  "session_id": "WFS-[session-id]",
+  "execution_mode": "agent-mode" | "cli-execute-mode",  // Determined by flag
+  "task_json_template_path": "~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt"
+                           | "~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt",
+  // Path selected by command based on --cli-execute flag, agent reads it
+  "workflow_type": "tdd",
+  "session_metadata": {
+    // If in memory: use cached content
+    // Else: Load from .workflow/{session-id}/workflow-session.json
+  },
+  "brainstorm_artifacts": {
+    // Loaded from context-package.json → brainstorm_artifacts section
+    "role_analyses": [
+      {
+        "role": "system-architect",
+        "files": [{"path": "...", "type": "primary|supplementary"}]
+      }
+    ],
+    "guidance_specification": {"path": "...", "exists": true},
+    "synthesis_output": {"path": "...", "exists": true},
+    "conflict_resolution": {"path": "...", "exists": true}  // if conflict_risk >= medium
+  },
+  "context_package_path": ".workflow/{session-id}/.process/context-package.json",
+  "context_package": {
+    // If in memory: use cached content
+    // Else: Load from .workflow/{session-id}/.process/context-package.json
+  },
+  "test_context_package_path": ".workflow/{session-id}/.process/test-context-package.json",
+  "test_context_package": {
+    // Existing test patterns and coverage analysis
+  },
+  "mcp_capabilities": {
+    "code_index": true,
+    "exa_code": true,
+    "exa_web": true
+  }
+}
+```
+
+**Discovery Actions**:
+1. **Load Session Context** (if not in memory)
+   ```javascript
+   if (!memory.has("workflow-session.json")) {
+     Read(.workflow/{session-id}/workflow-session.json)
+   }
+   ```
+
+2. **Load Context Package** (if not in memory)
+   ```javascript
+   if (!memory.has("context-package.json")) {
+     Read(.workflow/{session-id}/.process/context-package.json)
+   }
+   ```
+
+3. **Load Test Context Package** (if not in memory)
+   ```javascript
+   if (!memory.has("test-context-package.json")) {
+     Read(.workflow/{session-id}/.process/test-context-package.json)
+   }
+   ```
+
+4. **Extract & Load Role Analyses** (from context-package.json)
+   ```javascript
+   // Extract role analysis paths from context package
+   const roleAnalysisPaths = contextPackage.brainstorm_artifacts.role_analyses
+     .flatMap(role => role.files.map(f => f.path));
+
+   // Load each role analysis file
+   roleAnalysisPaths.forEach(path => Read(path));
+   ```
+
+5. **Load Conflict Resolution** (from context-package.json, if exists)
+   ```javascript
+   if (contextPackage.brainstorm_artifacts.conflict_resolution?.exists) {
+     Read(contextPackage.brainstorm_artifacts.conflict_resolution.path)
+   }
+   ```
+
+6. **Code Analysis with Native Tools** (optional - enhance understanding)
+   ```bash
+   # Find relevant test files and patterns
+   find . -name "*test*" -type f
+   rg "describe|it\(|test\(" -g "*.ts"
+   ```
+
+7. **MCP External Research** (optional - gather TDD best practices)
+   ```javascript
+   // Get external TDD examples and patterns
+   mcp__exa__get_code_context_exa(
+     query="TypeScript TDD best practices Red-Green-Refactor",
+     tokensNum="dynamic"
+   )
+   ```
+
+### Phase 2: Agent Execution (Document Generation)
+
+**Pre-Agent Template Selection** (Command decides path before invoking agent):
+```javascript
+// Command checks flag and selects template PATH (not content)
+const templatePath = hasCliExecuteFlag
+  ? "~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt"
+  : "~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt";
+```
+
+**Agent Invocation**:
+```javascript
+Task(
+  subagent_type="action-planning-agent",
+  description="Generate TDD task JSON and implementation plan",
+  prompt=`
+## Execution Context
+
+**Session ID**: WFS-{session-id}
+**Workflow Type**: TDD
+**Execution Mode**: {agent-mode | cli-execute-mode}
+**Task JSON Template Path**: {template_path}
+
+## Phase 1: Discovery Results (Provided Context)
+
+### Session Metadata
+{session_metadata_content}
+
+### Role Analyses (Enhanced by Synthesis)
+{role_analyses_content}
+- Includes requirements, design specs, enhancements, and clarifications from synthesis phase
+
+### Artifacts Inventory
+- **Guidance Specification**: {guidance_spec_path}
+- **Role Analyses**: {role_analyses_list}
+
+### Context Package
+{context_package_summary}
+- Includes conflict_risk assessment
+
+### Test Context Package
+{test_context_package_summary}
+- Existing test patterns, framework config, coverage analysis
+
+### Conflict Resolution (Conditional)
+If conflict_risk was medium/high, modifications have been applied to:
+- **guidance-specification.md**: Design decisions updated to resolve conflicts
+- **Role analyses (*.md)**: Recommendations adjusted for compatibility
+- **context-package.json**: Marked as "resolved" with conflict IDs
+- NO separate CONFLICT_RESOLUTION.md file (conflicts resolved in-place)
+
+### MCP Analysis Results (Optional)
+**Code Structure**: {mcp_code_index_results}
+**External Research**: {mcp_exa_research_results}
+
+## Phase 2: TDD Document Generation Task
+
+**Agent Configuration Reference**: All TDD task generation rules, quantification requirements, Red-Green-Refactor cycle structure, quality standards, and execution details are defined in action-planning-agent.
+
+Refer to: @.claude/agents/action-planning-agent.md for:
+- TDD Task Decomposition Standards
+- Red-Green-Refactor Cycle Requirements
+- Quantification Requirements (MANDATORY)
+- 5-Field Task JSON Schema
+- IMPL_PLAN.md Structure (TDD variant)
+- TODO_LIST.md Format
+- TDD Execution Flow & Quality Validation
+
+### TDD-Specific Requirements Summary
+
+#### Task Structure Philosophy
+- **1 feature = 1 task** containing complete TDD cycle internally
+- Each task executes Red-Green-Refactor phases sequentially
+- Task count = Feature count (typically 5 features = 5 tasks)
+- Subtasks only when complexity >2500 lines or >6 files per cycle
+- **Maximum 10 tasks** (hard limit for TDD workflows)
+
+#### TDD Cycle Mapping
+- **Simple features**: IMPL-N with internal Red-Green-Refactor phases
+- **Complex features**: IMPL-N (container) + IMPL-N.M (subtasks)
+- Each cycle includes: test_count, test_cases array, implementation_scope, expected_coverage
+
+#### Required Outputs Summary
+
+##### 1. TDD Task JSON Files (.task/IMPL-*.json)
+- **Location**: `.workflow/{session-id}/.task/`
+- **Template**: Read from `{template_path}` (pre-selected by command based on `--cli-execute` flag)
+- **Schema**: 5-field structure with TDD-specific metadata
+  - `meta.tdd_workflow`: true (REQUIRED)
+  - `meta.max_iterations`: 3 (Green phase test-fix cycle limit)
+  - `meta.use_codex`: false (manual fixes by default)
+  - `context.tdd_cycles`: Array with quantified test cases and coverage
+  - `flow_control.implementation_approach`: Exactly 3 steps with `tdd_phase` field
+    1. Red Phase (`tdd_phase: "red"`): Write failing tests
+    2. Green Phase (`tdd_phase: "green"`): Implement to pass tests
+    3. Refactor Phase (`tdd_phase: "refactor"`): Improve code quality
+- **Details**: See action-planning-agent.md § TDD Task JSON Generation
+
+##### 2. IMPL_PLAN.md (TDD Variant)
+- **Location**: `.workflow/{session-id}/IMPL_PLAN.md`
+- **Template**: `~/.claude/workflows/cli-templates/prompts/workflow/impl-plan-template.txt`
+- **TDD-Specific Frontmatter**: workflow_type="tdd", tdd_workflow=true, feature_count, task_breakdown
+- **TDD Implementation Tasks Section**: Feature-by-feature with internal Red-Green-Refactor cycles
+- **Details**: See action-planning-agent.md § TDD Implementation Plan Creation
+
+##### 3. TODO_LIST.md
+- **Location**: `.workflow/{session-id}/TODO_LIST.md`
+- **Format**: Hierarchical task list with internal TDD phase indicators (Red → Green → Refactor)
+- **Status**: ▸ (container), [ ] (pending), [x] (completed)
+- **Details**: See action-planning-agent.md § TODO List Generation
+
+### Quantification Requirements (MANDATORY)

 **Core Rules**:
 1. **Explicit Test Case Counts**: Red phase specifies exact number with enumerated list
@@ -68,12 +280,10 @@ Generate TDD-specific tasks from analysis results with complete Red-Green-Refact
 4. **Enumerated Refactoring Targets**: Refactor phase lists specific improvements with counts

 **TDD Phase Formats**:
- **Red Phase**: `"Write N test cases: [test1, test2, ...]"`
- **Green Phase**: `"Implement N functions in file lines X-Y: [func1() X1-Y1, func2() X2-Y2, ...]"`
- **Refactor Phase**: `"Apply N refactorings: [improvement1 (details), improvement2 (details), ...]"`
- **Acceptance**: `"All N tests pass with >=X% coverage: verify by [test command]"`
-
-**TDD Cycles Array**: Each cycle must include `test_count`, `test_cases` array, `implementation_scope`, and `expected_coverage`
+- **Red Phase**: "Write N test cases: [test1, test2, ...]"
+- **Green Phase**: "Implement N functions in file lines X-Y: [func1() X1-Y1, func2() X2-Y2, ...]"
+- **Refactor Phase**: "Apply N refactorings: [improvement1 (details), improvement2 (details), ...]"
+- **Acceptance**: "All N tests pass with >=X% coverage: verify by [test command]"

 **Validation Checklist**:
 - [ ] Every Red phase specifies exact test case count with enumerated list
@@ -83,184 +293,94 @@ Generate TDD-specific tasks from analysis results with complete Red-Green-Refact
 - [ ] tdd_cycles array contains test_count and test_cases for each cycle
 - [ ] No vague language ("comprehensive", "complete", "thorough")

-## Core Responsibilities
- Parse analysis results and identify testable features
- Generate feature-complete tasks with internal TDD cycles (1 task per simple feature)
- Apply task merging strategy by default, create subtasks only when complexity requires
- Generate IMPL_PLAN.md with TDD Implementation Tasks section
- Generate TODO_LIST.md with internal TDD phase indicators
- Update session state for TDD execution with task count compliance
+### Agent Execution Summary

-## Execution Lifecycle
+**Key Steps** (Detailed instructions in action-planning-agent.md):
+1. Load task JSON template from provided path
+2. Extract and decompose features with TDD cycles
+3. Generate TDD task JSON files enforcing quantification requirements
+4. Create IMPL_PLAN.md using TDD template variant
+5. Generate TODO_LIST.md with TDD phase indicators
+6. Update session state with TDD metadata

-### Phase 1: Input Validation & Discovery
-**Memory-First Rule**: Skip file loading if documents already in conversation memory
+**Quality Gates** (Full checklist in action-planning-agent.md):
+- ✓ Quantification requirements enforced (explicit counts, measurable acceptance, exact targets)
+- ✓ Task count ≤10 (hard limit)
+- ✓ Each task has meta.tdd_workflow: true
+- ✓ Each task has exactly 3 implementation steps with tdd_phase field
+- ✓ Green phase includes test-fix cycle logic
+- ✓ Artifact references mapped correctly
+- ✓ MCP tool integration added
+- ✓ Documents follow TDD template structure

-1. **Session Validation**
-   - If session metadata in memory → Skip loading
-   - Else: Load `.workflow/{session_id}/workflow-session.json`
+## Output

-2. **Conflict Resolution Check** (NEW - Priority Input)
-   - If CONFLICT_RESOLUTION.md exists → Load selected strategies
-   - Else: Skip to brainstorming artifacts
-   - Path: `.workflow/{session_id}/.process/CONFLICT_RESOLUTION.md`
+Generate all three documents and report completion status:
+- TDD task JSON files created: N files (IMPL-*.json)
+- TDD cycles configured: N cycles with quantified test cases
+- Artifacts integrated: synthesis-spec, guidance-specification, N role analyses
+- Test context integrated: existing patterns and coverage
+- MCP enhancements: code-index, exa-research
+- Session ready for TDD execution: /workflow:execute
+`
+)
+```

-3. **Artifact Discovery**
-   - If artifact inventory in memory → Skip scanning
-   - Else: Scan `.workflow/{session_id}/.brainstorming/` directory
-   - Detect: role analysis documents, guidance-specification.md, role analyses
+### Agent Context Passing

-4. **Context Package Loading**
-   - Load `.workflow/{session_id}/.process/context-package.json`
-   - Load `.workflow/{session_id}/.process/test-context-package.json` (if exists)
+**Memory-Aware Context Assembly**:
+```javascript
+// Assemble context package for agent
+const agentContext = {
+  session_id: "WFS-[id]",
+  workflow_type: "tdd",

-### Phase 2: TDD Task JSON Generation
+  // Use memory if available, else load
+  session_metadata: memory.has("workflow-session.json")
+    ? memory.get("workflow-session.json")
+    : Read(".workflow/WFS-[id]/workflow-session.json"),

-**Input Sources** (priority order):
-1. **Conflict Resolution** (if exists): `.process/CONFLICT_RESOLUTION.md` - Selected resolution strategies
-2. **Brainstorming Artifacts**: Role analysis documents (system-architect, product-owner, etc.)
-3. **Context Package**: `.process/context-package.json` - Project structure and requirements
-4. **Test Context**: `.process/test-context-package.json` - Existing test patterns
+  context_package_path: ".workflow/WFS-[id]/.process/context-package.json",

-**TDD Task Structure includes**:
- Feature list with testable requirements
- Test cases for Red phase
- Implementation requirements for Green phase (with test-fix cycle)
- Refactoring opportunities
- Task dependencies and execution order
- Conflict resolution decisions (if applicable)
+  context_package: memory.has("context-package.json")
+    ? memory.get("context-package.json")
+    : Read(".workflow/WFS-[id]/.process/context-package.json"),

-### Phase 3: Task JSON & IMPL_PLAN.md Generation
+  test_context_package_path: ".workflow/WFS-[id]/.process/test-context-package.json",

-#### Task Structure (Feature-Complete with Internal TDD)
-For each feature, generate task(s) with ID format:
- **IMPL-N** - Single task containing complete TDD cycle (Red-Green-Refactor)
- **IMPL-N.M** - Sub-tasks only when feature is complex (>2500 lines or technical blocking)
+  test_context_package: memory.has("test-context-package.json")
+    ? memory.get("test-context-package.json")
+    : Read(".workflow/WFS-[id]/.process/test-context-package.json"),

-**Task Dependency Rules**:
- **Sequential features**: IMPL-2 depends_on ["IMPL-1"] if Feature 2 needs Feature 1
- **Independent features**: No dependencies, can execute in parallel
- **Complex features**: IMPL-N.2 depends_on ["IMPL-N.1"] for subtask ordering
+  // Extract brainstorm artifacts from context package
+  brainstorm_artifacts: extractBrainstormArtifacts(context_package),

-**Agent Assignment**:
- **All IMPL tasks** → `@code-developer` (handles full TDD cycle)
- Agent executes Red, Green, Refactor phases sequentially within task
+  // Load role analyses using paths from context package
+  role_analyses: brainstorm_artifacts.role_analyses
+    .flatMap(role => role.files)
+    .map(file => Read(file.path)),

-**Meta Fields**:
- `meta.type`: "feature" (TDD-driven feature implementation)
- `meta.agent`: "@code-developer"
- `meta.tdd_workflow`: true (enables TDD-specific flow)
- `meta.tdd_phase`: Not used (phases are in flow_control.implementation_approach)
- `meta.max_iterations`: 3 (for Green phase test-fix cycle)
- `meta.use_codex`: false (manual fixes by default)
+  // Load conflict resolution if exists (from context package)
+  conflict_resolution: brainstorm_artifacts.conflict_resolution?.exists
+    ? Read(brainstorm_artifacts.conflict_resolution.path)
+    : null,

-#### Task JSON Structure Reference
-
-Each TDD task JSON contains complete Red-Green-Refactor cycle with these key fields:
-
-**Top-Level Fields**:
- `id`: Task identifier (`IMPL-N` or `IMPL-N.M` for subtasks)
- `title`: Feature description with TDD
- `status`: `pending | in_progress | completed | container`
- `context_package_path`: Path to context package
- `meta`: TDD-specific metadata
- `context`: Requirements, cycles, paths, acceptance
- `flow_control`: Pre-analysis, 3 TDD phases, post-completion, error handling
-
-**Meta Object (TDD-Specific)**:
- `type`: "feature"
- `agent`: "@code-developer"
- `tdd_workflow`: `true` (REQUIRED - enables TDD flow)
- `max_iterations`: Green phase test-fix cycle limit (default: 3)
- `use_codex`: `false` (manual fixes) or `true` (Codex automated fixes)
-
-**Context Object**:
- `requirements`: Quantified feature requirements with TDD phase details
- `tdd_cycles`: Array of test cycles (each with `test_count`, `test_cases`, `implementation_scope`, `expected_coverage`)
- `focus_paths`: Target directories (absolute or relative from project root)
- `acceptance`: Measurable success criteria with verification commands
- `depends_on`: Task dependencies
- `parent`: Parent task ID (for subtasks only)
-
-**Flow Control Object**:
- `pre_analysis`: Optional pre-execution checks
- `implementation_approach`: Exactly 3 steps with `tdd_phase` field:
-  1. **Red Phase** (`tdd_phase: "red"`): Write failing tests
-  2. **Green Phase** (`tdd_phase: "green"`): Implement to pass tests (includes test-fix cycle)
-  3. **Refactor Phase** (`tdd_phase: "refactor"`): Improve code quality
- `post_completion`: Optional final verification
- `error_handling`: Error recovery strategies (e.g., auto-revert on max iterations)
-
-**Implementation Approach Step Structure**:
-Each step includes:
- `step`: Step number
- `title`: Phase description
- `tdd_phase`: Phase identifier ("red" | "green" | "refactor")
- `description`: Detailed phase description
- `modification_points`: Quantified changes to make
- `logic_flow`: Step-by-step execution logic
- `acceptance`: Phase-specific acceptance criteria
- `command`: Test/verification command (optional)
- `depends_on`: Previous step dependencies
- `output`: Step output identifier
-
-#### IMPL_PLAN.md Structure
-
-**Frontmatter** (TDD-specific fields):
- `workflow_type`: "tdd"
- `tdd_workflow`: true
- `feature_count`, `task_count` (≤10 total)
- `task_breakdown`: simple_features, complex_features, total_subtasks
- `test_context`: Path to test-context-package.json (if exists)
- `conflict_resolution`: Path to CONFLICT_RESOLUTION.md (if exists)
- `verification_history`, `phase_progression`
-
-**8 Sections**:
-1. **Summary**: Core requirements, TDD-specific approach
-2. **Context Analysis**: CCW workflow context, project profile, module structure, dependencies
-3. **Brainstorming Artifacts Reference**: Artifact usage strategy, priority order
-4. **Implementation Strategy**: TDD cycles (Red-Green-Refactor), architectural approach, testing strategy
-5. **TDD Implementation Tasks**: Feature-by-feature tasks with internal TDD cycles, dependencies
-6. **Implementation Plan**: Phased breakdown, resource requirements
-7. **Risk Assessment & Mitigation**: Risk table, TDD-specific risks, monitoring
-8. **Success Criteria**: Functional completeness, technical quality (≥80% coverage), TDD compliance
-
-### Phase 4: TODO_LIST.md Generation
-
-Generate task list with internal TDD phase indicators:
-
-**Structure**:
- Simple features: `- [ ] **IMPL-N**: Feature with TDD` (Internal phases: Red → Green → Refactor)
- Complex features: `▸ **IMPL-N**: Container` with subtasks `- [ ] **IMPL-N.M**: Sub-feature`
-
-**Status Legend**:
- `▸` = Container task (has subtasks)
- `[ ]` = Pending | `[x]` = Completed
- Red → Green → Refactor = TDD phases
-
-### Phase 5: Session State Update
-
-Update workflow-session.json with TDD metadata:
-```json
-{
-  "workflow_type": "tdd",
-  "feature_count": 5,
-  "task_count": 5,
-  "task_breakdown": {
-    "simple_features": 4,
-    "complex_features": 1,
-    "total_subtasks": 2
-  },
-  "tdd_workflow": true,
-  "task_limit_compliance": true
+  // Optional MCP enhancements
+  mcp_analysis: executeMcpDiscovery()
 }
 ```

-**Task Count Calculation**:
- **Simple features**: 1 task each (IMPL-N with internal TDD cycle)
- **Complex features**: 1 container + M subtasks (IMPL-N + IMPL-N.M)
- **Total**: Simple feature count + Complex feature subtask count
- **Example**: 4 simple + 1 complex (with 2 subtasks) = 6 total tasks (not 15)
+## TDD Task Structure Reference
+
+This section provides a quick reference for the TDD task JSON structure. For complete implementation details, see the agent invocation prompt in Phase 2 above.
+
+**Quick Reference**:
+- Each TDD task contains a complete Red-Green-Refactor cycle
+- Task ID format: `IMPL-N` (simple) or `IMPL-N.M` (complex subtasks)
+- Required metadata: `meta.tdd_workflow: true`, `meta.max_iterations: 3`
+- Flow control: Exactly 3 steps with `tdd_phase` field (red, green, refactor)
+- Context: `tdd_cycles` array with quantified test cases and coverage
+- See Phase 2 agent prompt for full schema and requirements

 ## Output Files Structure
 ```
@@ -331,23 +451,28 @@ Update workflow-session.json with TDD metadata:

 **Command Chain**:
 - Called by: `/workflow:tdd-plan` (Phase 4)
- Calls: Gemini CLI for TDD breakdown
+- Invokes: `action-planning-agent` for autonomous task generation
 - Followed by: `/workflow:execute`, `/workflow:tdd-verify`

 **Basic Usage**:
 ```bash
-# Manual mode (default)
+# Agent mode (default, autonomous execution)
 /workflow:tools:task-generate-tdd --session WFS-auth

-# Agent mode (autonomous task generation)
-/workflow:tools:task-generate-tdd --session WFS-auth --agent
+# CLI tool mode (use Gemini/Qwen for generation)
+/workflow:tools:task-generate-tdd --session WFS-auth --cli-execute
 ```

+**Execution Modes**:
+- **Agent mode** (default): Uses `action-planning-agent` with agent-mode task template
+- **CLI mode** (`--cli-execute`): Uses Gemini/Qwen with cli-mode task template
+
 **Output**:
- Task JSON files in `.task/` directory (IMPL-N.json format)
+- TDD task JSON files in `.task/` directory (IMPL-N.json format)
 - IMPL_PLAN.md with TDD Implementation Tasks section
 - TODO_LIST.md with internal TDD phase indicators
 - Session state updated with task count and TDD metadata
+- MCP enhancements integrated (if available)

 ## Test Coverage Analysis Integration

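The quality gates that task-generate-tdd.md lists for TDD task JSON (`meta.tdd_workflow: true`, exactly 3 `implementation_approach` steps tagged red, green, refactor, quantified `tdd_cycles`, and at most 10 tasks) could be checked mechanically. The following is a minimal sketch, not code from this repository: `checkTddTask` and `checkTaskSet` are hypothetical helper names, and the rules that `test_count` must equal the length of `test_cases` and that `tdd_cycles` must be non-empty are assumptions inferred from the quantification requirements rather than something the command file states.

```javascript
// Hypothetical validator for the TDD quality gates described in
// task-generate-tdd.md. Field names follow the documented task JSON;
// the helper names and the exact checks are illustrative assumptions.
const TDD_PHASES = ["red", "green", "refactor"];
const MAX_TDD_TASKS = 10; // "Maximum 10 tasks" hard limit from the doc

function checkTddTask(task) {
  const id = task?.id ?? "<unknown>";
  const problems = [];

  // Gate: each task has meta.tdd_workflow: true
  if (task?.meta?.tdd_workflow !== true) {
    problems.push(`${id}: meta.tdd_workflow must be true`);
  }

  // Gate: exactly 3 implementation steps, tagged red, green, refactor in order
  const rawSteps = task?.flow_control?.implementation_approach;
  const steps = Array.isArray(rawSteps) ? rawSteps : [];
  const phases = steps.map((step) => step?.tdd_phase);
  if (phases.length !== 3 || phases.some((p, i) => p !== TDD_PHASES[i])) {
    problems.push(`${id}: expected exactly 3 steps tagged red, green, refactor`);
  }

  // Gate: every cycle carries a quantified test list.
  // Assumption: an empty tdd_cycles array counts as a violation.
  const rawCycles = task?.context?.tdd_cycles;
  const cycles = Array.isArray(rawCycles) ? rawCycles : [];
  if (cycles.length === 0) {
    problems.push(`${id}: context.tdd_cycles must be a non-empty array`);
  }
  for (const [i, cycle] of cycles.entries()) {
    if (!Number.isInteger(cycle?.test_count) || !Array.isArray(cycle?.test_cases)) {
      problems.push(`${id}: tdd_cycles[${i}] needs test_count and test_cases`);
    } else if (cycle.test_count !== cycle.test_cases.length) {
      // Assumption: the count must match the enumerated list
      problems.push(`${id}: tdd_cycles[${i}] test_count does not match test_cases`);
    }
  }
  return problems;
}

function checkTaskSet(tasks) {
  const problems = tasks.flatMap(checkTddTask);
  if (tasks.length > MAX_TDD_TASKS) {
    problems.push(`task count ${tasks.length} exceeds limit of ${MAX_TDD_TASKS}`);
  }
  return problems;
}

// Usage: validate one well-formed task (values are made up for illustration)
const exampleTask = {
  id: "IMPL-1",
  meta: { tdd_workflow: true, max_iterations: 3 },
  context: {
    tdd_cycles: [{ test_count: 2, test_cases: ["valid login", "wrong password"] }],
  },
  flow_control: {
    implementation_approach: [
      { step: 1, tdd_phase: "red" },
      { step: 2, tdd_phase: "green" },
      { step: 3, tdd_phase: "refactor" },
    ],
  },
};
console.log(checkTaskSet([exampleTask]));
```

Returning a list of violation strings instead of throwing on the first failure lets a caller report every broken gate in one pass, which mirrors the checklist style the command file uses.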
--- a/.claude/commands/workflow/tools/test-task-generate.md
+++ b/.claude/commands/workflow/tools/test-task-generate.md
@@ -1,6 +1,6 @@
 ---
 name: test-task-generate
-description: Generate test-fix task JSON with iterative test-fix-retest cycle specification using Gemini/Qwen/Codex
+description: Autonomous test-fix task generation using action-planning-agent with test-fix-retest cycle specification and discovery phase
 argument-hint: "[--use-codex] [--cli-execute] --session WFS-test-session-id"
 examples:
  - /workflow:tools:test-task-generate --session WFS-test-auth
@@ -9,10 +9,23 @@ examples:
  - /workflow:tools:test-task-generate --cli-execute --use-codex --session WFS-test-auth
 ---

-# Test Task Generation Command
+# Autonomous Test Task Generation Command

 ## Overview
-Generate specialized test-fix task JSON with comprehensive test-fix-retest cycle specification, including Gemini diagnosis (using bug-fix template) and manual fix workflow (Codex automation only when explicitly requested).
+Autonomous test-fix task JSON generation using action-planning-agent with two-phase execution: discovery and document generation. Supports both agent-driven execution (default) and CLI tool execution modes. Generates specialized test-fix tasks with comprehensive test-fix-retest cycle specification.
+
+## Core Philosophy
+- **Agent-Driven**: Delegate execution to action-planning-agent for autonomous operation
+- **Two-Phase Flow**: Discovery (context gathering) → Output (document generation)
+- **Memory-First**: Reuse loaded documents from conversation memory
+- **MCP-Enhanced**: Use MCP tools for advanced code analysis and test research
+- **Pre-Selected Templates**: Command selects correct test template based on `--cli-execute` flag **before** invoking agent
+- **Agent Simplicity**: Agent receives pre-selected template and focuses only on content generation
+- **Path Clarity**: Prefer absolute paths in `focus_paths` (e.g., `D:\\project\\src\\module`); otherwise use unambiguous paths relative to the project root
+- **Test-First**: Generate comprehensive test coverage before execution
+- **Iterative Refinement**: Test-fix-retest cycle until all tests pass
+- **Surgical Fixes**: Minimal code changes, no refactoring during test fixes
+- **Auto-Revert**: Rollback all changes if max iterations reached

 ## Execution Modes

@@ -24,583 +37,278 @@ Generate specialized test-fix task JSON with comprehensive test-fix-retest cycle
 - **Manual Mode (Default)**: Gemini diagnosis → user applies fixes
 - **Codex Mode (`--use-codex`)**: Gemini diagnosis → Codex applies fixes with resume mechanism

-## Core Philosophy
- **Analysis-Driven Test Generation**: Use TEST_ANALYSIS_RESULTS.md from test-concept-enhanced
- **Agent-Based Test Creation**: Call @code-developer agent for comprehensive test generation
- **Coverage-First**: Generate all missing tests before execution
- **Test Execution**: Execute complete test suite after generation
- **Gemini Diagnosis**: Use Gemini for root cause analysis and fix suggestions (references bug-fix template)
- **Manual Fixes First**: Apply fixes manually by default, codex only when explicitly needed
- **Iterative Refinement**: Repeat test-analyze-fix-retest cycle until all tests pass
- **Surgical Fixes**: Minimal code changes, no refactoring during test fixes
- **Auto-Revert**: Rollback all changes if max iterations reached
-
-## Core Responsibilities
- Parse TEST_ANALYSIS_RESULTS.md from test-concept-enhanced
- Extract test requirements and generation strategy
- Parse `--use-codex` flag to determine fix mode (manual vs automated)
- Generate test generation subtask calling @code-developer
- Generate test execution and fix cycle task JSON with appropriate fix mode
- Configure Gemini diagnosis workflow (bug-fix template) and manual/Codex fix application
- Create test-oriented IMPL_PLAN.md and TODO_LIST.md with test generation phase
-
 ## Execution Lifecycle

-### Phase 1: Input Validation & Discovery
+### Phase 1: Discovery & Context Loading
+**⚡ Memory-First Rule**: Skip file loading if documents already in conversation memory

-1. **Parameter Parsing**
-   - Parse `--use-codex` flag from command arguments → Controls IMPL-002 fix mode
-   - Parse `--cli-execute` flag from command arguments → Controls IMPL-001 generation mode
-   - Store flag values for task JSON generation
-
-2. **Test Session Validation**
-   - Load `.workflow/{test-session-id}/workflow-session.json`
-   - Verify `workflow_type: "test_session"`
-   - Extract `source_session_id` from metadata
-
-3. **Test Analysis Results Loading**
-   - **REQUIRED**: Load `.workflow/{test-session-id}/.process/TEST_ANALYSIS_RESULTS.md`
-   - Parse test requirements by file
-   - Extract test generation strategy
-   - Identify test files to create with specifications
-
-4. **Test Context Package Loading**
-   - Load `.workflow/{test-session-id}/.process/test-context-package.json`
-   - Extract test framework configuration
-   - Extract coverage gaps and priorities
-   - Load source session implementation summaries
-
-### Phase 2: Task JSON Generation
-
-Generate **TWO task JSON files**:
-1. **IMPL-001.json** - Test Generation (calls @code-developer)
-2. **IMPL-002.json** - Test Execution and Fix Cycle (calls @test-fix-agent)
-
-#### IMPL-001.json - Test Generation Task
-
-```json
+**Agent Context Package**:
+```javascript
 {
-  "id": "IMPL-001",
-  "title": "Generate comprehensive tests for [sourceSessionId]",
-  "status": "pending",
-  "meta": {
-    "type": "test-gen",
-    "agent": "@code-developer",
-    "source_session": "[sourceSessionId]",
-    "test_framework": "jest|pytest|cargo|detected"
+  "session_id": "WFS-test-[session-id]",
+  "execution_mode": "agent-mode" | "cli-execute-mode",  // Determined by flag
+  "task_json_template_path": "~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt"
+                           | "~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt",
+  // Path selected by command based on --cli-execute flag, agent reads it
+  "workflow_type": "test_session",
+  "use_codex": true | false,  // Determined by --use-codex flag
+  "session_metadata": {
+    // If in memory: use cached content
+    // Else: Load from .workflow/{test-session-id}/workflow-session.json
  },
-  "context": {
-    "requirements": [
-      "Generate comprehensive test files based on TEST_ANALYSIS_RESULTS.md",
-      "Follow existing test patterns and conventions from test framework",
-      "Create tests for all missing coverage identified in analysis",
-      "Include happy path, error handling, edge cases, and integration tests",
-      "Use test data and mocks as specified in analysis",
-      "Ensure tests follow project coding standards"
-    ],
-    "focus_paths": [
-      "tests/**/*",
-      "src/**/*.test.*",
-      "{paths_from_analysis}"
-    ],
-    "acceptance": [
-      "All test files from TEST_ANALYSIS_RESULTS.md section 5 are created",
-      "Tests follow existing test patterns and conventions",
-      "Test scenarios cover happy path, errors, edge cases, integration",
-      "All dependencies are properly mocked",
-      "Test files are syntactically valid and can be executed",
-      "Test coverage meets analysis requirements"
-    ],
-    "depends_on": [],
-    "source_context": {
-      "session_id": "[sourceSessionId]",
-      "test_analysis": ".workflow/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md",
-      "test_context": ".workflow/[testSessionId]/.process/test-context-package.json",
-      "implementation_summaries": [
-        ".workflow/[sourceSessionId]/.summaries/IMPL-001-summary.md"
-      ]
-    }
+  "test_analysis_results_path": ".workflow/{test-session-id}/.process/TEST_ANALYSIS_RESULTS.md",
+  "test_analysis_results": {
+    // If in memory: use cached content
+    // Else: Load from TEST_ANALYSIS_RESULTS.md
  },
-  "flow_control": {
-    "pre_analysis": [
-      {
-        "step": "load_test_analysis",
-        "action": "Load test generation requirements and strategy",
-        "commands": [
-          "Read(.workflow/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md)",
-          "Read(.workflow/[testSessionId]/.process/test-context-package.json)"
-        ],
-        "output_to": "test_generation_requirements",
-        "on_error": "fail"
-      },
-      {
-        "step": "load_implementation_context",
-        "action": "Load source implementation for test generation context",
-        "commands": [
-          "bash(for f in .workflow/[sourceSessionId]/.summaries/IMPL-*-summary.md; do echo \"=== $(basename $f) ===\"&& cat \"$f\"; done)"
-        ],
-        "output_to": "implementation_context",
-        "on_error": "skip_optional"
-      },
-      {
-        "step": "load_existing_test_patterns",
-        "action": "Study existing tests for pattern reference",
-        "commands": [
-          "bash(find . -name \"*.test.*\" -type f)",
-          "bash(# Read first 2 existing test files as examples)",
-          "bash(test_files=$(find . -name \"*.test.*\" -type f | head -2))",
-          "bash(for f in $test_files; do echo \"=== $f ===\"&& cat \"$f\"; done)"
-        ],
-        "output_to": "existing_test_patterns",
-        "on_error": "skip_optional"
-      }
-    ],
-    // Agent Mode (Default): Agent implements tests
-    "implementation_approach": [
-      {
-        "step": 1,
-        "title": "Generate comprehensive test suite",
-        "description": "Generate comprehensive test suite based on TEST_ANALYSIS_RESULTS.md. Follow test generation strategy and create all test files listed in section 5 (Implementation Targets).",
-        "modification_points": [
-          "Read TEST_ANALYSIS_RESULTS.md sections 3 and 4",
-          "Study existing test patterns",
-          "Create test files with all required scenarios",
-          "Implement happy path, error handling, edge case, and integration tests",
-          "Add required mocks and fixtures"
-        ],
-        "logic_flow": [
-          "Read TEST_ANALYSIS_RESULTS.md section 3 (Test Requirements by File)",
-          "Read TEST_ANALYSIS_RESULTS.md section 4 (Test Generation Strategy)",
-          "Study existing test patterns from test_context.test_framework.conventions",
-          "For each test file in section 5 (Implementation Targets): Create test file with specified scenarios, Implement happy path tests, Implement error handling tests, Implement edge case tests, Implement integration tests (if specified), Add required mocks and fixtures",
-          "Follow test framework conventions and project standards",
-          "Ensure all tests are executable and syntactically valid"
-        ],
-        "depends_on": [],
-        "output": "test_suite"
-      }
-    ],
-
-    // CLI Execute Mode (--cli-execute): Use Codex command (alternative format shown below)
-    "implementation_approach": [{
-      "step": 1,
-      "title": "Generate tests using Codex",
-      "description": "Use Codex CLI to autonomously generate comprehensive test suite based on TEST_ANALYSIS_RESULTS.md",
-      "modification_points": [
-        "Codex loads TEST_ANALYSIS_RESULTS.md and existing test patterns",
-        "Codex generates all test files listed in analysis section 5",
-        "Codex ensures tests follow framework conventions"
-      ],
-      "logic_flow": [
-        "Start new Codex session",
-        "Pass TEST_ANALYSIS_RESULTS.md to Codex",
-        "Codex studies existing test patterns",
-        "Codex generates comprehensive test suite",
-        "Codex validates test syntax and executability"
-      ],
-      "command": "bash(codex -C [focus_paths] --full-auto exec \"PURPOSE: Generate comprehensive test suite TASK: Create test files based on TEST_ANALYSIS_RESULTS.md section 5 MODE: write CONTEXT: @.workflow/WFS-test-[session]/.process/TEST_ANALYSIS_RESULTS.md @.workflow/WFS-test-[session]/.process/test-context-package.json EXPECTED: All test files with happy path, error handling, edge cases, integration tests RULES: Follow test framework conventions, ensure tests are executable\" --skip-git-repo-check -s danger-full-access)",
-      "depends_on": [],
-      "output": "test_generation"
-    }],
-    "target_files": [
-      "{test_file_1 from TEST_ANALYSIS_RESULTS.md section 5}",
-      "{test_file_2 from TEST_ANALYSIS_RESULTS.md section 5}",
-      "{test_file_N from TEST_ANALYSIS_RESULTS.md section 5}"
-    ]
+  "test_context_package_path": ".workflow/{test-session-id}/.process/test-context-package.json",
+  "test_context_package": {
+    // Existing test patterns and coverage analysis
+  },
+  "source_session_id": "[source-session-id]",  // if exists
+  "source_session_summaries": {
+    // Implementation context from source session
+  },
+  "mcp_capabilities": {
+    "code_index": true,
+    "exa_code": true,
+    "exa_web": true
  }
 }
 ```

-#### IMPL-002.json - Test Execution & Fix Cycle Task
+**Discovery Actions**:
+1. **Load Test Session Context** (if not in memory)
+   ```javascript
+   if (!memory.has("workflow-session.json")) {
+     Read(".workflow/{test-session-id}/workflow-session.json")
+   }
+   ```

-```json
-{
-  "id": "IMPL-002",
-  "title": "Execute and fix tests for [sourceSessionId]",
-  "status": "pending",
-  "meta": {
-    "type": "test-fix",
-    "agent": "@test-fix-agent",
-    "source_session": "[sourceSessionId]",
-    "test_framework": "jest|pytest|cargo|detected",
-    "max_iterations": 5,
-    "use_codex": false  // Set to true if --use-codex flag present
-  },
-  "context": {
-    "requirements": [
-      "Execute complete test suite (generated in IMPL-001)",
-      "Diagnose test failures using Gemini analysis with bug-fix template",
-      "Present fixes to user for manual application (default)",
-      "Use Codex ONLY if user explicitly requests automation",
-      "Iterate until all tests pass or max iterations reached",
-      "Revert changes if unable to fix within iteration limit"
-    ],
-    "focus_paths": [
-      "tests/**/*",
-      "src/**/*.test.*",
-      "{implementation_files_from_source_session}"
-    ],
-    "acceptance": [
-      "All tests pass successfully (100% pass rate)",
-      "No test failures or errors in final run",
-      "Code changes are minimal and surgical",
-      "All fixes are verified through retest",
-      "Iteration logs document fix progression"
-    ],
-    "depends_on": ["IMPL-001"],
-    "source_context": {
-      "session_id": "[sourceSessionId]",
-      "test_generation_summary": ".workflow/[testSessionId]/.summaries/IMPL-001-summary.md",
-      "implementation_summaries": [
-        ".workflow/[sourceSessionId]/.summaries/IMPL-001-summary.md"
-      ]
-    }
-  },
-  "flow_control": {
-    "pre_analysis": [
-      {
-        "step": "load_source_session_summaries",
-        "action": "Load implementation context from source session",
-        "commands": [
-          "bash(find .workflow/[sourceSessionId]/.summaries/ -name 'IMPL-*-summary.md' 2>/dev/null)",
-          "bash(for f in .workflow/[sourceSessionId]/.summaries/IMPL-*-summary.md; do echo \"=== $(basename $f) ===\"&& cat \"$f\"; done)"
-        ],
-        "output_to": "implementation_context",
-        "on_error": "skip_optional"
-      },
-      {
-        "step": "discover_test_framework",
-        "action": "Identify test framework and test command",
-        "commands": [
-          "bash(jq -r '.scripts.test // \"npm test\"' package.json 2>/dev/null || echo 'pytest' || echo 'cargo test')",
-          "bash([ -f 'package.json' ] && echo 'jest/npm' || [ -f 'pytest.ini' ] && echo 'pytest' || [ -f 'Cargo.toml' ] && echo 'cargo' || echo 'unknown')"
-        ],
-        "output_to": "test_command",
-        "on_error": "fail"
-      },
-      {
-        "step": "analyze_test_coverage",
-        "action": "Analyze test coverage and identify missing tests",
-        "commands": [
-          "bash(find . -name \"*.test.*\" -type f)",
-          "bash(rg \"test|describe|it|def test_\" -g \"*.test.*\")",
-          "bash(# Count implementation files vs test files)",
-          "bash(impl_count=$(find [changed_files_dirs] -type f \\( -name '*.ts' -o -name '*.js' -o -name '*.py' \\) ! -name '*.test.*' 2>/dev/null | wc -l))",
-          "bash(test_count=$(find . -name \"*.test.*\" -type f | wc -l))",
-          "bash(echo \"Implementation files: $impl_count, Test files: $test_count\")"
-        ],
-        "output_to": "test_coverage_analysis",
-        "on_error": "skip_optional"
-      },
-      {
-        "step": "identify_files_without_tests",
-        "action": "List implementation files that lack corresponding test files",
-        "commands": [
-          "bash(# For each changed file from source session, check if test exists)",
-          "bash(for file in [changed_files]; do test_file=$(echo $file | sed 's/\\(.*\\)\\.\\(ts\\|js\\|py\\)$/\\1.test.\\2/'); [ ! -f \"$test_file\" ] && echo \"$file\"; done)"
-        ],
-        "output_to": "files_without_tests",
-        "on_error": "skip_optional"
-      },
-      {
-        "step": "prepare_test_environment",
-        "action": "Ensure test environment is ready",
-        "commands": [
-          "bash([ -f 'package.json' ] && npm install 2>/dev/null || true)",
-          "bash([ -f 'requirements.txt' ] && pip install -q -r requirements.txt 2>/dev/null || true)"
-        ],
-        "output_to": "environment_status",
-        "on_error": "skip_optional"
-      }
-    ],
-    "implementation_approach": [
-      {
-        "step": 1,
-        "title": "Execute iterative test-fix-retest cycle",
-        "description": "Execute iterative test-fix-retest cycle using Gemini diagnosis (bug-fix template) and manual fixes (Codex only if meta.use_codex=true). Max 5 iterations with automatic revert on failure.",
-        "test_fix_cycle": {
-          "max_iterations": 5,
-          "cycle_pattern": "test → gemini_diagnose → manual_fix (or codex if needed) → retest",
-          "tools": {
-            "test_execution": "bash(test_command)",
-            "diagnosis": "gemini (MODE: analysis, uses bug-fix template)",
-            "fix_application": "manual (default) or codex exec resume --last (if explicitly needed)",
-            "verification": "bash(test_command) + regression_check"
-          },
-          "exit_conditions": {
-            "success": "all_tests_pass",
-            "failure": "max_iterations_reached",
-            "error": "test_command_not_found"
-          }
-        },
-        "modification_points": [
-        "PHASE 1: Initial Test Execution",
-        "  1.1. Discover test command from framework detection",
-        "  1.2. Execute initial test run: bash([test_command])",
-        "  1.3. Parse test output and count failures",
-        "  1.4. If all pass → Skip to PHASE 3 (success)",
-        "  1.5. If failures → Store failure output, proceed to PHASE 2",
-        "",
-        "PHASE 2: Iterative Test-Fix-Retest Cycle (max 5 iterations)",
-        "  Note: This phase handles test failures, NOT test generation failures",
-        "  Initialize: max_iterations=5, current_iteration=0",
-        "  ",
-        "  WHILE (tests failing AND current_iteration < max_iterations):",
-        "    current_iteration++",
-        "    ",
-        "    STEP 2.1: Gemini Diagnosis (using bug-fix template)",
-        "    - Prepare diagnosis context:",
-        "      * Test failure output from previous run",
-        "      * Source files from focus_paths",
-        "      * Implementation summaries from source session",
-        "    - Execute Gemini analysis with bug-fix template:",
-        "      bash(cd .workflow/WFS-test-[session]/.process && gemini \"",
-        "      PURPOSE: Diagnose test failure iteration [N] and propose minimal fix",
-        "      TASK: Systematic bug analysis and fix recommendations for test failure",
-        "      MODE: analysis",
-        "      CONTEXT: @CLAUDE.md,**/*CLAUDE.md",
-        "               Test output: [test_failures]",
-        "               Source files: [focus_paths]",
-        "               Implementation: [implementation_context]",
-        "      EXPECTED: Root cause analysis, code path tracing, targeted fixes",
-        "      RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Bug: [test_failure_description]",
-        "             Minimal surgical fixes only - no refactoring",
-        "      \" > fix-iteration-[N]-diagnosis.md)",
-        "    - Parse diagnosis → extract fix_suggestion and target_files",
-        "    - Present fix to user for manual application (default)",
-        "    ",
-        "    STEP 2.2: Apply Fix (Based on meta.use_codex Flag)",
-        "    ",
-        "    IF meta.use_codex = false (DEFAULT):",
-        "    - Present Gemini diagnosis to user for manual fix",
-        "    - User applies fix based on diagnosis recommendations",
-        "    - Stage changes: bash(git add -A)",
-        "    - Store fix log: .process/fix-iteration-[N]-changes.log",
-        "    ",
-        "    IF meta.use_codex = true (--use-codex flag present):",
-        "    - Stage current changes (if valid git repo): bash(git add -A)",
-        "    - First iteration: Start new Codex session",
-        "      codex -C [project_root] --full-auto exec \"",
-        "      PURPOSE: Fix test failure iteration 1",
-        "      TASK: [fix_suggestion from Gemini]",
-        "      MODE: write",
-        "      CONTEXT: Diagnosis: .workflow/.process/fix-iteration-1-diagnosis.md",
-        "               Target files: [target_files]",
-        "               Implementation context: [implementation_context]",
-        "      EXPECTED: Minimal code changes to resolve test failure",
-        "      RULES: Apply ONLY suggested changes, no refactoring",
-        "             Preserve existing code style",
-        "      \" --skip-git-repo-check -s danger-full-access",
-        "    - Subsequent iterations: Resume session for context continuity",
-        "      codex exec \"",
-        "      CONTINUE TO NEXT FIX:",
-        "      Iteration [N] of 5: Fix test failure",
-        "      ",
-        "      PURPOSE: Fix remaining test failures",
-        "      TASK: [fix_suggestion from Gemini iteration N]",
-        "      CONTEXT: Previous fixes applied, diagnosis: .process/fix-iteration-[N]-diagnosis.md",
-        "      EXPECTED: Surgical fix for current failure",
-        "      RULES: Build on previous fixes, maintain consistency",
-        "      \" resume --last --skip-git-repo-check -s danger-full-access",
-        "    - Store fix log: .process/fix-iteration-[N]-changes.log",
-        "    ",
-        "    STEP 2.3: Retest and Verification",
-        "    - Re-execute test suite: bash([test_command])",
-        "    - Capture output: .process/fix-iteration-[N]-retest.log",
-        "    - Count failures: bash(grep -c 'FAIL\\|ERROR' .process/fix-iteration-[N]-retest.log)",
-        "    - Check for regression:",
-        "      IF new_failures > previous_failures:",
-        "        WARN: Regression detected",
-        "        Include in next Gemini diagnosis context",
-        "    - Analyze results:",
-        "      IF all_tests_pass:",
-        "        BREAK loop → Proceed to PHASE 3",
-        "      ELSE:",
-        "        Update test_failures context",
-        "        CONTINUE loop",
-        "  ",
-        "  IF max_iterations reached AND tests still failing:",
-        "    EXECUTE: git reset --hard HEAD (revert all changes)",
-        "    MARK: Task status = blocked",
-        "    GENERATE: Detailed failure report with iteration logs",
-        "    EXIT: Require manual intervention",
-        "",
-        "PHASE 3: Final Validation and Certification",
-        "  3.1. Execute final confirmation test run",
-        "  3.2. Generate success summary:",
-        "       - Iterations required: [current_iteration]",
-        "       - Fixes applied: [summary from iteration logs]",
-        "       - Test results: All passing ✅",
-        "  3.3. Mark task status: completed",
-        "  3.4. Update TODO_LIST.md: Mark as ✅",
-        "  3.5. Certify code: APPROVED for deployment"
-      ],
-      "logic_flow": [
-        "Load source session implementation context",
-        "Discover test framework and command",
-        "PHASE 0: Test Coverage Check",
-        "  Analyze existing test files",
-        "  Identify files without tests",
-        "  IF tests missing:",
-        "    Report to user (no automatic generation)",
-        "    Wait for user to generate tests or request automation",
-        "  ELSE:",
-        "    Skip to Phase 1",
-        "PHASE 1: Initial Test Execution",
-        "  Execute test suite",
-        "  IF all pass → Success (Phase 3)",
-        "  ELSE → Store failures, proceed to Phase 2",
-        "PHASE 2: Iterative Fix Cycle (max 5 iterations)",
-        "  LOOP (max 5 times):",
-        "    1. Gemini diagnoses failure with bug-fix template → fix suggestion",
-        "    2. Check meta.use_codex flag:",
-        "       - IF false (default): Present fix to user for manual application",
-        "       - IF true (--use-codex): Codex applies fix with resume for continuity",
-        "    3. Retest and check results",
-        "    4. IF pass → Exit loop to Phase 3",
-        "    5. ELSE → Continue with updated context",
-        "  IF max iterations → Revert + report failure",
-        "PHASE 3: Final Validation",
-        "  Confirm all tests pass",
-        "  Generate summary (include test generation info)",
-        "  Certify code APPROVED"
-      ],
-        "error_handling": {
-          "max_iterations_reached": {
-            "action": "revert_all_changes",
-            "commands": [
-              "bash(git reset --hard HEAD)",
-              "bash(jq '.status = \"blocked\"' .workflow/[session]/.task/IMPL-001.json > temp.json && mv temp.json .workflow/[session]/.task/IMPL-001.json)"
-            ],
-            "report": "Generate failure report with iteration logs in .summaries/IMPL-001-failure-report.md"
-          },
-          "test_command_fails": {
-            "action": "treat_as_test_failure",
-            "context": "Use stderr as failure context for Gemini diagnosis"
-          },
-          "codex_apply_fails": {
-            "action": "retry_once_then_skip",
-            "fallback": "Mark iteration as skipped, continue to next"
-          },
-          "gemini_diagnosis_fails": {
-            "action": "retry_with_simplified_context",
-            "fallback": "Use previous diagnosis, continue"
-          },
-          "regression_detected": {
-            "action": "log_warning_continue",
-            "context": "Include regression info in next Gemini diagnosis"
-          }
-        },
-        "depends_on": [],
-        "output": "test_fix_results"
-      }
-    ],
-    "target_files": [
-      "Auto-discovered from test failures",
-      "Extracted from Gemini diagnosis each iteration",
-      "Format: file:function:lines or file (for new files)"
-    ],
-    "codex_session": {
-      "strategy": "resume_for_continuity",
-      "first_iteration": "codex exec \"fix iteration 1\" --full-auto",
-      "subsequent_iterations": "codex exec \"fix iteration N\" resume --last",
-      "benefits": [
-        "Maintains conversation context across fixes",
-        "Remembers previous decisions and patterns",
-        "Ensures consistency in fix approach",
-        "Reduces redundant context injection"
-      ]
-    }
-  }
+2. **Load TEST_ANALYSIS_RESULTS.md** (if not in memory, REQUIRED)
+   ```javascript
+   if (!memory.has("TEST_ANALYSIS_RESULTS.md")) {
+     Read(".workflow/{test-session-id}/.process/TEST_ANALYSIS_RESULTS.md")
+   }
+   ```
+
+3. **Load Test Context Package** (if not in memory)
+   ```javascript
+   if (!memory.has("test-context-package.json")) {
+     Read(".workflow/{test-session-id}/.process/test-context-package.json")
+   }
+   ```
+
+4. **Load Source Session Summaries** (if source_session_id exists)
+   ```javascript
+   if (sessionMetadata.source_session_id) {
+     const summaryFiles = Bash("find .workflow/{source-session-id}/.summaries/ -name 'IMPL-*-summary.md'").split("\n").filter(Boolean)
+     summaryFiles.forEach(file => Read(file))
+   }
+   ```
+
+5. **Code Analysis with Native Tools** (optional - enhance understanding)
+   ```bash
+   # Find test files and patterns
+   find . -name "*test*" -type f
+   rg "describe|it\(|test\(" -g "*.ts"
+   ```
+
+6. **MCP External Research** (optional - gather test best practices)
+   ```javascript
+   // Get external test examples and patterns
+   mcp__exa__get_code_context_exa(
+     query="TypeScript test generation best practices jest",
+     tokensNum="dynamic"
+   )
+   ```
+
+### Phase 2: Agent Execution (Document Generation)
+
+**Pre-Agent Template Selection** (Command decides path before invoking agent):
+```javascript
+// Command checks flag and selects template PATH (not content)
+const templatePath = hasCliExecuteFlag
+  ? "~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt"
+  : "~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt";
+```
+
+**Agent Invocation**:
+```javascript
+Task(
+  subagent_type="action-planning-agent",
+  description="Generate test-fix task JSON and implementation plan",
+  prompt=`
+## Execution Context
+
+**Session ID**: WFS-test-{session-id}
+**Workflow Type**: Test Session
+**Execution Mode**: {agent-mode | cli-execute-mode}
+**Task JSON Template Path**: {template_path}
+**Use Codex**: {true | false}
+
+## Phase 1: Discovery Results (Provided Context)
+
+### Test Session Metadata
+{session_metadata_content}
+- source_session_id: {source_session_id} (if exists)
+- workflow_type: "test_session"
+
+### TEST_ANALYSIS_RESULTS.md (REQUIRED)
+{test_analysis_results_content}
+- Coverage Assessment
+- Test Framework & Conventions
+- Test Requirements by File
+- Test Generation Strategy
+- Implementation Targets
+- Success Criteria
+
+### Test Context Package
+{test_context_package_summary}
+- Existing test patterns, framework config, coverage analysis
+
+### Source Session Implementation Context (Optional)
+{source_session_summaries}
+- Implementation context from completed session
+
+### MCP Analysis Results (Optional)
+**Code Structure**: {mcp_code_index_results}
+**External Research**: {mcp_exa_research_results}
+
+## Phase 2: Test Task Document Generation
+
+**Agent Configuration Reference**: All test task generation rules, test-fix cycle structure, quality standards, and execution details are defined in action-planning-agent.
+
+Refer to: @.claude/agents/action-planning-agent.md for:
+- Test Task Decomposition Standards
+- Test-Fix-Retest Cycle Requirements
+- 5-Field Task JSON Schema
+- IMPL_PLAN.md Structure (Test variant)
+- TODO_LIST.md Format
+- Test Execution Flow & Quality Validation
+
+### Test-Specific Requirements Summary
+
+#### Task Structure Philosophy
+- **Minimum 2 tasks**: IMPL-001 (test generation) + IMPL-002 (test execution & fix)
+- **Expandable**: Add IMPL-003+ for complex projects (per-module, integration, etc.)
+- IMPL-001: Uses @code-developer or CLI execution
+- IMPL-002: Uses @test-fix-agent with iterative fix cycle
+
+#### Test-Fix Cycle Configuration
+- **Max Iterations**: 5 (for IMPL-002)
+- **Diagnosis Tool**: Gemini with bug-fix template
+- **Fix Application**: Manual (default) or Codex (if --use-codex flag)
+- **Cycle Pattern**: test → gemini_diagnose → manual_fix (or codex) → retest
+- **Exit Conditions**: All tests pass OR max iterations reached (auto-revert)
+
+#### Required Outputs Summary
+
+##### 1. Test Task JSON Files (.task/IMPL-*.json)
+- **Location**: `.workflow/{test-session-id}/.task/`
+- **Template**: Read from `{template_path}` (pre-selected by command based on `--cli-execute` flag)
+- **Schema**: 5-field structure with test-specific metadata
+  - IMPL-001: `meta.type: "test-gen"`, `meta.agent: "@code-developer"`
+  - IMPL-002: `meta.type: "test-fix"`, `meta.agent: "@test-fix-agent"`, `meta.use_codex: {use_codex}`
+  - `flow_control`: Test generation approach (IMPL-001) or test-fix cycle (IMPL-002)
+- **Details**: See action-planning-agent.md § Test Task JSON Generation
+
+##### 2. IMPL_PLAN.md (Test Variant)
+- **Location**: `.workflow/{test-session-id}/IMPL_PLAN.md`
+- **Template**: `~/.claude/workflows/cli-templates/prompts/workflow/impl-plan-template.txt`
+- **Test-Specific Frontmatter**: workflow_type="test_session", test_framework, source_session_id
+- **Test-Fix-Retest Cycle Section**: Iterative fix cycle with Gemini diagnosis
+- **Details**: See action-planning-agent.md § Test Implementation Plan Creation
+
+##### 3. TODO_LIST.md
+- **Location**: `.workflow/{test-session-id}/TODO_LIST.md`
+- **Format**: Task list with test generation and execution phases
+- **Status**: [ ] (pending), [x] (completed)
+- **Details**: See action-planning-agent.md § TODO List Generation
+
+### Agent Execution Summary
+
+**Key Steps** (Detailed instructions in action-planning-agent.md):
+1. Load task JSON template from provided path
+2. Parse TEST_ANALYSIS_RESULTS.md for test requirements
+3. Generate IMPL-001 (test generation) task JSON
+4. Generate IMPL-002 (test execution & fix) task JSON with use_codex flag
+5. Generate additional IMPL-*.json if project complexity requires
+6. Create IMPL_PLAN.md using test template variant
+7. Generate TODO_LIST.md with test task indicators
+8. Update session state with test metadata
+
+**Quality Gates** (Full checklist in action-planning-agent.md):
+- ✓ Minimum 2 tasks created (IMPL-001 + IMPL-002)
+- ✓ IMPL-001 has test generation approach from TEST_ANALYSIS_RESULTS.md
+- ✓ IMPL-002 has test-fix cycle with correct use_codex flag
+- ✓ Test framework configuration integrated
+- ✓ Source session context referenced (if exists)
+- ✓ MCP tool integration added
+- ✓ Documents follow test template structure
+
+## Output
+
+Generate all three documents and report completion status:
+- Test task JSON files created: N files (minimum 2)
+- Test requirements integrated: TEST_ANALYSIS_RESULTS.md
+- Test context integrated: existing patterns and coverage
+- Source session context: {source_session_id} summaries (if exists)
+- MCP enhancements: code-index, exa-research
+- Session ready for test execution: /workflow:execute or /workflow:test-cycle-execute
+`
+)
+```
+
+### Agent Context Passing
+
+**Memory-Aware Context Assembly**:
+```javascript
+// Assemble context package for agent
+const agentContext = {
+  session_id: "WFS-test-[id]",
+  workflow_type: "test_session",
+  use_codex: hasUseCodexFlag,
+
+  // Use memory if available, else load
+  session_metadata: memory.has("workflow-session.json")
+    ? memory.get("workflow-session.json")
+    : Read(".workflow/WFS-test-[id]/workflow-session.json"),
+
+  test_analysis_results_path: ".workflow/WFS-test-[id]/.process/TEST_ANALYSIS_RESULTS.md",
+
+  test_analysis_results: memory.has("TEST_ANALYSIS_RESULTS.md")
+    ? memory.get("TEST_ANALYSIS_RESULTS.md")
+    : Read(".workflow/WFS-test-[id]/.process/TEST_ANALYSIS_RESULTS.md"),
+
+  test_context_package_path: ".workflow/WFS-test-[id]/.process/test-context-package.json",
+
+  test_context_package: memory.has("test-context-package.json")
+    ? memory.get("test-context-package.json")
+    : Read(".workflow/WFS-test-[id]/.process/test-context-package.json"),
+
+  // Load source session summaries if a source session exists (sessionMetadata is the parsed workflow-session.json from Phase 1)
+  source_session_id: sessionMetadata.source_session_id || null,
+
+  source_session_summaries: sessionMetadata.source_session_id
+    ? loadSourceSummaries(sessionMetadata.source_session_id)
+    : null,
+
+  // Optional MCP enhancements
+  mcp_analysis: executeMcpDiscovery()
 }
 ```

-### Phase 3: IMPL_PLAN.md Generation
+## Test Task Structure Reference

-#### Document Structure
-```markdown
---
-identifier: WFS-test-[session-id]
-source_session: WFS-[source-session-id]
-workflow_type: test_session
-test_framework: jest|pytest|cargo|detected
---
+This section provides a quick reference for the test task JSON structure. For complete implementation details, see the agent invocation prompt in Phase 2 above.

-# Test Validation Plan: [Source Session Topic]
-
-## Summary
-Execute comprehensive test suite for implementation from session WFS-[source-session-id].
-Diagnose and fix all test failures using iterative Gemini analysis and Codex execution.
-
-## Source Session Context
- **Implementation Session**: WFS-[source-session-id]
- **Completed Tasks**: IMPL-001, IMPL-002, ...
- **Changed Files**: [list from git log]
- **Implementation Summaries**: [references to source session summaries]
-
-## Test Framework
- **Detected Framework**: jest|pytest|cargo|other
- **Test Command**: npm test|pytest|cargo test
- **Test Files**: [discovered test files]
- **Coverage**: [estimated test coverage]
-
-## Test-Fix-Retest Cycle
- **Max Iterations**: 5
- **Diagnosis Tool**: Gemini (analysis mode with bug-fix template from bug-index.md)
- **Fix Tool**: Manual (default, meta.use_codex=false) or Codex (if --use-codex flag, meta.use_codex=true)
- **Verification**: Bash test execution + regression check
-
-### Cycle Workflow
-1. **Initial Test**: Execute full suite, capture failures
-2. **Iterative Fix Loop** (max 5 times):
-   - Gemini diagnoses failure using bug-fix template → surgical fix suggestion
-   - Check meta.use_codex flag:
-     - If false (default): Present fix to user for manual application
-     - If true (--use-codex): Codex applies fix with resume for context continuity
-   - Retest and verify (check for regressions)
-   - Continue until all pass or max iterations reached
-3. **Final Validation**: Confirm all tests pass, certify code
-
-### Error Recovery
- **Max iterations reached**: Revert all changes, report failure
- **Test command fails**: Treat as test failure, diagnose with Gemini
- **Codex fails**: Retry once, skip iteration if still failing
- **Regression detected**: Log warning, include in next diagnosis
-
-## Task Breakdown
- **IMPL-001**: Execute and validate tests with iterative fix cycle
-
-## Implementation Strategy
- **Phase 1**: Initial test execution and failure capture
- **Phase 2**: Iterative Gemini diagnosis + Codex fix + retest
- **Phase 3**: Final validation and code certification
-
-## Success Criteria
- All tests pass (100% pass rate)
- No test failures or errors in final run
- Minimal, surgical code changes
- Iteration logs document fix progression
- Code certified APPROVED for deployment
-```
-
-### Phase 4: TODO_LIST.md Generation
-
-```markdown
-# Tasks: Test Validation for [Source Session]
-
-## Task Progress
- [ ] **IMPL-001**: Execute and validate tests with iterative fix cycle → [📋](./.task/IMPL-001.json)
-
-## Execution Details
- **Source Session**: WFS-[source-session-id]
- **Test Framework**: jest|pytest|cargo
- **Max Iterations**: 5
- **Tools**: Gemini diagnosis + Codex resume fixes
-
-## Status Legend
- `- [ ]` = Pending
- `- [x]` = Completed
-```
+**Quick Reference**:
+- Minimum 2 tasks: IMPL-001 (test-gen) + IMPL-002 (test-fix)
+- Expandable for complex projects (IMPL-003+)
+- IMPL-001: `meta.agent: "@code-developer"`, test generation approach
+- IMPL-002: `meta.agent: "@test-fix-agent"`, `meta.use_codex: {flag}`, test-fix cycle
+- See Phase 2 agent prompt for full schema and requirements

 ## Output Files Structure
 ```
@@ -648,29 +356,52 @@ Diagnose and fix all test failures using iterative Gemini analysis and Codex exe
 ## Integration & Usage

 ### Command Chain
- **Called By**: `/workflow:test-gen` (Phase 4)
- **Calls**: None (terminal command)
- **Followed By**: `/workflow:execute` (user-triggered)
+- **Called By**: `/workflow:test-gen` (Phase 4), `/workflow:test-fix-gen` (Phase 4)
+- **Invokes**: `action-planning-agent` for autonomous task generation
+- **Followed By**: `/workflow:execute` or `/workflow:test-cycle-execute` (user-triggered)

 ### Basic Usage
 ```bash
-# Manual fix mode (default)
+# Agent mode (default, autonomous execution)
 /workflow:tools:test-task-generate --session WFS-test-auth

-# Automated Codex fix mode
+# With automated Codex fixes for IMPL-002
 /workflow:tools:test-task-generate --use-codex --session WFS-test-auth
+
+# CLI execution mode for IMPL-001 test generation
+/workflow:tools:test-task-generate --cli-execute --session WFS-test-auth
+
+# Both flags combined
+/workflow:tools:test-task-generate --cli-execute --use-codex --session WFS-test-auth
 ```

+### Execution Modes
+- **Agent mode** (default): Uses `action-planning-agent` with agent-mode task template
+- **CLI mode** (`--cli-execute`): Uses Gemini/Qwen/Codex with cli-mode task template for IMPL-001
+- **Codex fixes** (`--use-codex`): Enables automated fixes in IMPL-002 task
+
 ### Flag Behavior
- **No flag**: `meta.use_codex=false`, manual fixes presented to user
- **--use-codex**: `meta.use_codex=true`, Codex automatically applies fixes with resume mechanism
+- **No flags**: `meta.use_codex=false` (manual fixes), agent-mode generation
+- **--use-codex**: `meta.use_codex=true` (Codex automated fixes with resume mechanism in IMPL-002)
+- **--cli-execute**: Uses CLI tool execution mode for IMPL-001 test generation
+- **Both flags**: CLI generation + automated Codex fixes
+
+### Output
+- Test task JSON files in `.task/` directory (minimum 2: IMPL-001.json + IMPL-002.json)
+- IMPL_PLAN.md with test generation and fix cycle strategy
+- TODO_LIST.md with test task indicators
+- Session state updated with test metadata
+- MCP enhancements integrated (if available)

 ## Related Commands
- `/workflow:test-gen` - Creates test session and calls this tool
- `/workflow:tools:context-gather` - Provides cross-session context
- `/workflow:tools:concept-enhanced` - Provides test strategy analysis
- `/workflow:execute` - Executes the generated test-fix task
- `@test-fix-agent` - Agent that executes the iterative test-fix cycle
+- `/workflow:test-gen` - Creates test session and calls this tool (Phase 4)
+- `/workflow:test-fix-gen` - Creates test-fix session and calls this tool (Phase 4)
+- `/workflow:tools:test-context-gather` - Gathers test coverage context
+- `/workflow:tools:test-concept-enhanced` - Generates test strategy analysis (TEST_ANALYSIS_RESULTS.md)
+- `/workflow:execute` - Executes the generated test-fix tasks
+- `/workflow:test-cycle-execute` - Executes test-fix cycle with iteration management
+- `@code-developer` - Agent that executes IMPL-001 (test generation)
+- `@test-fix-agent` - Agent that executes IMPL-002 (test execution & fix)

 ## Agent Execution Notes
Author	SHA1	Message	Date
catlog22	b62b42e9f4	feat: migrate test-task-generate to agent-driven architecture - Refactor test-task-generate.md to use action-planning-agent - Add two-phase execution flow (Discovery → Agent Execution) - Integrate Memory-First principle and MCP tool enhancements - Support both agent-mode (default) and cli-execute-mode - Add test-specific context package with TEST_ANALYSIS_RESULTS.md - Align with task-generate-agent.md architecture - Remove 556 lines of redundant old content (Phase 1-4 old structure) - Update test-gen.md and test-fix-gen.md to reflect agent-driven changes Changes include: - Core Philosophy with agent-driven principles - Agent Context Package structure for test sessions - Discovery Actions for test context loading - Agent Invocation with test-specific requirements - Test Task Structure Reference - Updated Integration & Usage sections - Enhanced Related Commands documentation Now test task generation uses autonomous agent planning with MCP enhancements for better test coverage analysis and generation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 17:23:55 +08:00
catlog22	52fce757f8	refactor: update tdd-plan to use --cli-execute parameter - Change --agent flag to --cli-execute for consistency - Make Agent Mode the default execution mode - Update CLI Mode to use --cli-execute flag - Align with agent-driven task generation architecture - Update Phase 5 command documentation - Update Related Commands section This completes the migration to agent-driven architecture for both /workflow:plan and /workflow:tdd-plan commands. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 17:17:09 +08:00
catlog22	c12f6b888a	feat: migrate task-generate-tdd to agent-driven architecture - Refactor task-generate-tdd.md to use action-planning-agent - Add two-phase execution flow (Discovery → Agent Execution) - Integrate Memory-First principle and MCP tool enhancements - Support both agent-mode (default) and cli-execute-mode - Change --agent flag to --cli-execute for consistency - Add TDD-specific context package with test coverage integration - Align with task-generate-agent.md architecture Now both /workflow:plan and /workflow:tdd-plan use agent-driven task generation for autonomous planning and execution. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 17:11:17 +08:00
catlog22	47667b8360	refactor: streamline task generation workflow architecture - Remove 150+ lines of duplicate content from task-generate-agent.md - Implement reference-based design following Content Uniqueness Rules - Simplify plan.md to use task-generate-agent exclusively - Remove --agent parameter (agent mode is now default) - Improve separation of concerns between command and agent layers Changes: - task-generate-agent.md: Replace detailed specs with references to action-planning-agent.md - plan.md: Remove task-generate command, unify on agent-driven approach 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 17:03:18 +08:00
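
The flag handling described in the updated `test-task-generate.md` ("Pre-Agent Template Selection" and "Flag Behavior") can be sketched in a few lines. This is only an illustration under stated assumptions: `resolveExecutionConfig` and the shape of its return value are hypothetical names that do not appear in the commit. The template paths, the mode names, and the `--cli-execute`, `--use-codex`, and `--session` flags are taken from the diff above.

```javascript
// Hypothetical helper (not part of the commit): maps the documented flags
// to the template path and meta.use_codex value the command passes to the agent.
const TEMPLATE_DIR = "~/.claude/workflows/cli-templates/prompts/workflow";

function resolveExecutionConfig(argv) {
  const cliExecute = argv.includes("--cli-execute");
  const useCodex = argv.includes("--use-codex");

  // --session takes the next argument as the session ID, if one is present.
  const sessionIndex = argv.indexOf("--session");
  const sessionId =
    sessionIndex !== -1 && sessionIndex + 1 < argv.length
      ? argv[sessionIndex + 1]
      : null;

  // The command selects the template PATH only; the agent reads the content.
  const templateFile = cliExecute
    ? "task-json-cli-mode.txt"
    : "task-json-agent-mode.txt";

  return {
    sessionId,
    executionMode: cliExecute ? "cli-execute-mode" : "agent-mode",
    templatePath: `${TEMPLATE_DIR}/${templateFile}`,
    useCodex, // becomes meta.use_codex on IMPL-002
  };
}

// Example: both flags combined, as in the "Basic Usage" section.
const config = resolveExecutionConfig([
  "--cli-execute",
  "--use-codex",
  "--session",
  "WFS-test-auth",
]);
console.log(config.executionMode, config.useCodex, config.templatePath);
```

With no flags the sketch falls back to agent mode with `useCodex` set to `false`, which matches the "No flags" row of the Flag Behavior list.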