refactor: Replace CLI execution flags with semantic-driven tool selection

- Remove --cli-execute flag from plan.md, tdd-plan.md, task-generate-agent.md, task-generate-tdd.md
- Remove --use-codex flag from test-gen.md, test-fix-gen.md, test-task-generate.md
- Remove meta.use_codex from task JSON schema in action-planning-agent.md and cli-planning-agent.md
- Add "Semantic CLI Tool Selection" section to action-planning-agent.md
- Document explicit source: metadata.task_description from context-package.json
- Update test-fix-agent.md execution mode documentation
- Update action-plan-verify.md to remove use_codex validation
- Sync SKILL reference copies via analyze_commands.py

CLI tool usage now determined semantically from user's task description (e.g., "use Codex for implementation") instead of explicit flags.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-13 02:41:50 +08:00 · 2025-11-29 15:59:01 +08:00
parent 09114f59c8
commit 132eec900c
32 changed files with 1080 additions and 1050 deletions
--- a/.claude/skills/command-guide/reference/commands/workflow/tools/conflict-resolution.md
+++ b/.claude/skills/command-guide/reference/commands/workflow/tools/conflict-resolution.md
@@ -114,35 +114,44 @@ Task(subagent_type="cli-execution-agent", prompt=`
  - Risk: {conflict_risk}
  - Files: {existing_files_list}

+  ## Exploration Context (from context-package.exploration_results)
+  - Exploration Count: ${contextPackage.exploration_results?.exploration_count || 0}
+  - Angles Analyzed: ${JSON.stringify(contextPackage.exploration_results?.angles || [])}
+  - Pre-identified Conflict Indicators: ${JSON.stringify(contextPackage.exploration_results?.aggregated_insights?.conflict_indicators || [])}
+  - Critical Files: ${JSON.stringify(contextPackage.exploration_results?.aggregated_insights?.critical_files?.map(f => f.path) || [])}
+  - All Patterns: ${JSON.stringify(contextPackage.exploration_results?.aggregated_insights?.all_patterns || [])}
+  - All Integration Points: ${JSON.stringify(contextPackage.exploration_results?.aggregated_insights?.all_integration_points || [])}
+
  ## Analysis Steps

  ### 1. Load Context
  - Read existing files from conflict_detection.existing_files
  - Load plan from .workflow/active/{session_id}/.process/context-package.json
+  - **NEW**: Load exploration_results and use aggregated_insights for enhanced analysis
  - Extract role analyses and requirements

-  ### 2. Execute CLI Analysis (Enhanced with Scenario Uniqueness Detection)
+  ### 2. Execute CLI Analysis (Enhanced with Exploration + Scenario Uniqueness)

  Primary (Gemini):
  cd {project_root} && gemini -p "
-  PURPOSE: Detect conflicts between plan and codebase, including module scenario overlaps
+  PURPOSE: Detect conflicts between plan and codebase, using exploration insights
  TASK:
-  • Compare architectures
+  • **Review pre-identified conflict_indicators from exploration results**
+  • Compare architectures (use exploration key_patterns)
  • Identify breaking API changes
  • Detect data model incompatibilities
  • Assess dependency conflicts
-  • **NEW: Analyze module scenario uniqueness**
-    - Extract new module functionality from plan
-    - Search all existing modules with similar functionality
-    - Compare scenario coverage and identify overlaps
+  • **Analyze module scenario uniqueness**
+    - Use exploration integration_points for precise locations
+    - Cross-validate with exploration critical_files
    - Generate clarification questions for boundary definition
  MODE: analysis
  CONTEXT: @**/*.ts @**/*.js @**/*.tsx @**/*.jsx @.workflow/active/{session_id}/**/*
-  EXPECTED: Conflict list with severity ratings, including ModuleOverlap conflicts with:
-    - Existing module list with scenarios
-    - Overlap analysis matrix
+  EXPECTED: Conflict list with severity ratings, including:
+    - Validation of exploration conflict_indicators
+    - ModuleOverlap conflicts with overlap_analysis
    - Targeted clarification questions
-  RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/02-analyze-code-patterns.txt) | Focus on breaking changes, migration needs, and functional overlaps | analysis=READ-ONLY
+  RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/02-analyze-code-patterns.txt) | Focus on breaking changes, migration needs, and functional overlaps | Prioritize exploration-identified conflicts | analysis=READ-ONLY
  "

  Fallback: Qwen (same prompt) → Claude (manual analysis)
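
The new "Exploration Context" prompt fields in this hunk rely on optional chaining with `||` fallbacks, so a context package produced before this change (with no `exploration_results`) should still render a valid prompt. A minimal sketch of that behavior follows; `explorationSummary` is a hypothetical helper introduced only for illustration and is not part of the commit.

```javascript
// Hypothetical helper (not in the commit): mirrors the template expressions
// used in the "Exploration Context" prompt section.
function explorationSummary(contextPackage) {
  const results = contextPackage.exploration_results;
  const insights = results?.aggregated_insights;
  return {
    explorationCount: results?.exploration_count || 0,
    angles: results?.angles || [],
    conflictIndicators: insights?.conflict_indicators || [],
    // critical_files entries are assumed to be objects with a `path` field,
    // as implied by the `.map(f => f.path)` call in the prompt.
    criticalFiles: insights?.critical_files?.map(f => f.path) || []
  };
}

// A package without exploration_results falls back to empty defaults.
const legacy = explorationSummary({});

// A package with exploration_results exposes the aggregated paths.
const enriched = explorationSummary({
  exploration_results: {
    exploration_count: 2,
    angles: ['security', 'dataflow'],
    aggregated_insights: {
      conflict_indicators: ['duplicate auth middleware'],
      critical_files: [{ path: 'src/auth.ts', relevance: 0.9 }]
    }
  }
});
```

Note that `|| 0` and `|| []` also replace other falsy values, which is harmless here because an explicit count of `0` and the fallback are the same.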
--- a/.claude/skills/command-guide/reference/commands/workflow/tools/context-gather.md
+++ b/.claude/skills/command-guide/reference/commands/workflow/tools/context-gather.md
@@ -36,24 +36,23 @@ Step 1: Context-Package Detection
      ├─ Valid package exists → Return existing (skip execution)
      └─ No valid package → Continue to Step 2

-Step 2: Invoke Context-Search Agent
-   ├─ Phase 1: Initialization & Pre-Analysis
-   │  ├─ Load project.json as primary context
-   │  ├─ Initialize code-index
-   │  └─ Classify complexity
-   ├─ Phase 2: Multi-Source Discovery
-   │  ├─ Track 1: Historical archive analysis
-   │  ├─ Track 2: Reference documentation
-   │  ├─ Track 3: Web examples (Exa MCP)
-   │  └─ Track 4: Codebase analysis (5-layer)
-   └─ Phase 3: Synthesis & Packaging
-      ├─ Apply relevance scoring
-      ├─ Integrate brainstorm artifacts
-      ├─ Perform conflict detection
-      └─ Generate context-package.json
+Step 2: Complexity Assessment & Parallel Explore (NEW)
+   ├─ Analyze task_description → classify Low/Medium/High
+   ├─ Select exploration angles (1-4 based on complexity)
+   ├─ Launch N cli-explore-agents in parallel
+   │  └─ Each outputs: exploration-{angle}.json
+   └─ Generate explorations-manifest.json

-Step 3: Output Verification
-   └─ Verify context-package.json created
+Step 3: Invoke Context-Search Agent (with exploration input)
+   ├─ Phase 1: Initialization & Pre-Analysis
+   ├─ Phase 2: Multi-Source Discovery
+   │  ├─ Track 0: Exploration Synthesis (prioritize & deduplicate)
+   │  ├─ Track 1-4: Existing tracks
+   └─ Phase 3: Synthesis & Packaging
+      └─ Generate context-package.json with exploration_results
+
+Step 4: Output Verification
+   └─ Verify context-package.json contains exploration_results
 ```

 ## Execution Flow
@@ -80,10 +79,139 @@ if (file_exists(contextPackagePath)) {
 }
 ```

-### Step 2: Invoke Context-Search Agent
+### Step 2: Complexity Assessment & Parallel Explore

 **Only execute if Step 1 finds no valid package**

+```javascript
+// 2.1 Complexity Assessment
+function analyzeTaskComplexity(taskDescription) {
+  const text = taskDescription.toLowerCase();
+  if (/architect|refactor|restructure|modular|cross-module/.test(text)) return 'High';
+  if (/multiple|several|integrate|migrate|extend/.test(text)) return 'Medium';
+  return 'Low';
+}
+
+const ANGLE_PRESETS = {
+  architecture: ['architecture', 'dependencies', 'modularity', 'integration-points'],
+  security: ['security', 'auth-patterns', 'dataflow', 'validation'],
+  performance: ['performance', 'bottlenecks', 'caching', 'data-access'],
+  bugfix: ['error-handling', 'dataflow', 'state-management', 'edge-cases'],
+  feature: ['patterns', 'integration-points', 'testing', 'dependencies'],
+  refactor: ['architecture', 'patterns', 'dependencies', 'testing']
+};
+
+function selectAngles(taskDescription, complexity) {
+  const text = taskDescription.toLowerCase();
+  let preset = 'feature';
+  if (/refactor|architect|restructure/.test(text)) preset = 'architecture';
+  else if (/security|auth|permission/.test(text)) preset = 'security';
+  else if (/performance|slow|optimi/.test(text)) preset = 'performance';
+  else if (/fix|bug|error|issue/.test(text)) preset = 'bugfix';
+
+  const count = complexity === 'High' ? 4 : (complexity === 'Medium' ? 3 : 1);
+  return ANGLE_PRESETS[preset].slice(0, count);
+}
+
+const complexity = analyzeTaskComplexity(task_description);
+const selectedAngles = selectAngles(task_description, complexity);
+const sessionFolder = `.workflow/active/${session_id}/.process`;
+
+// 2.2 Launch Parallel Explore Agents
+const explorationTasks = selectedAngles.map((angle, index) =>
+  Task(
+    subagent_type="cli-explore-agent",
+    description=`Explore: ${angle}`,
+    prompt=`
+## Task Objective
+Execute **${angle}** exploration for task planning context. Analyze codebase from this specific angle to discover relevant structure, patterns, and constraints.
+
+## Assigned Context
+- **Exploration Angle**: ${angle}
+- **Task Description**: ${task_description}
+- **Session ID**: ${session_id}
+- **Exploration Index**: ${index + 1} of ${selectedAngles.length}
+- **Output File**: ${sessionFolder}/exploration-${angle}.json
+
+## MANDATORY FIRST STEPS (Execute by Agent)
+**You (cli-explore-agent) MUST execute these steps in order:**
+1. Run: ~/.claude/scripts/get_modules_by_depth.sh (project structure)
+2. Run: rg -l "{keyword_from_task}" --type ts (locate relevant files)
+3. Execute: cat ~/.claude/workflows/cli-templates/schemas/explore-json-schema.json (get output schema reference)
+
+## Exploration Strategy (${angle} focus)
+
+**Step 1: Structural Scan** (Bash)
+- get_modules_by_depth.sh → identify modules related to ${angle}
+- find/rg → locate files relevant to ${angle} aspect
+- Analyze imports/dependencies from ${angle} perspective
+
+**Step 2: Semantic Analysis** (Gemini CLI)
+- How does existing code handle ${angle} concerns?
+- What patterns are used for ${angle}?
+- Where would new code integrate from ${angle} viewpoint?
+
+**Step 3: Write Output**
+- Consolidate ${angle} findings into JSON
+- Identify ${angle}-specific clarification needs
+
+## Expected Output
+
+**File**: ${sessionFolder}/exploration-${angle}.json
+
+**Schema Reference**: Schema obtained in MANDATORY FIRST STEPS step 3, follow schema exactly
+
+**Required Fields** (all ${angle} focused):
+- project_structure: Modules/architecture relevant to ${angle}
+- relevant_files: Files affected from ${angle} perspective
+  **IMPORTANT**: Use object format with relevance scores for synthesis:
+  \`[{path: "src/file.ts", relevance: 0.85, rationale: "Core ${angle} logic"}]\`
+  Scores: 0.7+ high priority, 0.5-0.7 medium, <0.5 low
+- patterns: ${angle}-related patterns to follow
+- dependencies: Dependencies relevant to ${angle}
+- integration_points: Where to integrate from ${angle} viewpoint (include file:line locations)
+- constraints: ${angle}-specific limitations/conventions
+- clarification_needs: ${angle}-related ambiguities (with options array)
+- _metadata.exploration_angle: "${angle}"
+
+## Success Criteria
+- [ ] Schema obtained via cat explore-json-schema.json
+- [ ] get_modules_by_depth.sh executed
+- [ ] At least 3 relevant files identified with ${angle} rationale
+- [ ] Patterns are actionable (code examples, not generic advice)
+- [ ] Integration points include file:line locations
+- [ ] Constraints are project-specific to ${angle}
+- [ ] JSON output follows schema exactly
+- [ ] clarification_needs includes options array
+
+## Output
+Write: ${sessionFolder}/exploration-${angle}.json
+Return: 2-3 sentence summary of ${angle} findings
+`
+  )
+);
+
+// 2.3 Generate Manifest after all complete
+const explorationFiles = bash(`find ${sessionFolder} -name "exploration-*.json" -type f`).split('\n').filter(f => f.trim());
+const explorationManifest = {
+  session_id,
+  task_description,
+  timestamp: new Date().toISOString(),
+  complexity,
+  exploration_count: selectedAngles.length,
+  angles_explored: selectedAngles,
+  explorations: explorationFiles.map(file => {
+    const data = JSON.parse(Read(file));
+    return { angle: data._metadata.exploration_angle, file: file.split('/').pop(), path: file, index: data._metadata.exploration_index };
+  })
+};
+Write(`${sessionFolder}/explorations-manifest.json`, JSON.stringify(explorationManifest, null, 2));
+```
+
+### Step 3: Invoke Context-Search Agent
+
+**Only execute after Step 2 completes**
+
 ```javascript
 Task(
  subagent_type="context-search-agent",
@@ -97,6 +225,12 @@ Task(
 - **Task Description**: ${task_description}
 - **Output Path**: .workflow/${session_id}/.process/context-package.json

+## Exploration Input (from Step 2)
+- **Manifest**: ${sessionFolder}/explorations-manifest.json
+- **Exploration Count**: ${explorationManifest.exploration_count}
+- **Angles**: ${explorationManifest.angles_explored.join(', ')}
+- **Complexity**: ${complexity}
+
 ## Mission
 Execute complete context-search-agent workflow for implementation planning:

@@ -107,7 +241,8 @@ Execute complete context-search-agent workflow for implementation planning:
 4. **Analysis**: Extract keywords, determine scope, classify complexity based on task description and project state

 ### Phase 2: Multi-Source Context Discovery
-Execute all 4 discovery tracks:
+Execute all discovery tracks:
+- **Track 0**: Exploration Synthesis (load ${sessionFolder}/explorations-manifest.json, prioritize critical_files, deduplicate patterns/integration_points)
 - **Track 1**: Historical archive analysis (query manifest.json for lessons learned)
 - **Track 2**: Reference documentation (CLAUDE.md, architecture docs)
 - **Track 3**: Web examples (use Exa MCP for unfamiliar tech/APIs)
@@ -130,6 +265,7 @@ Complete context-package.json with:
 - **dependencies**: {internal[], external[]} with dependency graph
 - **brainstorm_artifacts**: {guidance_specification, role_analyses[], synthesis_output} with content
 - **conflict_detection**: {risk_level, risk_factors, affected_modules[], mitigation_strategy, historical_conflicts[]}
+- **exploration_results**: {manifest_path, exploration_count, angles, explorations[], aggregated_insights} (from Track 0)

 ## Quality Validation
 Before completion verify:
@@ -146,7 +282,7 @@ Report completion with statistics.
 )
 ```

-### Step 3: Output Verification
+### Step 4: Output Verification

 After agent completes, verify output:

@@ -156,6 +292,12 @@ const outputPath = `.workflow/${session_id}/.process/context-package.json`;
 if (!file_exists(outputPath)) {
  throw new Error("❌ Agent failed to generate context-package.json");
 }
+
+// Verify exploration_results included
+const pkg = JSON.parse(Read(outputPath));
+if (pkg.exploration_results?.exploration_count > 0) {
+  console.log(`✅ Exploration results aggregated: ${pkg.exploration_results.exploration_count} angles`);
+}
 ```

 ## Parameter Reference
@@ -176,6 +318,7 @@ Refer to `context-search-agent.md` Phase 3.7 for complete `context-package.json`
 - **dependencies**: Internal and external dependency graphs
 - **brainstorm_artifacts**: Brainstorm documents with full content (if exists)
 - **conflict_detection**: Risk assessment with mitigation strategies and historical conflicts
+- **exploration_results**: Aggregated exploration insights (from parallel explore phase)

 ## Historical Archive Analysis

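
Track 0 is described above only as "prioritize critical_files, deduplicate patterns/integration_points". One way that synthesis could work is sketched below. This is an assumption-laden illustration, not the context-search-agent's actual logic: `aggregateExplorations` is a hypothetical name, patterns and integration points are assumed to be plain strings, and the 0.7 cutoff is taken from the "0.7+ high priority" wording in the Step 2 prompt.

```javascript
// Hypothetical sketch of Track 0 (Exploration Synthesis); not taken from the commit.
// Input: parsed exploration-{angle}.json objects with the fields named in Step 2.
function aggregateExplorations(explorations) {
  const filesByPath = new Map();
  const patterns = new Set();
  const integrationPoints = new Set();

  for (const exploration of explorations) {
    for (const file of exploration.relevant_files || []) {
      const seen = filesByPath.get(file.path);
      // Deduplicate by path, keeping the entry with the highest relevance score.
      if (!seen || file.relevance > seen.relevance) {
        filesByPath.set(file.path, file);
      }
    }
    // Sets drop exact duplicates reported by more than one angle.
    (exploration.patterns || []).forEach(p => patterns.add(p));
    (exploration.integration_points || []).forEach(p => integrationPoints.add(p));
  }

  const criticalFiles = [...filesByPath.values()]
    .filter(f => f.relevance >= 0.7)            // assumed high-priority cutoff
    .sort((a, b) => b.relevance - a.relevance); // most relevant first

  return {
    critical_files: criticalFiles,
    all_patterns: [...patterns],
    all_integration_points: [...integrationPoints]
  };
}

// Usage with two overlapping explorations (sample data is invented).
const insights = aggregateExplorations([
  {
    relevant_files: [{ path: 'src/auth.ts', relevance: 0.6 }],
    patterns: ['middleware'],
    integration_points: ['src/app.ts:42']
  },
  {
    relevant_files: [
      { path: 'src/auth.ts', relevance: 0.9 },
      { path: 'src/util.ts', relevance: 0.4 }
    ],
    patterns: ['middleware'],
    integration_points: []
  }
]);
```

In this sample, `src/auth.ts` is kept once with its higher score, `src/util.ts` falls below the cutoff, and the duplicated pattern collapses to a single entry.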
--- a/.claude/skills/command-guide/reference/commands/workflow/tools/task-generate-agent.md
+++ b/.claude/skills/command-guide/reference/commands/workflow/tools/task-generate-agent.md
@@ -1,10 +1,9 @@
 ---
 name: task-generate-agent
 description: Generate implementation plan documents (IMPL_PLAN.md, task JSONs, TODO_LIST.md) using action-planning-agent - produces planning artifacts, does NOT execute code implementation
-argument-hint: "--session WFS-session-id [--cli-execute]"
+argument-hint: "--session WFS-session-id"
 examples:
  - /workflow:tools:task-generate-agent --session WFS-auth
-  - /workflow:tools:task-generate-agent --session WFS-auth --cli-execute
 ---

 # Generate Implementation Plan Command
@@ -26,7 +25,7 @@ Generate implementation planning documents (IMPL_PLAN.md, task JSONs, TODO_LIST.

 ```
 Input Parsing:
-   ├─ Parse flags: --session, --cli-execute
+   ├─ Parse flags: --session
   └─ Validation: session_id REQUIRED

 Phase 1: Context Preparation (Command)
@@ -65,9 +64,10 @@ Phase 2: Planning Document Generation (Agent)

 2. **Provide Metadata** (simple values):
   - `session_id`
-   - `execution_mode` (agent-mode | cli-execute-mode)
   - `mcp_capabilities` (available MCP tools)

+**Note**: CLI tool usage is now determined semantically by action-planning-agent based on user's task description, not by flags.
+
 ### Phase 2: Planning Document Generation (Agent Responsibility)

 **Purpose**: Generate IMPL_PLAN.md, task JSONs, and TODO_LIST.md - planning documents only, NOT code implementation.
@@ -97,15 +97,28 @@ Output:

 ## CONTEXT METADATA
 Session ID: {session-id}
-Planning Mode: {agent-mode | cli-execute-mode}
 MCP Capabilities: {exa_code, exa_web, code_index}

+## CLI TOOL SELECTION
+Determine CLI tool usage per-step based on user's task description:
+- If user specifies "use Codex/Gemini/Qwen for X" → Add command field to relevant steps
+- Default: Agent execution (no command field) unless user explicitly requests CLI
+
+## EXPLORATION CONTEXT (from context-package.exploration_results)
+- Load exploration_results from context-package.json
+- Use aggregated_insights.critical_files for focus_paths generation
+- Apply aggregated_insights.constraints to acceptance criteria
+- Reference aggregated_insights.all_patterns for implementation approach
+- Use aggregated_insights.all_integration_points for precise modification locations
+- Use conflict_indicators for risk-aware task sequencing
+
 ## EXPECTED DELIVERABLES
 1. Task JSON Files (.task/IMPL-*.json)
   - 6-field schema (id, title, status, context_package_path, meta, context, flow_control)
   - Quantified requirements with explicit counts
   - Artifacts integration from context package
-   - Flow control with pre_analysis steps
+   - **focus_paths enhanced with exploration critical_files**
+   - Flow control with pre_analysis steps (include exploration integration_points analysis)

 2. Implementation Plan (IMPL_PLAN.md)
   - Context analysis and artifact references
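
The "CLI TOOL SELECTION" rules above leave the decision to action-planning-agent, which reads the task description semantically. For readers who want a concrete picture, the following is a rough keyword approximation of that rule, under stated assumptions: `detectCliRequests` and `annotateSteps` are hypothetical names, steps are assumed to have a `title` field, and the content of the `command` field is supplied by the caller because this diff does not show its format.

```javascript
// Hypothetical keyword approximation; the commit delegates this judgement
// to action-planning-agent rather than to a regular expression.
function detectCliRequests(taskDescription) {
  const requests = [];
  // Matches phrases such as "use Codex for implementation".
  const pattern = /\buse\s+(codex|gemini|qwen)\b(?:\s+for\s+([^,.;]+))?/gi;
  for (const match of taskDescription.matchAll(pattern)) {
    requests.push({
      tool: match[1].toLowerCase(),
      purpose: (match[2] || '').trim() || null
    });
  }
  return requests;
}

// Adds a `command` field only to steps whose title mentions the requested purpose.
// Steps without a matching request are returned unchanged (agent execution).
function annotateSteps(steps, requests, buildCommand) {
  return steps.map(step => {
    const request = requests.find(
      r => r.purpose && step.title.toLowerCase().includes(r.purpose.toLowerCase())
    );
    return request ? { ...step, command: buildCommand(request.tool, step) } : step;
  });
}

// Usage: the command string here is a placeholder, not a real CLI invocation.
const requests = detectCliRequests(
  'Generate TDD tasks for auth module, use Codex for implementation'
);
const steps = annotateSteps(
  [{ title: 'Implementation: auth service' }, { title: 'Write docs' }],
  requests,
  tool => `<${tool} command placeholder>`
);
```

A description with no "use <tool>" phrase yields no requests, so every step keeps the default agent execution with no `command` field.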
--- a/.claude/skills/command-guide/reference/commands/workflow/tools/task-generate-tdd.md
+++ b/.claude/skills/command-guide/reference/commands/workflow/tools/task-generate-tdd.md
@@ -1,24 +1,23 @@
 ---
 name: task-generate-tdd
 description: Autonomous TDD task generation using action-planning-agent with Red-Green-Refactor cycles, test-first structure, and cycle validation
-argument-hint: "--session WFS-session-id [--cli-execute]"
+argument-hint: "--session WFS-session-id"
 examples:
  - /workflow:tools:task-generate-tdd --session WFS-auth
-  - /workflow:tools:task-generate-tdd --session WFS-auth --cli-execute
 ---

 # Autonomous TDD Task Generation Command

 ## Overview
-Autonomous TDD task JSON and IMPL_PLAN.md generation using action-planning-agent with two-phase execution: discovery and document generation. Supports both agent-driven execution (default) and CLI tool execution modes. Generates complete Red-Green-Refactor cycles contained within each task.
+Autonomous TDD task JSON and IMPL_PLAN.md generation using action-planning-agent with two-phase execution: discovery and document generation. Generates complete Red-Green-Refactor cycles contained within each task.

 ## Core Philosophy
 - **Agent-Driven**: Delegate execution to action-planning-agent for autonomous operation
 - **Two-Phase Flow**: Discovery (context gathering) → Output (document generation)
 - **Memory-First**: Reuse loaded documents from conversation memory
 - **MCP-Enhanced**: Use MCP tools for advanced code analysis and research
-- **Pre-Selected Templates**: Command selects correct TDD template based on `--cli-execute` flag **before** invoking agent
-- **Agent Simplicity**: Agent receives pre-selected template and focuses only on content generation
+- **Semantic CLI Selection**: CLI tool usage determined from user's task description, not flags
+- **Agent Simplicity**: Agent generates content with semantic CLI detection
 - **Path Clarity**: All `focus_paths` prefer absolute paths (e.g., `D:\\project\\src\\module`), or clear relative paths from project root (e.g., `./src/module`)
 - **TDD-First**: Every feature starts with a failing test (Red phase)
 - **Feature-Complete Tasks**: Each task contains complete Red-Green-Refactor cycle
@@ -57,7 +56,7 @@ Autonomous TDD task JSON and IMPL_PLAN.md generation using action-planning-agent

 ```
 Input Parsing:
-   ├─ Parse flags: --session, --cli-execute
+   ├─ Parse flags: --session
   └─ Validation: session_id REQUIRED

 Phase 1: Discovery & Context Loading (Memory-First)
@@ -69,7 +68,7 @@ Phase 1: Discovery & Context Loading (Memory-First)
   └─ Optional: MCP external research

 Phase 2: Agent Execution (Document Generation)
-   ├─ Pre-agent template selection (agent-mode OR cli-execute-mode)
+   ├─ Pre-agent template selection (semantic CLI detection)
   ├─ Invoke action-planning-agent
   ├─ Generate TDD Task JSON Files (.task/IMPL-*.json)
   │  └─ Each task: complete Red-Green-Refactor cycle internally
@@ -86,11 +85,8 @@ Phase 2: Agent Execution (Document Generation)
 ```javascript
 {
  "session_id": "WFS-[session-id]",
-  "execution_mode": "agent-mode" | "cli-execute-mode",  // Determined by flag
-  "task_json_template_path": "~/.claude/workflows/cli-templates/prompts/workflow/task-json-agent-mode.txt"
-                           | "~/.claude/workflows/cli-templates/prompts/workflow/task-json-cli-mode.txt",
-  // Path selected by command based on --cli-execute flag, agent reads it
  "workflow_type": "tdd",
+  // Note: CLI tool usage is determined semantically by action-planning-agent based on user's task description
  "session_metadata": {
    // If in memory: use cached content
    // Else: Load from .workflow/active//{session-id}/workflow-session.json
@@ -199,8 +195,7 @@ Task(

 **Session ID**: WFS-{session-id}
 **Workflow Type**: TDD
-**Execution Mode**: {agent-mode | cli-execute-mode}
-**Task JSON Template Path**: {template_path}
+**Note**: CLI tool usage is determined semantically from user's task description

 ## Phase 1: Discovery Results (Provided Context)

@@ -265,16 +260,15 @@ Refer to: @.claude/agents/action-planning-agent.md for:

 ##### 1. TDD Task JSON Files (.task/IMPL-*.json)
 - **Location**: `.workflow/active//{session-id}/.task/`
-- **Template**: Read from `{template_path}` (pre-selected by command based on `--cli-execute` flag)
 - **Schema**: 5-field structure with TDD-specific metadata
  - `meta.tdd_workflow`: true (REQUIRED)
  - `meta.max_iterations`: 3 (Green phase test-fix cycle limit)
-  - `meta.use_codex`: false (manual fixes by default)
  - `context.tdd_cycles`: Array with quantified test cases and coverage
  - `flow_control.implementation_approach`: Exactly 3 steps with `tdd_phase` field
    1. Red Phase (`tdd_phase: "red"`): Write failing tests
    2. Green Phase (`tdd_phase: "green"`): Implement to pass tests
    3. Refactor Phase (`tdd_phase: "refactor"`): Improve code quality
+  - CLI tool usage determined semantically (add `command` field when user requests CLI execution)
 - **Details**: See action-planning-agent.md § TDD Task JSON Generation

 ##### 2. IMPL_PLAN.md (TDD Variant)
@@ -475,16 +469,14 @@ This section provides quick reference for TDD task JSON structure. For complete

 **Basic Usage**:
 ```bash
-# Agent mode (default, autonomous execution)
+# Standard execution
 /workflow:tools:task-generate-tdd --session WFS-auth

-# CLI tool mode (use Gemini/Qwen for generation)
-/workflow:tools:task-generate-tdd --session WFS-auth --cli-execute
+# With semantic CLI request (include in task description)
+# e.g., "Generate TDD tasks for auth module, use Codex for implementation"
 ```

-**Execution Modes**:
-- **Agent mode** (default): Uses `action-planning-agent` with agent-mode task template
-- **CLI mode** (`--cli-execute`): Uses Gemini/Qwen with cli-mode task template
+**CLI Tool Selection**: Determined semantically from user's task description. Include "use Codex/Gemini/Qwen" in your request for CLI execution.

 **Output**:
 - TDD task JSON files in `.task/` directory (IMPL-N.json format)
@@ -513,7 +505,7 @@ IMPL (Green phase) tasks include automatic test-fix cycle:
 3. **Success Path**: Tests pass → Complete task
 4. **Failure Path**: Tests fail → Enter iterative fix cycle:
   - **Gemini Diagnosis**: Analyze failures with bug-fix template
-   - **Fix Application**: Manual (default) or Codex (if meta.use_codex=true)
+   - **Fix Application**: Agent (default) or CLI (if `command` field present)
   - **Retest**: Verify fix resolves failures
   - **Repeat**: Up to max_iterations (default: 3)
 5. **Safety Net**: Auto-revert all changes if max iterations reached
@@ -522,5 +514,5 @@ IMPL (Green phase) tasks include automatic test-fix cycle:

 ## Configuration Options
 - **meta.max_iterations**: Number of fix attempts (default: 3 for TDD, 5 for test-gen)
-- **meta.use_codex**: Enable Codex automated fixes (default: false, manual)
+- **CLI tool usage**: Determined semantically from user's task description via `command` field in implementation_approach

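
The Green-phase test-fix cycle described in this file (run tests, diagnose, fix, retest, repeat up to `max_iterations`, then auto-revert) can be sketched as a bounded loop. This is a simplified, synchronous illustration under stated assumptions: `testFixCycle` and its four callbacks are hypothetical stand-ins supplied by the caller, not functions defined by the workflow.

```javascript
// Hypothetical sketch of the test-fix cycle; callbacks are caller-supplied stand-ins.
function testFixCycle({ runTests, diagnose, applyFix, revertAll, maxIterations = 3 }) {
  let result = runTests();
  let iterations = 0;

  while (!result.passed && iterations < maxIterations) {
    iterations += 1;
    const diagnosis = diagnose(result); // e.g. Gemini analysis of the failures
    applyFix(diagnosis);                // agent by default, CLI when `command` is present
    result = runTests();                // retest after each fix
  }

  if (!result.passed) {
    revertAll();                        // safety net once the iteration limit is reached
    return { status: 'reverted', iterations };
  }
  return { status: 'passed', iterations };
}

// Usage: a fake test runner that starts passing on its third invocation.
let calls = 0;
const recovered = testFixCycle({
  runTests: () => ({ passed: ++calls >= 3 }),
  diagnose: () => 'stub diagnosis',
  applyFix: () => {},
  revertAll: () => {}
});

// Usage: tests never pass, so the cycle stops at the limit and reverts.
let reverted = false;
const exhausted = testFixCycle({
  runTests: () => ({ passed: false }),
  diagnose: () => 'stub diagnosis',
  applyFix: () => {},
  revertAll: () => { reverted = true; }
});
```

Passing `maxIterations: 5` would model the test-gen default mentioned under Configuration Options.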
--- a/.claude/skills/command-guide/reference/commands/workflow/tools/test-task-generate.md
+++ b/.claude/skills/command-guide/reference/commands/workflow/tools/test-task-generate.md
@@ -1,11 +1,9 @@
 ---
 name: test-task-generate
 description: Generate test planning documents (IMPL_PLAN.md, test task JSONs, TODO_LIST.md) using action-planning-agent - produces test planning artifacts, does NOT execute tests
-argument-hint: "[--use-codex] [--cli-execute] --session WFS-test-session-id"
+argument-hint: "--session WFS-test-session-id"
 examples:
  - /workflow:tools:test-task-generate --session WFS-test-auth
-  - /workflow:tools:test-task-generate --use-codex --session WFS-test-auth
-  - /workflow:tools:test-task-generate --cli-execute --session WFS-test-auth
 ---

 # Generate Test Planning Documents Command
@@ -26,17 +24,17 @@ Generate test planning documents (IMPL_PLAN.md, test task JSONs, TODO_LIST.md) u

 ### Test Generation (IMPL-001)
 - **Agent Mode** (default): @code-developer generates tests within agent context
-- **CLI Execute Mode** (`--cli-execute`): Use Codex CLI for autonomous test generation
+- **CLI Mode**: Use CLI tools when `command` field present in implementation_approach (determined semantically)

 ### Test Execution & Fix (IMPL-002+)
-- **Manual Mode** (default): Gemini diagnosis → user applies fixes
-- **Codex Mode** (`--use-codex`): Gemini diagnosis → Codex applies fixes with resume mechanism
+- **Agent Mode** (default): Gemini diagnosis → agent applies fixes
+- **CLI Mode**: Gemini diagnosis → CLI applies fixes (when `command` field present in implementation_approach)

 ## Execution Process

 ```
 Input Parsing:
-   ├─ Parse flags: --session, --use-codex, --cli-execute
+   ├─ Parse flags: --session
   └─ Validation: session_id REQUIRED

 Phase 1: Context Preparation (Command)
@@ -44,7 +42,7 @@ Phase 1: Context Preparation (Command)
   │  ├─ session_metadata_path
   │  ├─ test_analysis_results_path (REQUIRED)
   │  └─ test_context_package_path
-   └─ Provide metadata (session_id, execution_mode, use_codex, source_session_id)
+   └─ Provide metadata (session_id, source_session_id)

 Phase 2: Test Document Generation (Agent)
   ├─ Load TEST_ANALYSIS_RESULTS.md as primary requirements source
@@ -83,11 +81,11 @@ Phase 2: Test Document Generation (Agent)

 2. **Provide Metadata** (simple values):
   - `session_id`
-   - `execution_mode` (agent-mode | cli-execute-mode)
-   - `use_codex` flag (true | false)
   - `source_session_id` (if exists)
   - `mcp_capabilities` (available MCP tools)

+**Note**: CLI tool usage is now determined semantically from user's task description, not by flags.
+
 ### Phase 2: Test Document Generation (Agent Responsibility)

 **Purpose**: Generate test-specific IMPL_PLAN.md, task JSONs, and TODO_LIST.md - planning documents only, NOT test execution.
@@ -134,11 +132,14 @@ Output:
 ## CONTEXT METADATA
 Session ID: {test-session-id}
 Workflow Type: test_session
-Planning Mode: {agent-mode | cli-execute-mode}
-Use Codex: {true | false}
 Source Session: {source-session-id} (if exists)
 MCP Capabilities: {exa_code, exa_web, code_index}

+## CLI TOOL SELECTION
+Determine CLI tool usage per-step based on user's task description:
+- If user specifies "use Codex/Gemini/Qwen for X" → Add command field to relevant steps
+- Default: Agent execution (no command field) unless user explicitly requests CLI
+
 ## TEST-SPECIFIC REQUIREMENTS SUMMARY
 (Detailed specifications in your agent definition)

@@ -149,25 +150,26 @@ MCP Capabilities: {exa_code, exa_web, code_index}
 Task Configuration:
  IMPL-001 (Test Generation):
    - meta.type: "test-gen"
-    - meta.agent: "@code-developer" (agent-mode) OR CLI execution (cli-execute-mode)
+    - meta.agent: "@code-developer"
    - meta.test_framework: Specify existing framework (e.g., "jest", "vitest", "pytest")
    - flow_control: Test generation strategy from TEST_ANALYSIS_RESULTS.md
+    - CLI execution: Add `command` field when user requests (determined semantically)

  IMPL-002+ (Test Execution & Fix):
    - meta.type: "test-fix"
    - meta.agent: "@test-fix-agent"
-    - meta.use_codex: true/false (based on flag)
    - flow_control: Test-fix cycle with iteration limits and diagnosis configuration
+    - CLI execution: Add `command` field when user requests (determined semantically)

 ### Test-Fix Cycle Specification (IMPL-002+)
 Required flow_control fields:
  - max_iterations: 5
  - diagnosis_tool: "gemini"
  - diagnosis_template: "~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt"
-  - fix_mode: "manual" OR "codex" (based on use_codex flag)
  - cycle_pattern: "test → gemini_diagnose → fix → retest"
  - exit_conditions: ["all_tests_pass", "max_iterations_reached"]
  - auto_revert_on_failure: true
+  - CLI fix: Add `command` field when user specifies CLI tool usage

 ### Automation Framework Configuration
 Select automation tools based on test requirements from TEST_ANALYSIS_RESULTS.md:
@@ -191,8 +193,9 @@ PRIMARY requirements source - extract and map to task JSONs:
 ## EXPECTED DELIVERABLES
 1. Test Task JSON Files (.task/IMPL-*.json)
   - 6-field schema with quantified requirements from TEST_ANALYSIS_RESULTS.md
-   - Test-specific metadata: type, agent, use_codex, test_framework, coverage_target
+   - Test-specific metadata: type, agent, test_framework, coverage_target
   - flow_control includes: reusable_test_tools, test_commands (from project config)
+   - CLI execution via `command` field when user requests (determined semantically)
   - Artifact references from test-context-package.json
   - Absolute paths in context.files_to_test

@@ -213,9 +216,9 @@ Hard Constraints:
  - All requirements quantified from TEST_ANALYSIS_RESULTS.md
  - Test framework matches existing project framework
  - flow_control includes reusable_test_tools and test_commands from project
-  - use_codex flag correctly set in IMPL-002+ tasks
  - Absolute paths for all focus_paths
  - Acceptance criteria include verification commands
+  - CLI `command` field added only when user explicitly requests CLI tool usage

 ## SUCCESS CRITERIA
 - All test planning documents generated successfully
@@ -233,21 +236,18 @@ Hard Constraints:

 ### Usage Examples
 ```bash
-# Agent mode (default)
+# Standard execution
 /workflow:tools:test-task-generate --session WFS-test-auth

-# With automated Codex fixes
-/workflow:tools:test-task-generate --use-codex --session WFS-test-auth
-
-# CLI execution mode for test generation
-/workflow:tools:test-task-generate --cli-execute --session WFS-test-auth
+# With semantic CLI request (include in task description)
+# e.g., "Generate tests, use Codex for implementation and fixes"
 ```

-### Flag Behavior
-- **No flags**: `meta.use_codex=false` (manual fixes), agent-mode test generation
-- **--use-codex**: `meta.use_codex=true` (Codex automated fixes in IMPL-002+)
-- **--cli-execute**: CLI tool execution mode for IMPL-001 test generation
-- **Both flags**: CLI generation + automated Codex fixes
+### CLI Tool Selection
+CLI tool usage is determined semantically from user's task description:
+- Include "use Codex" for automated fixes
+- Include "use Gemini" for analysis
+- Default: Agent execution (no `command` field)

 ### Output
 - Test task JSON files in `.task/` directory (minimum 2)