refactor: Replace CLI execution flags with semantic-driven tool selection

- Remove --cli-execute flag from plan.md, tdd-plan.md, task-generate-agent.md, task-generate-tdd.md
- Remove --use-codex flag from test-gen.md, test-fix-gen.md, test-task-generate.md
- Remove meta.use_codex from task JSON schema in action-planning-agent.md and cli-planning-agent.md
- Add "Semantic CLI Tool Selection" section to action-planning-agent.md
- Document explicit source: metadata.task_description from context-package.json
- Update test-fix-agent.md execution mode documentation
- Update action-plan-verify.md to remove use_codex validation
- Sync SKILL reference copies via analyze_commands.py

CLI tool usage is now determined semantically from the user's task description
(e.g., "use Codex for implementation") instead of explicit flags.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
catlog22
2025-11-29 15:59:01 +08:00
parent 09114f59c8
commit 132eec900c
32 changed files with 1080 additions and 1050 deletions


@@ -43,7 +43,6 @@ You are a pure execution agent specialized in creating actionable implementation
- `context_package_path`: Context package with brainstorming artifacts catalog
- **Metadata**: Simple values
- `session_id`: Workflow session identifier (WFS-[topic])
-- `execution_mode`: agent-mode | cli-execute-mode
- `mcp_capabilities`: Available MCP tools (exa_code, exa_web, code_index)
**Legacy Support** (backward compatibility):
@@ -244,8 +243,7 @@ Generate individual `.task/IMPL-*.json` files with the following structure:
"type": "test-gen|test-fix",
"agent": "@code-developer|@test-fix-agent",
"test_framework": "jest|vitest|pytest|junit|mocha",
-"coverage_target": "80%",
-"use_codex": true|false
+"coverage_target": "80%"
}
}
```
@@ -253,7 +251,8 @@ Generate individual `.task/IMPL-*.json` files with the following structure:
**Test-Specific Fields**:
- `test_framework`: Existing test framework from project (required for test tasks)
- `coverage_target`: Target code coverage percentage (optional)
-- `use_codex`: Whether to use Codex for automated fixes in test-fix tasks (optional, default: false)
+**Note**: CLI tool usage for test-fix tasks is now controlled via `flow_control.implementation_approach` steps with `command` fields, not via `meta.use_codex`.
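
The note above moves the CLI decision from a task-level flag to the individual steps. As a minimal sketch, assuming each `implementation_approach` entry is a JSON object whose optional `command` key marks CLI execution (the `step` and `title` keys and all function names below are hypothetical, not taken from the repository), the mode of each step could be derived like this:

```python
from typing import Any


def step_mode(step: dict[str, Any]) -> str:
    """Return "cli" when the step carries a non-empty `command`, else "agent"."""
    return "cli" if step.get("command") else "agent"


def task_modes(task: dict[str, Any]) -> list[str]:
    """List the execution mode of every implementation_approach step, in order."""
    steps = task.get("flow_control", {}).get("implementation_approach", [])
    return [step_mode(step) for step in steps]


def drop_legacy_flag(task: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the task without the removed `meta.use_codex` field."""
    meta = {k: v for k, v in task.get("meta", {}).items() if k != "use_codex"}
    return {**task, "meta": meta}


# Hypothetical task document; only `meta`, `flow_control`,
# `implementation_approach` and `command` are names used in the diff.
example_task = {
    "meta": {"type": "test-fix", "coverage_target": "80%", "use_codex": True},
    "flow_control": {
        "implementation_approach": [
            {"step": 1, "title": "Diagnose failures"},
            {
                "step": 2,
                "title": "Apply fixes",
                # Command string copied from the diff's own CLI example.
                "command": "bash(codex --full-auto exec '[task]' resume --last "
                           "--skip-git-repo-check -s danger-full-access)",
            },
        ]
    },
}

modes = task_modes(example_task)
cleaned = drop_legacy_flag(example_task)
```

Here the legacy flag has no influence on the result: only the presence of `command` in a step decides whether that step is treated as CLI-driven.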
#### Context Object
@@ -485,15 +484,31 @@ The `implementation_approach` supports **two execution modes** based on the pres
- `bash(codex --full-auto exec '[task]' resume --last --skip-git-repo-check -s danger-full-access)` (multi-step)
- `bash(cd [path] && gemini -p '[prompt]' --approval-mode yolo)` (write mode)
-**Mode Selection Strategy**:
-- **Default to agent execution** for most tasks
-- **Use CLI mode** when:
-- User explicitly requests CLI tool (codex/gemini/qwen)
-- Task requires multi-step autonomous reasoning beyond agent capability
-- Complex refactoring needs specialized tool analysis
-- Building on previous CLI execution context (use `resume --last`)
+**Semantic CLI Tool Selection**:
-**Key Principle**: The `command` field is **optional**. Agent must decide based on task complexity and user preference.
+Agent determines CLI tool usage per-step based on user semantics and task nature.
+**Source**: Scan `metadata.task_description` from context-package.json for CLI tool preferences.
+**User Semantic Triggers** (patterns to detect in task_description):
+- "use Codex/codex" → Add `command` field with Codex CLI
+- "use Gemini/gemini" → Add `command` field with Gemini CLI
+- "use Qwen/qwen" → Add `command` field with Qwen CLI
+- "CLI execution" / "automated" → Infer appropriate CLI tool
+**Task-Based Selection** (when no explicit user preference):
+- **Implementation/coding**: Codex preferred for autonomous development
+- **Analysis/exploration**: Gemini preferred for large context analysis
+- **Documentation**: Gemini/Qwen with write mode (`--approval-mode yolo`)
+- **Testing**: Depends on complexity - simple=agent, complex=Codex
+**Default Behavior**: Agent always executes the workflow. CLI commands are embedded in `implementation_approach` steps:
+- Agent orchestrates task execution
+- When step has `command` field, agent executes it via Bash
+- When step has no `command` field, agent implements directly
+- This maintains agent control while leveraging CLI tool power
+**Key Principle**: The `command` field is **optional**. Agent decides based on user semantics and task complexity.
**Examples**:
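
The following sketch is not taken from action-planning-agent.md; it only illustrates how the selection rules added in this hunk might be applied. It assumes that explicit "use <tool>" phrases win, that the generic triggers ("CLI execution", "automated") simply fall through to the task-based table, and that a return value of `None` means the agent implements the step directly. The function name, the `task_kind` labels and the `complex_task` flag are hypothetical.

```python
import re
from typing import Optional

# Explicit user triggers, matched case-insensitively in task_description.
USER_TRIGGERS = [
    (re.compile(r"\buse\s+codex\b", re.IGNORECASE), "codex"),
    (re.compile(r"\buse\s+gemini\b", re.IGNORECASE), "gemini"),
    (re.compile(r"\buse\s+qwen\b", re.IGNORECASE), "qwen"),
]

# Task-based preferences, used when no explicit trigger is found.
# "documentation" maps to gemini here; the diff allows Gemini or Qwen.
TASK_DEFAULTS = {
    "implementation": "codex",
    "analysis": "gemini",
    "documentation": "gemini",
}


def select_cli_tool(
    task_description: str,
    task_kind: Optional[str] = None,
    complex_task: bool = False,
) -> Optional[str]:
    """Return the preferred CLI tool name, or None for direct agent execution."""
    for pattern, tool in USER_TRIGGERS:
        if pattern.search(task_description):
            return tool
    if task_kind == "testing":
        # Simple tests stay with the agent; complex ones go to Codex.
        return "codex" if complex_task else None
    return TASK_DEFAULTS.get(task_kind)


explicit = select_cli_tool("Refactor the auth module, use Codex for implementation")
by_kind = select_cli_tool("Summarize the module layout", task_kind="analysis")
simple_test = select_cli_tool("Add unit tests", task_kind="testing")
```

A planner built this way would add a `command` field to a step only when the function returns a tool name, which keeps the field optional as the key principle requires.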


@@ -66,8 +66,7 @@ You are a specialized execution agent that bridges CLI analysis tools with task
"task_config": {
"agent": "@test-fix-agent",
"type": "test-fix-iteration",
-"max_iterations": 5,
-"use_codex": false
+"max_iterations": 5
}
}
```
@@ -263,7 +262,6 @@ function extractModificationPoints() {
"analysis_report": ".process/iteration-{iteration}-analysis.md",
"cli_output": ".process/iteration-{iteration}-cli-output.txt",
"max_iterations": "{task_config.max_iterations}",
-"use_codex": "{task_config.use_codex}",
"parent_task": "{parent_task_id}",
"created_by": "@cli-planning-agent",
"created_at": "{timestamp}"


@@ -24,8 +24,6 @@ You are a code execution specialist focused on implementing high-quality, produc
- **Context-driven** - Use provided context and existing code patterns
- **Quality over speed** - Write boring, reliable code that works
## Execution Process
### 1. Context Assessment


@@ -142,9 +142,9 @@ run_test_layer "L1-unit" "$UNIT_CMD"
### 3. Failure Diagnosis & Fixing Loop
-**Execution Modes**:
+**Execution Modes** (determined by `flow_control.implementation_approach`):
-**A. Manual Mode (Default, meta.use_codex=false)**:
+**A. Agent Mode (Default, no `command` field in steps)**:
```
WHILE tests are failing AND iterations < max_iterations:
1. Use Gemini to diagnose failure (bug-fix template)
@@ -155,17 +155,17 @@ WHILE tests are failing AND iterations < max_iterations:
END WHILE
```
-**B. Codex Mode (meta.use_codex=true)**:
+**B. CLI Mode (`command` field present in implementation_approach steps)**:
```
WHILE tests are failing AND iterations < max_iterations:
1. Use Gemini to diagnose failure (bug-fix template)
-2. Use Codex to apply fixes automatically with resume mechanism
+2. Execute `command` field (e.g., Codex) to apply fixes automatically
3. Re-run test suite
4. Verify fix doesn't break other tests
END WHILE
```
-**Codex Resume in Test-Fix Cycle** (when `meta.use_codex=true`):
+**Codex Resume in Test-Fix Cycle** (when step has `command` with Codex):
- First iteration: Start new Codex session with full context
- Subsequent iterations: Use `resume --last` to maintain fix history and apply consistent strategies
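
The two loops and the resume rule above can be sketched together. This is a minimal illustration under stated assumptions, not the agent's actual implementation: the test runner, diagnosis, direct fix and command runner are injected as hypothetical callables, and the Codex flags are copied from the command example earlier in this diff rather than checked against the Codex CLI. Dropping `resume --last` on the first iteration is an assumption about what "start new Codex session" means.

```python
from typing import Callable, Optional


def codex_command(task: str, iteration: int) -> str:
    """Build the Codex command; only iterations after the first resume a session."""
    resume = " resume --last" if iteration > 1 else ""
    return (
        f"codex --full-auto exec '{task}'{resume} "
        "--skip-git-repo-check -s danger-full-access"
    )


def run_fix_cycle(
    tests_pass: Callable[[], bool],
    diagnose: Callable[[], str],
    fix_directly: Callable[[str], None],
    run_command: Callable[[str], None],
    max_iterations: int,
    cli_task: Optional[str] = None,
) -> int:
    """Repeat diagnose-and-fix until tests pass or the budget is spent.

    Returns the number of iterations used. `cli_task` stands in for a step
    that has a `command` field: None means Agent Mode, a string means CLI Mode.
    """
    iteration = 0
    while not tests_pass() and iteration < max_iterations:
        iteration += 1
        diagnosis = diagnose()
        if cli_task is None:
            fix_directly(diagnosis)  # Agent Mode: the agent edits code itself
        else:
            run_command(codex_command(cli_task, iteration))  # CLI Mode
    return iteration


# Simulated run: the suite goes green after two fixes have been applied.
state = {"fixes": 0}
commands: list[str] = []


def fake_run_command(command: str) -> None:
    commands.append(command)
    state["fixes"] += 1


used = run_fix_cycle(
    tests_pass=lambda: state["fixes"] >= 2,
    diagnose=lambda: "diagnosis",
    fix_directly=lambda diagnosis: None,
    run_command=fake_run_command,
    max_iterations=5,
    cli_task="fix failing tests",
)
```

Calling `tests_pass` in the loop condition stands in for re-running the suite after each fix. A task string containing a single quote would need shell escaping before being passed to a real shell.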