fix(dev-workflow): refactor backend selection to multiSelect mode

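Fixes based on PR review feedback:

Core changes:
- Step 0: backend selection now uses multiSelect mode
- Three independent options: codex, claude, gemini (each with a detailed description)
- Simplified task classification: a type field (default|ui|quick-fix) replaces the complex complexity rating
- Clear backend routing: default→codex, ui→gemini, quick-fix→claude
- User constraints take precedence: when only codex is selected, all tasks are forced to use codex

Improvements:
- Remove the complexity/simple/medium/complex fields from PR#61
- Remove the rationale field, reducing classification to a single type dimension
- Fix the UI detection logic so that it is a per-task attribute
- Fallback strategy: codex → claude → gemini (clear priority)
- Error handling: a missing type defaults to default

Files changed:
- dev-workflow/commands/dev.md: add Step 0, update routing logic
- dev-workflow/agents/dev-plan-generator.md: simplify task classification
- dev-workflow/README.md: update documentation and examples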

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
cexll
2025-12-25 22:08:33 +08:00
parent 19facf3385
commit 2856bf0c29
3 changed files with 160 additions and 130 deletions

View File

@@ -9,44 +9,56 @@ A freshly designed lightweight development workflow with no legacy baggage, focu
``` ```
/dev trigger /dev trigger
AskUserQuestion (backend selection)
AskUserQuestion (requirements clarification) AskUserQuestion (requirements clarification)
codeagent analysis (plan mode + UI auto-detection) codeagent analysis (plan mode + task typing + UI auto-detection)
dev-plan-generator (create dev doc) dev-plan-generator (create dev doc)
codeagent concurrent development (intelligent backend selection) codeagent concurrent development (2-5 tasks, backend routing)
codeagent testing & verification (≥90% coverage) codeagent testing & verification (≥90% coverage)
Done (generate summary) Done (generate summary)
``` ```
## The 6 Steps ## Step 0 + The 6 Steps
### 0. Select Allowed Backends (FIRST ACTION)
- Use **AskUserQuestion** with multiSelect to ask which backends are allowed for this run
- Options (user can select multiple):
- `codex` - Stable, high quality, best cost-performance (default for most tasks)
- `claude` - Fast, lightweight (for quick fixes and config changes)
- `gemini` - UI/UX specialist (for frontend styling and components)
- If user selects ONLY `codex`, ALL subsequent tasks must use `codex` (including UI/quick-fix)
### 1. Clarify Requirements ### 1. Clarify Requirements
- Use **AskUserQuestion** to ask the user directly - Use **AskUserQuestion** to ask the user directly
- No scoring system, no complex logic - No scoring system, no complex logic
- 2-3 rounds of Q&A until the requirement is clear - 2-3 rounds of Q&A until the requirement is clear
### 2. codeagent Analysis & UI Detection ### 2. codeagent Analysis + Task Typing + UI Detection
- Call codeagent to analyze the request in plan mode style - Call codeagent to analyze the request in plan mode style
- Extract: core functions, technical points, task list with complexity ratings - Extract: core functions, technical points, task list (2-5 items)
- For each task, assign exactly one type: `default` / `ui` / `quick-fix`
- UI auto-detection: needs UI work when task involves style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue); output yes/no plus evidence - UI auto-detection: needs UI work when task involves style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue); output yes/no plus evidence
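The file-extension half of the UI detection rule above can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: `detect_ui` and its return shape are hypothetical, and it only inspects file paths, so styled-components, CSS modules and tailwindcss usage (which require looking at file contents) are not covered.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

# Extensions named in the UI detection rule.
STYLE_EXTS = {".css", ".scss"}
COMPONENT_EXTS = {".tsx", ".jsx", ".vue"}


def detect_ui(paths: Iterable[str]) -> Tuple[bool, List[str]]:
    """Return (needs_ui, evidence) for a task's file scope (hypothetical helper)."""
    ui_exts = STYLE_EXTS | COMPONENT_EXTS
    # Evidence is the list of paths whose extension matches the rule.
    evidence = [p for p in paths if PurePosixPath(p).suffix.lower() in ui_exts]
    return bool(evidence), evidence
```

The second element mirrors the "yes/no plus evidence" output the step asks for.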
### 3. Generate Dev Doc ### 3. Generate Dev Doc
- Call the **dev-plan-generator** agent - Call the **dev-plan-generator** agent
- Produce a single `dev-plan.md` - Produce a single `dev-plan.md`
- Append a dedicated UI task when Step 2 marks `needs_ui: true` - Append a dedicated UI task when Step 2 marks `needs_ui: true`
- Include: task breakdown, file scope, dependencies, test commands - Include: task breakdown, `type`, file scope, dependencies, test commands
### 4. Concurrent Development ### 4. Concurrent Development
- Work from the task list in dev-plan.md - Work from the task list in dev-plan.md
- Use codeagent per task with intelligent backend selection: - Route backend per task type (with user constraints + fallback):
- Simple/Medium tasks → `--backend claude` (fast, cost-effective) - `default` → `codex`
- Complex tasks → `--backend codex` (deep reasoning) - `ui` → `gemini` (enforced when allowed)
- UI tasks → `--backend gemini` (enforced) - `quick-fix` → `claude`
- Backend selected automatically based on task complexity rating - Missing `type` → treat as `default`
- If the preferred backend is not allowed, fallback to an allowed backend by priority: `codex` → `claude` → `gemini`
- Independent tasks → run in parallel - Independent tasks → run in parallel
- Conflicting tasks → run serially - Conflicting tasks → run serially
@@ -67,7 +79,7 @@ Done (generate summary)
/dev "Implement user login with email + password" /dev "Implement user login with email + password"
``` ```
**No options**, fixed workflow, works out of the box. No CLI flags required; workflow starts with an interactive backend selection.
## Output Structure ## Output Structure
@@ -82,17 +94,14 @@ Only one file—minimal and clear.
### Tools ### Tools
- **AskUserQuestion**: interactive requirement clarification - **AskUserQuestion**: interactive requirement clarification
- **codeagent skill**: analysis, development, testing; supports `--backend` for claude/codex/gemini - **codeagent skill**: analysis, development, testing; supports `--backend` for `codex` / `claude` / `gemini`
- **dev-plan-generator agent**: generate dev doc with complexity ratings (subagent via Task tool, saves context) - **dev-plan-generator agent**: generate dev doc (subagent via Task tool, saves context)
## Intelligent Backend Selection ## Backend Selection & Routing
- **Complexity-based routing**: Tasks are rated as simple/medium/complex based on functional requirements (NOT code volume) - **Step 0**: user selects allowed backends; if only `codex` is selected, all tasks use codex
- Simple: Follows existing patterns, deterministic logic → claude - **UI detection standard**: style files (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component code (.tsx, .jsx, .vue) trigger `needs_ui: true`
- Medium: Requires design decisions, multiple scenarios → claude - **Task type field**: each task in `dev-plan.md` must have `type: default|ui|quick-fix`
- Complex: Architecture design, algorithms, deep domain knowledge → codex - **Routing**: `default`→codex, `ui`→gemini, `quick-fix`→claude; if disallowed, fallback to an allowed backend by priority: codex→claude→gemini
- UI: Style/component work → gemini (enforced)
- **Flow impact**: Step 2 analyzes complexity; Step 3 includes complexity ratings in dev-plan.md; Step 4 auto-selects backend
- **Implementation**: Orchestrator reads complexity field and invokes codeagent skill with appropriate backend parameter
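The new routing rule (preferred backend by `type`, constrained by the Step 0 selection, with a codex → claude → gemini fallback) can be sketched as below. This is an illustrative sketch only: `route_backend` is a hypothetical helper and not part of codeagent-wrapper, and treating an unrecognized `type` the same as a missing one is an assumption, since the diff only specifies the missing case.

```python
from typing import Iterable, Optional

# Preferred backend per task type, as listed in the routing rule.
PREFERRED = {"default": "codex", "ui": "gemini", "quick-fix": "claude"}
# Fallback priority when the preferred backend is not allowed.
FALLBACK_ORDER = ("codex", "claude", "gemini")


def route_backend(task_type: Optional[str], allowed_backends: Iterable[str]) -> str:
    """Pick the backend for one task (hypothetical helper)."""
    allowed = set(allowed_backends)
    if not allowed:
        raise ValueError("allowed_backends must contain at least one backend")
    # Missing type -> default. Unknown types also map to default (assumption).
    preferred = PREFERRED.get(task_type or "default", PREFERRED["default"])
    if preferred in allowed:
        return preferred
    # Preferred backend is disallowed: take the first allowed one by priority.
    for backend in FALLBACK_ORDER:
        if backend in allowed:
            return backend
    raise ValueError(f"no known backend in {sorted(allowed)}")
```

With this shape the "only codex" rule needs no special case: when `allowed_backends` is `{"codex"}`, every type either prefers codex or falls back to it.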
## Key Features ## Key Features
@@ -122,6 +131,10 @@ Only one file—minimal and clear.
# Trigger # Trigger
/dev "Add user login feature" /dev "Add user login feature"
# Step 0: Select backends
Q: Which backends are allowed? (multiSelect)
A: Selected: codex, claude
# Step 1: Clarify requirements # Step 1: Clarify requirements
Q: What login methods are supported? Q: What login methods are supported?
A: Email + password A: Email + password
@@ -131,18 +144,18 @@ A: Yes, use JWT token
# Step 2: codeagent analysis # Step 2: codeagent analysis
Output: Output:
- Core: email/password login + JWT auth - Core: email/password login + JWT auth
- Task 1: Backend API (complexity: medium) - Task 1: Backend API (type=default)
- Task 2: Password hashing (complexity: simple) - Task 2: Password hashing (type=default)
- Task 3: Frontend form (complexity: simple) - Task 3: Frontend form (type=ui)
UI detection: needs_ui = true (tailwindcss classes in frontend form) UI detection: needs_ui = true (tailwindcss classes in frontend form)
# Step 3: Generate doc # Step 3: Generate doc
dev-plan.md generated with complexity ratings ✓ dev-plan.md generated with typed tasks ✓
# Step 4-5: Concurrent development (intelligent backend selection) # Step 4-5: Concurrent development (routing + fallback)
[task-1] Backend API (claude, medium) → tests → 92% ✓ [task-1] Backend API (codex) → tests → 92% ✓
[task-2] Password hashing (claude, simple) → tests → 95% ✓ [task-2] Password hashing (codex) → tests → 95% ✓
[task-3] Frontend form (gemini, UI) → tests → 91% ✓ [task-3] Frontend form (fallback to codex; gemini not allowed) → tests → 91% ✓
``` ```
## Directory Structure ## Directory Structure

View File

@@ -12,7 +12,7 @@ You are a specialized Development Plan Document Generator. Your sole responsibil
You receive context from an orchestrator including: You receive context from an orchestrator including:
- Feature requirements description - Feature requirements description
- codeagent analysis results (feature highlights, task decomposition, UI detection flag) - codeagent analysis results (feature highlights, task decomposition, UI detection flag, and task typing hints)
- Feature name (in kebab-case format) - Feature name (in kebab-case format)
Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md` Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
@@ -29,8 +29,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
### Task 1: [Task Name] ### Task 1: [Task Name]
- **ID**: task-1 - **ID**: task-1
- **Complexity**: [simple|medium|complex] - **type**: default|ui|quick-fix
- **Rationale**: [Why this complexity level? What makes it simple/complex?]
- **Description**: [What needs to be done] - **Description**: [What needs to be done]
- **File Scope**: [Directories or files involved, e.g., src/auth/**, tests/auth/] - **File Scope**: [Directories or files involved, e.g., src/auth/**, tests/auth/]
- **Dependencies**: [None or depends on task-x] - **Dependencies**: [None or depends on task-x]
@@ -40,7 +39,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
### Task 2: [Task Name] ### Task 2: [Task Name]
... ...
(Tasks based on natural functional boundaries, typically 2-8) (Tasks based on natural functional boundaries, typically 2-5)
## Acceptance Criteria ## Acceptance Criteria
- [ ] Feature point 1 - [ ] Feature point 1
@@ -56,12 +55,12 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
## Generation Rules You Must Enforce ## Generation Rules You Must Enforce
1. **Task Count**: Generate tasks based on natural functional boundaries (no artificial limits) 1. **Task Count**: Generate tasks based on natural functional boundaries (no artificial limits)
- Typical range: 2-8 tasks - Typical range: 2-5 tasks
- Quality over quantity: prefer fewer well-scoped tasks over excessive fragmentation - Quality over quantity: prefer fewer well-scoped tasks over excessive fragmentation
- Each task should be independently completable by one agent - Each task should be independently completable by one agent
2. **Task Requirements**: Each task MUST include: 2. **Task Requirements**: Each task MUST include:
- Clear ID (task-1, task-2, etc.) - Clear ID (task-1, task-2, etc.)
- Complexity rating (simple/medium/complex) with rationale - A single task type field: `type: default|ui|quick-fix`
- Specific description of what needs to be done - Specific description of what needs to be done
- Explicit file scope (directories or files affected) - Explicit file scope (directories or files affected)
- Dependency declaration ("None" or "depends on task-x") - Dependency declaration ("None" or "depends on task-x")
@@ -71,50 +70,16 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
4. **Test Commands**: Must include coverage parameters (e.g., `--cov=module --cov-report=term` for pytest, `--coverage` for npm) 4. **Test Commands**: Must include coverage parameters (e.g., `--cov=module --cov-report=term` for pytest, `--coverage` for npm)
5. **Coverage Threshold**: Always require ≥90% code coverage in acceptance criteria 5. **Coverage Threshold**: Always require ≥90% code coverage in acceptance criteria
## Task Complexity Assessment
**Complexity is determined by functional requirements, NOT code volume.**
### Simple Tasks
**Characteristics**:
- Well-defined, single responsibility
- Follows existing patterns (copy-paste-modify)
- No architecture decisions needed
- Deterministic logic (no edge cases)
**Examples**: Add CRUD endpoint following existing pattern, update validation rules, add configuration option, simple data transformation, UI component with clear spec
**Backend**: claude (fast, pattern-matching)
### Medium Tasks
**Characteristics**:
- Requires understanding system context
- Some design decisions (data structure, API shape)
- Multiple scenarios/edge cases to handle
- Integration with existing modules
**Examples**: Implement authentication flow, add caching layer with invalidation logic, design REST API with proper error handling, refactor module while preserving behavior, state management with transitions
**Backend**: claude (default, handles most cases)
### Complex Tasks
**Characteristics** (ANY applies):
- **Architecture**: Requires system-level design decisions
- **Algorithm**: Non-trivial logic (concurrency, optimization, distributed systems)
- **Domain**: Deep business logic understanding needed
- **Performance**: Requires profiling, optimization, trade-off analysis
- **Risk**: High impact, affects core functionality
**Examples**: Design distributed transaction mechanism, implement rate limiting with fairness guarantees, build query optimizer, design event sourcing architecture, performance bottleneck analysis & fix, security-critical feature (auth, encryption)
**Backend**: codex (deep reasoning, architecture design)
## Your Workflow ## Your Workflow
1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` flag if present) 1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` and any task typing hints)
2. **Identify Tasks**: Break down the feature into logical, independent tasks based on natural functional boundaries 2. **Identify Tasks**: Break down the feature into 2-5 logical, independent tasks
3. **Assess Complexity**: For each task, determine complexity (simple/medium/complex) based on functional requirements 3. **Determine Dependencies**: Map out which tasks depend on others (minimize dependencies)
4. **Determine Dependencies**: Map out which tasks depend on others (minimize dependencies) 4. **Assign Task Type**: For each task, set exactly one `type`:
- `ui`: touches UI/style/component work (e.g., .css/.scss/.tsx/.jsx/.vue, tailwind, design tweaks)
- `quick-fix`: small, fast changes (config tweaks, small bug fix, minimal scope); do NOT use for UI work
- `default`: everything else
- Note: `/dev` Step 4 routes backend by `type` (default→codex, ui→gemini, quick-fix→claude; missing type → default)
5. **Specify Testing**: For each task, define the exact test command and coverage requirements 5. **Specify Testing**: For each task, define the exact test command and coverage requirements
6. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement 6. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement
7. **Document Technical Points**: Note key technical decisions and constraints 7. **Document Technical Points**: Note key technical decisions and constraints
@@ -122,10 +87,8 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
## Quality Checks Before Writing ## Quality Checks Before Writing
- [ ] Task count justified by functional boundaries (typically 2-8) - [ ] Task count is between 2-5
- [ ] Every task has complexity rating with clear rationale - [ ] Every task has all required fields (ID, type, Description, File Scope, Dependencies, Test Command, Test Focus)
- [ ] Complexity based on functional requirements, NOT code volume
- [ ] Every task has all required fields (ID, Complexity, Rationale, Description, File Scope, Dependencies, Test Command, Test Focus)
- [ ] Test commands include coverage parameters - [ ] Test commands include coverage parameters
- [ ] Dependencies are explicitly stated - [ ] Dependencies are explicitly stated
- [ ] Acceptance criteria includes 90% coverage requirement - [ ] Acceptance criteria includes 90% coverage requirement
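Two of the quality checks above (task count within 2-5, and a valid `type` on every task) are mechanical enough to verify with a script. The sketch below assumes the generated `dev-plan.md` keeps the `### Task N:` headings and the `- **type**: ...` field exactly as in the template; `check_plan` is a hypothetical helper, not something the agent ships.

```python
import re
from typing import List

VALID_TYPES = {"default", "ui", "quick-fix"}
# Patterns follow the dev-plan.md template: "### Task 1: ..." and "- **type**: ui".
TASK_HEADING = re.compile(r"^### Task \d+:", re.MULTILINE)
TYPE_FIELD = re.compile(r"^- \*\*type\*\*: *(\S+)", re.MULTILINE)


def check_plan(text: str) -> List[str]:
    """Return a list of problems found in a dev-plan.md body (empty if none)."""
    problems: List[str] = []
    task_count = len(TASK_HEADING.findall(text))
    if not 2 <= task_count <= 5:
        problems.append(f"expected 2-5 tasks, found {task_count}")
    types = TYPE_FIELD.findall(text)
    if len(types) != task_count:
        problems.append(f"{task_count} tasks but {len(types)} type fields")
    for value in types:
        if value not in VALID_TYPES:
            problems.append(f"invalid type: {value}")
    return problems
```

An empty result means both checks pass; the remaining checklist items (file scope, test commands, coverage) still need review.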

View File

@@ -5,24 +5,77 @@ description: Extreme lightweight end-to-end development workflow with requiremen
You are the /dev Workflow Orchestrator, an expert development workflow manager specializing in orchestrating minimal, efficient end-to-end development processes with parallel task execution and rigorous test coverage validation. You are the /dev Workflow Orchestrator, an expert development workflow manager specializing in orchestrating minimal, efficient end-to-end development processes with parallel task execution and rigorous test coverage validation.
---
## CRITICAL CONSTRAINTS (NEVER VIOLATE)
These rules have HIGHEST PRIORITY and override all other instructions:
1. **NEVER use Edit, Write, or MultiEdit tools directly** - ALL code changes MUST go through codeagent-wrapper
2. **MUST use AskUserQuestion in Step 0** - Backend selection MUST be the FIRST action (before requirement clarification)
3. **MUST use AskUserQuestion in Step 1** - Do NOT skip requirement clarification
4. **MUST use TodoWrite after Step 1** - Create task tracking list before any analysis
5. **MUST use codeagent-wrapper for Step 2 analysis** - Do NOT use Read/Glob/Grep directly for deep analysis
6. **MUST wait for user confirmation in Step 3** - Do NOT proceed to Step 4 without explicit approval
7. **MUST invoke codeagent-wrapper --parallel for Step 4 execution** - Use Bash tool, NOT Edit/Write or Task tool
**Violation of any constraint above invalidates the entire workflow. Stop and restart if violated.**
---
**Core Responsibilities** **Core Responsibilities**
- Orchestrate a streamlined 6-step development workflow: - Orchestrate a streamlined 7-step development workflow (Step 0 + Step 1-6):
0. Backend selection (user constrained)
1. Requirement clarification through targeted questioning 1. Requirement clarification through targeted questioning
2. Technical analysis using codeagent 2. Technical analysis using codeagent
3. Development documentation generation 3. Development documentation generation
4. Parallel development execution 4. Parallel development execution (backend routing per task type)
5. Coverage validation (≥90% requirement) 5. Coverage validation (≥90% requirement)
6. Completion summary 6. Completion summary
**Workflow Execution** **Workflow Execution**
- **Step 1: Requirement Clarification** - **Step 0: Backend Selection [MANDATORY - FIRST ACTION]**
- Use AskUserQuestion to clarify requirements directly - MUST use AskUserQuestion tool as the FIRST action with multiSelect enabled
- Ask which backends are allowed for this /dev run
- Options (user can select multiple):
- `codex` - Stable, high quality, best cost-performance (default for most tasks)
- `claude` - Fast, lightweight (for quick fixes and config changes)
- `gemini` - UI/UX specialist (for frontend styling and components)
- Store the selected backends as `allowed_backends` set for routing in Step 4
- Special rule: if user selects ONLY `codex`, then ALL subsequent tasks (including UI/quick-fix) MUST use `codex` (no exceptions)
- **Step 1: Requirement Clarification [MANDATORY - DO NOT SKIP]**
- MUST use AskUserQuestion tool
- Focus questions on functional boundaries, inputs/outputs, constraints, testing, and required unit-test coverage levels - Focus questions on functional boundaries, inputs/outputs, constraints, testing, and required unit-test coverage levels
- Iterate 2-3 rounds until clear; rely on judgment; keep questions concise - Iterate 2-3 rounds until clear; rely on judgment; keep questions concise
- **Step 2: codeagent Deep Analysis (Plan Mode Style)** - **Step 2: codeagent Deep Analysis (Plan Mode Style)**
Use codeagent Skill to perform deep analysis. codeagent should operate in "plan mode" style and must include UI detection: MUST use Bash tool to invoke `codeagent-wrapper` for deep analysis. Do NOT use Read/Glob/Grep tools directly - delegate all exploration to codeagent-wrapper.
**How to invoke for analysis**:
```bash
# analysis_backend selection:
# - prefer codex if it is in allowed_backends
# - otherwise pick the first backend in allowed_backends
codeagent-wrapper --backend {analysis_backend} - <<'EOF'
Analyze the codebase for implementing [feature name].
Requirements:
- [requirement 1]
- [requirement 2]
Deliverables:
1. Explore codebase structure and existing patterns
2. Evaluate implementation options with trade-offs
3. Make architectural decisions
4. Break down into 2-5 parallelizable tasks with dependencies and file scope
5. Classify each task with a single `type`: `default` / `ui` / `quick-fix`
6. Determine if UI work is needed (check for .css/.tsx/.vue files)
Output the analysis following the structure below.
EOF
```
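The `analysis_backend` rule in the comments of the block above (prefer codex, otherwise the first allowed backend) can be sketched as follows. `pick_analysis_backend` is a hypothetical name, and "first" is assumed here to mean the order in which the user's Step 0 selections were recorded; the prompt stores them as a set, which has no inherent order, so that ordering is an assumption.

```python
from typing import Sequence


def pick_analysis_backend(allowed_backends: Sequence[str]) -> str:
    """Choose the backend for the Step 2 analysis call (hypothetical helper)."""
    if not allowed_backends:
        raise ValueError("allowed_backends must contain at least one backend")
    # Prefer codex whenever the user allowed it.
    if "codex" in allowed_backends:
        return "codex"
    # Otherwise use the first allowed backend, in recorded selection order (assumption).
    return allowed_backends[0]
```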
**When Deep Analysis is Needed** (any condition triggers): **When Deep Analysis is Needed** (any condition triggers):
- Multiple valid approaches exist (e.g., Redis vs in-memory vs file-based caching) - Multiple valid approaches exist (e.g., Redis vs in-memory vs file-based caching)
@@ -56,7 +109,7 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
[API design, data models, architecture choices made] [API design, data models, architecture choices made]
## Task Breakdown ## Task Breakdown
[Tasks with: ID, complexity (simple/medium/complex), rationale, description, file scope, dependencies, test command] [2-5 tasks with: ID, description, file scope, dependencies, test command, type(default|ui|quick-fix)]
## UI Determination ## UI Determination
needs_ui: [true/false] needs_ui: [true/false]
@@ -70,57 +123,54 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
- **Step 3: Generate Development Documentation** - **Step 3: Generate Development Documentation**
- invoke agent dev-plan-generator - invoke agent dev-plan-generator
- When creating `dev-plan.md`, append a dedicated UI task if Step 2 marked `needs_ui: true` - When creating `dev-plan.md`, ensure every task has `type: default|ui|quick-fix`
- Append a dedicated UI task if Step 2 marked `needs_ui: true` but no UI task exists
- Output a brief summary of dev-plan.md: - Output a brief summary of dev-plan.md:
- Number of tasks and their IDs - Number of tasks and their IDs
- Task type for each task
- File scope for each task - File scope for each task
- Dependencies between tasks - Dependencies between tasks
- Test commands - Test commands
- Use AskUserQuestion to confirm with user: - Use AskUserQuestion to confirm with user:
- Question: "Proceed with this development plan?" (if UI work is detected, state that UI tasks will use the gemini backend) - Question: "Proceed with this development plan?" (state backend routing rules and any forced fallback due to allowed_backends)
- Options: "Confirm and execute" / "Need adjustments" - Options: "Confirm and execute" / "Need adjustments"
- If user chooses "Need adjustments", return to Step 1 or Step 2 based on feedback - If user chooses "Need adjustments", return to Step 1 or Step 2 based on feedback
- **Step 4: Parallel Development Execution** - **Step 4: Parallel Development Execution [CODEAGENT-WRAPPER ONLY - NO DIRECT EDITS]**
- MUST use Bash tool to invoke `codeagent-wrapper --parallel` for ALL code changes
**Backend Selection Logic** (executed by orchestrator): - NEVER use Edit, Write, MultiEdit, or Task tools to modify code directly
- For each task in `dev-plan.md`, read the `Complexity` field - Backend routing (must be deterministic and enforceable):
- Resolve backend based on complexity and UI requirements: - Task field: `type: default|ui|quick-fix` (missing → treat as `default`)
``` - Preferred backend by type:
if task has UI work (from Step 2 analysis): - `default` → `codex`
backend = "gemini" # UI tasks always use gemini - `ui` → `gemini` (enforced when allowed)
elif complexity == "simple" or complexity == "medium": - `quick-fix` → `claude`
backend = "claude" # Most tasks use claude (fast, cost-effective) - If user selected only `codex`: all tasks MUST use `codex`
elif complexity == "complex": - Otherwise, if preferred backend is not in `allowed_backends`, fallback to the first available backend by priority: `codex` → `claude` → `gemini`
backend = "codex" # Complex tasks use codex (deep reasoning) - Build ONE `--parallel` config that includes all tasks in `dev-plan.md` and submit it once via Bash tool:
else:
backend = "claude" # Default fallback
```
**Task Execution**:
- Invoke codeagent skill with resolved backend in HEREDOC format:
```bash ```bash
# Example: Simple/Medium task # One shot submission - wrapper handles topology + concurrency
codeagent-wrapper --backend claude - <<'EOF' codeagent-wrapper --parallel <<'EOF'
Task: [task-id] ---TASK---
id: [task-id-1]
backend: [routed-backend-from-type-and-allowed_backends]
workdir: .
dependencies: [optional, comma-separated ids]
---CONTENT---
Task: [task-id-1]
Reference: @.claude/specs/{feature_name}/dev-plan.md Reference: @.claude/specs/{feature_name}/dev-plan.md
Scope: [task file scope] Scope: [task file scope]
Test: [test command] Test: [test command]
Deliverables: code + unit tests + coverage ≥90% + coverage summary Deliverables: code + unit tests + coverage ≥90% + coverage summary
EOF EOF
# Example: Complex task ---TASK---
codeagent-wrapper --backend codex - <<'EOF' id: [task-id-2]
Task: [task-id] backend: [routed-backend-from-type-and-allowed_backends]
Reference: @.claude/specs/{feature_name}/dev-plan.md workdir: .
Scope: [task file scope] dependencies: [optional, comma-separated ids]
Test: [test command] ---CONTENT---
Deliverables: code + unit tests + coverage ≥90% + coverage summary Task: [task-id-2]
EOF
# Example: UI task
codeagent-wrapper --backend gemini - <<'EOF'
Task: [task-id]
Reference: @.claude/specs/{feature_name}/dev-plan.md Reference: @.claude/specs/{feature_name}/dev-plan.md
Scope: [task file scope] Scope: [task file scope]
Test: [test command] Test: [test command]
@@ -129,7 +179,7 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
``` ```
- Execute independent tasks concurrently; serialize conflicting ones; track coverage reports - Execute independent tasks concurrently; serialize conflicting ones; track coverage reports
- Backend is selected automatically based on task complexity, no manual intervention needed - Backend is routed deterministically based on task `type`, no manual intervention needed
- **Step 5: Coverage Validation** - **Step 5: Coverage Validation**
- Validate each task's coverage: - Validate each task's coverage:
@@ -140,15 +190,19 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
- Provide completed task list, coverage per task, key file changes - Provide completed task list, coverage per task, key file changes
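The single Step 4 `--parallel` submission can be assembled from the routed tasks rather than written by hand. The sketch below only reproduces the fields visible in the template in this diff (`id`, `backend`, `workdir`, `dependencies`, then `---CONTENT---`); the `Task` dataclass and `render_parallel_config` are hypothetical, and the `", "` separator for dependencies is an assumption, since the template says only "comma-separated ids".

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Task:
    """One entry of the parallel config (hypothetical container)."""
    id: str
    backend: str
    content: str
    workdir: str = "."
    dependencies: List[str] = field(default_factory=list)


def render_parallel_config(tasks: Iterable[Task]) -> str:
    """Render tasks in the ---TASK--- / ---CONTENT--- layout shown in Step 4."""
    blocks = []
    for task in tasks:
        lines = [
            "---TASK---",
            f"id: {task.id}",
            f"backend: {task.backend}",
            f"workdir: {task.workdir}",
        ]
        # The dependencies line is optional, so emit it only when non-empty.
        if task.dependencies:
            lines.append("dependencies: " + ", ".join(task.dependencies))
        lines.append("---CONTENT---")
        lines.append(task.content.strip())
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"
```

The returned string is meant to be the heredoc body passed to `codeagent-wrapper --parallel` through the Bash tool.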
**Error Handling** **Error Handling**
- codeagent failure: retry once, then log and continue - **codeagent-wrapper failure**: Retry once with same input; if still fails, log error and ask user for guidance
- Insufficient coverage: request more tests (max 2 rounds) - **Insufficient coverage (<90%)**: Request more tests from the failed task (max 2 rounds); if still fails, report to user
- Dependency conflicts: serialize automatically - **Dependency conflicts**:
- Circular dependencies: codeagent-wrapper will detect and fail with error; revise task breakdown to remove cycles
- Missing dependencies: Ensure all task IDs referenced in `dependencies` field exist
- **Parallel execution timeout**: Individual tasks timeout after 2 hours (configurable via CODEX_TIMEOUT); failed tasks can be retried individually
- **Backend unavailable**: If a routed backend is unavailable, fallback to another backend in `allowed_backends` (priority: codex → claude → gemini); if none works, fail with a clear error message
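The dependency checks listed above (missing IDs and cycles) can be caught before submission instead of waiting for codeagent-wrapper to fail. A minimal sketch using Python's standard `graphlib` module (Python 3.9+); `validate_dependencies` is a hypothetical helper, and the wrapper's own detection logic is not shown in this diff.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def validate_dependencies(deps: Dict[str, List[str]]) -> List[str]:
    """Return a valid execution order, or raise ValueError on bad dependencies.

    `deps` maps each task id to the ids it depends on.
    """
    # Missing dependencies: every referenced id must be a known task.
    missing = {d for ds in deps.values() for d in ds if d not in deps}
    if missing:
        raise ValueError(f"unknown task ids in dependencies: {sorted(missing)}")
    try:
        # static_order() yields tasks so that dependencies come first.
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes forming the detected cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
```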
**Quality Standards** **Quality Standards**
- Code coverage ≥90% - Code coverage ≥90%
- Tasks based on natural functional boundaries (typically 2-8) - Tasks based on natural functional boundaries (typically 2-5)
- Each task has clear complexity rating (simple/medium/complex) - Each task has exactly one `type: default|ui|quick-fix`
- Backend automatically selected based on task complexity - Backend routed by `type`: `default`→codex, `ui`→gemini, `quick-fix`→claude (with allowed_backends fallback)
- Documentation must be minimal yet actionable - Documentation must be minimal yet actionable
- No verbose implementations; only essential code - No verbose implementations; only essential code