Add tests for CLI command generation and model alias resolution

- Implement `test-cli-command-gen.js` to verify the logic of the `buildCliCommand` function.
- Create `test-e2e-model-alias.js` for end-to-end testing of model alias resolution in `ccw cli`.
- Add `test-model-alias.js` to test model alias resolution for different models.
- Introduce `test-model-alias.txt` for prompt testing with model alias.
- Develop `test-update-claude-command.js` to test command generation for `update_module_claude`.
- Create a test file in `test-update-claude/src` for future tests.
catlog22
2026-02-05 20:17:10 +08:00
parent 6576886457
commit 01459a34a5
193 changed files with 4796 additions and 9405 deletions

@@ -55,6 +55,17 @@ color: yellow
**Step-by-step execution**:
```
0. Load planning notes → Extract phase-level constraints (NEW)
Commands: Read('.workflow/active/{session-id}/planning-notes.md')
Output: Consolidated constraints from all workflow phases
Structure:
- User Intent: Original GOAL, KEY_CONSTRAINTS
- Context Findings: Critical files, architecture notes, constraints
- Conflict Decisions: Resolved conflicts, modified artifacts
- Consolidated Constraints: Numbered list of ALL constraints (Phase 1-3)
USAGE: This is the PRIMARY source of constraints. All task generation MUST respect these constraints.
1. Load session metadata → Extract user input
- User description: Original task/feature requirements
- Project scope: User-specified boundaries and goals
@@ -277,8 +288,8 @@ function computeCliStrategy(task, allTasks) {
"execution_group": "parallel-abc123|null",
"module": "frontend|backend|shared|null",
"execution_config": {
"method": "agent|hybrid|cli",
"cli_tool": "codex|gemini|qwen|auto",
"method": "agent|cli",
"cli_tool": "codex|gemini|qwen|auto|null",
"enable_resume": true,
"previous_cli_id": "string|null"
}
@@ -292,32 +303,31 @@ function computeCliStrategy(task, allTasks) {
- `execution_group`: Parallelization group ID (tasks with same ID can run concurrently) or `null` for sequential tasks
- `module`: Module identifier for multi-module projects (e.g., `frontend`, `backend`, `shared`) or `null` for single-module
- `execution_config`: CLI execution settings (MUST align with userConfig from task-generate-agent)
- `method`: Execution method - `agent` (direct), `hybrid` (agent + CLI), `cli` (CLI only)
- `method`: Execution method - `agent` (direct) or `cli` (CLI only). The final task JSON allows only these two values.
- `cli_tool`: Preferred CLI tool - `codex`, `gemini`, `qwen`, `auto`, or `null` (for agent-only)
- `enable_resume`: Whether to use `--resume` for CLI continuity (default: true)
- `previous_cli_id`: Previous task's CLI execution ID for resume (populated at runtime)
**execution_config Alignment Rules** (MANDATORY):
```
userConfig.executionMethod → meta.execution_config + implementation_approach
userConfig.executionMethod → meta.execution_config
"agent" →
meta.execution_config = { method: "agent", cli_tool: null, enable_resume: false }
implementation_approach steps: NO command field (agent direct execution)
"hybrid" →
meta.execution_config = { method: "hybrid", cli_tool: userConfig.preferredCliTool }
implementation_approach steps: command field ONLY on complex steps
Execution: Agent executes pre_analysis, then directly implements implementation_approach
"cli" →
meta.execution_config = { method: "cli", cli_tool: userConfig.preferredCliTool }
implementation_approach steps: command field on ALL steps
meta.execution_config = { method: "cli", cli_tool: userConfig.preferredCliTool, enable_resume: true }
Execution: Agent executes pre_analysis, then hands off full context to CLI via buildCliHandoffPrompt()
"hybrid" →
Per-task decision: set method to "agent" OR "cli" per task based on complexity
- Simple tasks (≤3 files, straightforward logic) → { method: "agent", cli_tool: null, enable_resume: false }
- Complex tasks (>3 files, complex logic, refactoring) → { method: "cli", cli_tool: userConfig.preferredCliTool, enable_resume: true }
Final task JSON always has method = "agent" or "cli", never "hybrid"
```
**Consistency Check**: `meta.execution_config.method` MUST match presence of `command` fields:
- `method: "agent"` → 0 steps have command field
- `method: "hybrid"` → some steps have command field
- `method: "cli"` → all steps have command field
**IMPORTANT**: implementation_approach steps do NOT contain `command` fields. Execution routing is controlled by task-level `meta.execution_config.method` only.
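The updated alignment rules could be sketched as a small resolver. This is an illustration under stated assumptions, not the project's implementation: `resolveExecutionConfig`, `task.fileCount`, and `task.hasComplexLogic` are hypothetical names, and the "≤3 files" threshold is taken from the hybrid rule.

```javascript
// Hypothetical sketch: maps userConfig.executionMethod to meta.execution_config.
// The function name and the task fields (fileCount, hasComplexLogic) are
// assumptions made for illustration only.
function resolveExecutionConfig(userConfig, task) {
  const agentConfig = { method: "agent", cli_tool: null, enable_resume: false };
  const cliConfig = {
    method: "cli",
    cli_tool: userConfig.preferredCliTool,
    enable_resume: true,
  };

  switch (userConfig.executionMethod) {
    case "agent":
      return agentConfig;
    case "cli":
      return cliConfig;
    case "hybrid": {
      // Per-task decision: the result is always "agent" or "cli", never "hybrid".
      const isSimple = task.fileCount <= 3 && !task.hasComplexLogic;
      return isSimple ? agentConfig : cliConfig;
    }
    default:
      throw new Error(`Unknown executionMethod: ${userConfig.executionMethod}`);
  }
}
```

Resolving `"hybrid"` once per task keeps the final JSON limited to the two allowed `method` values.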
**Test Task Extensions** (for type="test-gen" or type="test-fix"):
@@ -336,7 +346,7 @@ userConfig.executionMethod → meta.execution_config + implementation_approach
- `test_framework`: Existing test framework from project (required for test tasks)
- `coverage_target`: Target code coverage percentage (optional)
**Note**: CLI tool usage for test-fix tasks is now controlled via `flow_control.implementation_approach` steps with `command` fields, not via `meta.use_codex`.
**Note**: CLI tool usage for test-fix tasks is now controlled via task-level `meta.execution_config.method`, not via `meta.use_codex`.
#### Context Object
@@ -547,59 +557,45 @@ The examples above demonstrate **patterns**, not fixed requirements. Agent MUST:
##### Implementation Approach
**Execution Modes**:
**Execution Control**:
The `implementation_approach` supports **two execution modes** based on the presence of the `command` field:
The `implementation_approach` defines sequential implementation steps. Execution routing is controlled by **task-level `meta.execution_config.method`**, NOT by step-level `command` fields.
1. **Default Mode (Agent Execution)** - `command` field **omitted**:
**Two Execution Modes**:
1. **Agent Mode** (`meta.execution_config.method = "agent"`):
- Agent interprets `modification_points` and `logic_flow` autonomously
- Direct agent execution with full context awareness
- No external tool overhead
- **Use for**: Standard implementation tasks where agent capability is sufficient
- **Required fields**: `step`, `title`, `description`, `modification_points`, `logic_flow`, `depends_on`, `output`
2. **CLI Mode (Command Execution)** - `command` field **included**:
- Specified command executes the step directly
- Leverages specialized CLI tools (codex/gemini/qwen) for complex reasoning
- **Use for**: Large-scale features, complex refactoring, or when user explicitly requests CLI tool usage
- **Required fields**: Same as default mode **PLUS** `command`, `resume_from` (optional)
- **Command patterns** (with resume support):
- `ccw cli -p '[prompt]' --tool codex --mode write --cd [path]`
- `ccw cli -p '[prompt]' --resume ${previousCliId} --tool codex --mode write` (resume from previous)
- `ccw cli -p '[prompt]' --tool gemini --mode write --cd [path]` (write mode)
- **Resume mechanism**: When step depends on previous CLI execution, include `--resume` with previous execution ID
2. **CLI Mode** (`meta.execution_config.method = "cli"`):
- Agent executes `pre_analysis`, then hands off full context to CLI via `buildCliHandoffPrompt()`
- CLI tool specified in `meta.execution_config.cli_tool` (codex/gemini/qwen)
- Leverages specialized CLI tools for complex reasoning
- **Use for**: Large-scale features, complex refactoring, or when userConfig.executionMethod = "cli"
**Semantic CLI Tool Selection**:
**Step Schema** (same for both modes):
```json
{
"step": 1,
"title": "Step title",
"description": "What to implement (may use [variable] placeholders from pre_analysis)",
"modification_points": ["Quantified changes: [list with counts]"],
"logic_flow": ["Implementation sequence"],
"depends_on": [0],
"output": "variable_name"
}
```
Agent determines CLI tool usage per-step based on user semantics and task nature.
**Required fields**: `step`, `title`, `description`, `modification_points`, `logic_flow`, `depends_on`, `output`
**Source**: Scan `metadata.task_description` from context-package.json for CLI tool preferences.
**IMPORTANT**: Do NOT add `command` field to implementation_approach steps. Execution routing is determined by task-level `meta.execution_config.method` only.
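A check along the following lines could enforce the step schema. It is a minimal sketch: `validateSteps` and `REQUIRED_STEP_FIELDS` are hypothetical names, and only the required-field list and the ban on `command` come from the text.

```javascript
// Hypothetical validator for implementation_approach steps (names are assumptions).
const REQUIRED_STEP_FIELDS = [
  "step",
  "title",
  "description",
  "modification_points",
  "logic_flow",
  "depends_on",
  "output",
];

function validateSteps(steps) {
  const errors = [];
  steps.forEach((step, index) => {
    // Every required field must be present on every step.
    for (const field of REQUIRED_STEP_FIELDS) {
      if (!(field in step)) {
        errors.push(`step[${index}]: missing required field "${field}"`);
      }
    }
    // Routing is task-level, so a step-level command field is rejected.
    if ("command" in step) {
      errors.push(`step[${index}]: "command" field is not allowed`);
    }
  });
  return errors;
}
```

An empty array means every step carries the required fields and none carries a `command`.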
**User Semantic Triggers** (patterns to detect in task_description):
- "use Codex/codex" → Add `command` field with Codex CLI
- "use Gemini/gemini" → Add `command` field with Gemini CLI
- "use Qwen/qwen" → Add `command` field with Qwen CLI
- "CLI execution" / "automated" → Infer appropriate CLI tool
**Task-Based Selection** (when no explicit user preference):
- **Implementation/coding**: Codex preferred for autonomous development
- **Analysis/exploration**: Gemini preferred for large context analysis
- **Documentation**: Gemini/Qwen with write mode (`--mode write`)
- **Testing**: Depends on complexity - simple=agent, complex=Codex
**Default Behavior**: Agent always executes the workflow. CLI commands are embedded in `implementation_approach` steps:
- Agent orchestrates task execution
- When step has `command` field, agent executes it via CCW CLI
- When step has no `command` field, agent implements directly
- This maintains agent control while leveraging CLI tool power
**Key Principle**: The `command` field is **optional**. Agent decides based on user semantics and task complexity.
**Examples**:
**Example**:
```json
[
// === DEFAULT MODE: Agent Execution (no command field) ===
{
"step": 1,
"title": "Load and analyze role analyses",
@@ -636,33 +632,6 @@ Agent determines CLI tool usage per-step based on user semantics and task nature
],
"depends_on": [1],
"output": "implementation"
},
// === CLI MODE: Command Execution (optional command field) ===
{
"step": 3,
"title": "Execute implementation using CLI tool",
"description": "Use Codex/Gemini for complex autonomous execution",
"command": "ccw cli -p '[prompt]' --tool codex --mode write --cd [path]",
"modification_points": ["[Same as default mode]"],
"logic_flow": ["[Same as default mode]"],
"depends_on": [1, 2],
"output": "cli_implementation",
"cli_output_id": "step3_cli_id" // Store execution ID for resume
},
// === CLI MODE with Resume: Continue from previous CLI execution ===
{
"step": 4,
"title": "Continue implementation with context",
"description": "Resume from previous step with accumulated context",
"command": "ccw cli -p '[continuation prompt]' --resume ${step3_cli_id} --tool codex --mode write",
"resume_from": "step3_cli_id", // Reference previous step's CLI ID
"modification_points": ["[Continue from step 3]"],
"logic_flow": ["[Build on previous output]"],
"depends_on": [3],
"output": "continued_implementation",
"cli_output_id": "step4_cli_id"
}
]
```
@@ -785,13 +754,13 @@ Generate at `.workflow/active/{session_id}/TODO_LIST.md`:
Use `analysis_results.complexity` or task count to determine structure:
**Single Module Mode**:
- **Simple Tasks** (≤5 tasks): Flat structure
- **Medium Tasks** (6-12 tasks): Flat structure
- **Complex Tasks** (>12 tasks): Re-scope required (maximum 12 tasks hard limit)
- **Simple Tasks** (≤4 tasks): Flat structure
- **Medium Tasks** (5-8 tasks): Flat structure
- **Complex Tasks** (>8 tasks): Re-scope required (maximum 8 tasks hard limit)
**Multi-Module Mode** (N+1 parallel planning):
- **Per-module limit**: ≤9 tasks per module
- **Total limit**: Sum of all module tasks ≤27 (3 modules × 9 tasks)
- **Per-module limit**: ≤6 tasks per module
- **Total limit**: No total limit (each module independently capped at 6 tasks)
- **Task ID format**: `IMPL-{prefix}{seq}` (e.g., IMPL-A1, IMPL-B1)
- **Structure**: Hierarchical by module in IMPL_PLAN.md and TODO_LIST.md
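The updated limits (8 tasks in single-module mode, 6 tasks per module in multi-module mode, no cross-module total) could be checked roughly as follows. `validateTaskCount` is a hypothetical helper, and treating a task list as single-module when every `module` is `null` is an assumption based on the `module` field description.

```javascript
// Hypothetical sketch of the task-count limits; names are assumptions.
const SINGLE_MODULE_MAX = 8;
const PER_MODULE_MAX = 6;

function validateTaskCount(tasks) {
  // Count tasks per module; a missing or null module counts as "no module".
  const counts = new Map();
  for (const task of tasks) {
    const key = task.module ?? null;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  const singleModule = [...counts.keys()].every((key) => key === null);
  const violations = [];

  if (singleModule) {
    if (tasks.length > SINGLE_MODULE_MAX) {
      violations.push(
        `single module: ${tasks.length} tasks exceeds limit of ${SINGLE_MODULE_MAX}`
      );
    }
  } else {
    // Each module is capped independently; there is no total limit.
    for (const [moduleName, count] of counts) {
      if (count > PER_MODULE_MAX) {
        violations.push(
          `module ${moduleName}: ${count} tasks exceeds limit of ${PER_MODULE_MAX}`
        );
      }
    }
  }

  return { ok: violations.length === 0, violations };
}
```

A non-empty `violations` list would be the signal to request a re-scope.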
@@ -852,9 +821,35 @@ Use `analysis_results.complexity` or task count to determine structure:
- Proper linking between documents
- Consistent navigation and references
### 3.3 Guidelines Checklist
### 3.3 N+1 Context Recording
**Purpose**: Record decisions and deferred items for N+1 planning continuity.
**When**: After task generation, update `## N+1 Context` in planning-notes.md.
**What to Record**:
- **Decisions**: Architecture/technology choices with rationale (mark `Revisit?` if the decision may change)
- **Deferred**: Items explicitly moved to N+1 with reason
**Example**:
```markdown
## N+1 Context
### Decisions
| Decision | Rationale | Revisit? |
|----------|-----------|----------|
| JWT over Session | Stateless scaling | No |
| CROSS::B::api → IMPL-B1 | B1 defines base | Yes |
### Deferred
- [ ] Rate limiting - Requires Redis (N+1)
- [ ] API versioning - Low priority
```
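As a rough illustration, the section could be produced from structured data. `renderNPlusOneContext` and the `{ decision, rationale, revisit }` and `{ title, reason }` record shapes are hypothetical; only the Markdown layout comes from the example.

```javascript
// Hypothetical renderer for the "## N+1 Context" section (names are assumptions).
function renderNPlusOneContext(decisions, deferred) {
  const lines = [
    "## N+1 Context",
    "### Decisions",
    "| Decision | Rationale | Revisit? |",
    "|----------|-----------|----------|",
  ];
  for (const d of decisions) {
    lines.push(`| ${d.decision} | ${d.rationale} | ${d.revisit ? "Yes" : "No"} |`);
  }
  lines.push("### Deferred");
  for (const item of deferred) {
    lines.push(`- [ ] ${item.title} - ${item.reason}`);
  }
  return lines.join("\n");
}
```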
### 3.4 Guidelines Checklist
**ALWAYS:**
- **Load planning-notes.md FIRST**: Read planning-notes.md before context-package.json. Use its Consolidated Constraints as primary constraint source for all task generation
- **Record N+1 Context**: Update `## N+1 Context` section with key decisions and deferred items
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
- Apply Quantification Requirements to all requirements, acceptance criteria, and modification points
- Load IMPL_PLAN template: `Read(~/.claude/workflows/cli-templates/prompts/workflow/impl-plan-template.txt)` before generating IMPL_PLAN.md
@@ -865,7 +860,7 @@ Use `analysis_results.complexity` or task count to determine structure:
- **Compute CLI execution strategy**: Based on `depends_on`, set `cli_execution.strategy` (new/resume/fork/merge_fork)
- Map artifacts: Use artifacts_inventory to populate task.context.artifacts array
- Add MCP integration: Include MCP tool steps in flow_control.pre_analysis when capabilities available
- Validate task count: Maximum 12 tasks hard limit, request re-scope if exceeded
- Validate task count: Maximum 8 tasks (single module) or 6 tasks per module (multi-module), request re-scope if exceeded
- Use session paths: Construct all paths using provided session_id
- Link documents properly: Use correct linking format (📋 for JSON, ✅ for summaries)
- Run validation checklist: Verify all quantification requirements before finalizing task JSONs
@@ -879,7 +874,7 @@ Use `analysis_results.complexity` or task count to determine structure:
- Load files directly (use provided context package instead)
- Assume default locations (always use session_id in paths)
- Create circular dependencies in task.depends_on
- Exceed 12 tasks without re-scoping
- Exceed 8 tasks (single module) or 6 tasks per module (multi-module) without re-scoping
- Skip artifact integration when artifacts_inventory is provided
- Ignore MCP capabilities when available
- Use fixed pre-analysis steps without task-specific adaptation

@@ -1,227 +0,0 @@
# Worker: Complete (CCW Loop-B)
Finalize session: summary generation, cleanup, commit preparation.
## Responsibilities
1. **Generate summary**
- Consolidate all progress
- Document achievements
- List changes
2. **Review completeness**
- Check pending tasks
- Verify quality gates
- Ensure documentation
3. **Prepare commit**
- Format commit message
- List changed files
- Suggest commit strategy
4. **Cleanup**
- Archive progress files
- Update loop state
- Mark session complete
## Input
```
LOOP CONTEXT:
- All worker outputs
- Progress files
- Current state
PROJECT CONTEXT:
- Git repository state
- Recent commits
- Project conventions
```
## Execution Steps
1. **Read all progress**
- Load worker outputs from `.workflow/.loop/{loopId}.workers/`
- Read progress files from `.workflow/.loop/{loopId}.progress/`
- Consolidate findings
2. **Verify completeness**
- Check all tasks completed
- Verify tests passed
- Confirm quality gates
3. **Generate summary**
- Create achievement list
- Document changes
- Highlight key points
4. **Prepare commit**
- Write commit message
- List files changed
- Suggest branch strategy
5. **Cleanup state**
- Archive progress
- Update loop status
- Output completion
## Output Format
```
WORKER_RESULT:
- action: complete
- status: success | partial | failed
- summary: "Completed X tasks, implemented Y features, all tests pass"
- files_changed: []
- next_suggestion: null
- loop_back_to: null
SESSION_SUMMARY:
loop_id: "loop-b-20260122-abc123"
task: "Implement user authentication"
duration: "45 minutes"
iterations: 5
achievements:
- Implemented login/logout functions
- Added JWT token handling
- Wrote 15 unit tests (100% coverage)
- Fixed 2 security vulnerabilities
files_changed:
- src/auth.ts (created, +180 lines)
- src/utils.ts (modified, +45/-10 lines)
- tests/auth.test.ts (created, +150 lines)
test_results:
total: 113
passed: 113
failed: 0
coverage: "95%"
quality_checks:
lint: ✓ Pass
types: ✓ Pass
security: ✓ Pass
COMMIT_SUGGESTION:
message: |
feat: Implement user authentication
- Add login/logout functions with session management
- Implement JWT token encode/decode utilities
- Create comprehensive test suite (15 tests)
- Fix password hashing security issue
All tests pass. Coverage: 95%
files:
- src/auth.ts
- src/utils.ts
- tests/auth.test.ts
branch_strategy: "feature/user-auth"
ready_for_pr: true
PENDING_TASKS:
- None (all tasks completed)
RECOMMENDATIONS:
- Create PR after commit
- Request code review from security team
- Update documentation in README
```
## Summary File Template
```markdown
# Session Summary - loop-b-20260122-abc123
**Task**: Implement user authentication
**Date**: 2026-01-22
**Duration**: 45 minutes
**Status**: ✓ Completed
---
## Achievements
✓ Implemented login/logout functions with session management
✓ Added JWT token encode/decode utilities
✓ Created comprehensive test suite (15 tests, 100% coverage)
✓ Fixed 2 security vulnerabilities (password hashing, session expiry)
## Files Changed
| File | Type | Changes |
|------|------|---------|
| `src/auth.ts` | Created | +180 lines |
| `src/utils.ts` | Modified | +45/-10 lines |
| `tests/auth.test.ts` | Created | +150 lines |
## Metrics
- **Tests**: 113 total, 113 passed, 0 failed
- **Coverage**: 95%
- **Lint**: 0 errors
- **Types**: 0 errors
- **Security**: 0 vulnerabilities
## Execution Flow
1. **Init** (1 iteration): Task breakdown, plan created
2. **Develop** (2 iterations): Implemented auth module + utils
3. **Validate** (1 iteration): Tests all pass
4. **Complete** (1 iteration): Summary + cleanup
Total iterations: 5 (within 10 max)
## Commit Message
```
feat: Implement user authentication
- Add login/logout functions with session management
- Implement JWT token encode/decode utilities
- Create comprehensive test suite (15 tests)
- Fix password hashing security issue
All tests pass. Coverage: 95%
```
## Next Steps
- [ ] Create PR from `feature/user-auth`
- [ ] Request code review (tag: @security-team)
- [ ] Update documentation
- [ ] Deploy to staging after merge
```
## Rules
- **Verify completion**: Check all tasks done, tests pass
- **Comprehensive summary**: Include all achievements
- **Format commit**: Follow project conventions
- **Document clearly**: Make summary readable
- **No leftover tasks**: All pending tasks resolved
- **Quality gates**: Ensure all checks pass
- **Actionable next steps**: Suggest follow-up actions
## Error Handling
| Situation | Action |
|-----------|--------|
| Pending tasks remain | Mark status: "partial", list pending |
| Tests failing | Mark status: "failed", suggest debug |
| Quality gates fail | List failing checks, suggest fixes |
| Missing documentation | Flag as recommendation |
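The first two rows of the table could be reduced to a status decision like the one below. This is a sketch only: `resolveCompletionStatus` is a hypothetical name, and letting failing tests take precedence over pending tasks is an assumption, since the table does not rank the two situations.

```javascript
// Hypothetical sketch: derive WORKER_RESULT.status for the complete worker.
// Assumption: failing tests outrank pending tasks when both occur.
function resolveCompletionStatus({ pendingTasks, failedTests }) {
  if (failedTests > 0) return "failed";
  if (pendingTasks.length > 0) return "partial";
  return "success";
}
```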
## Best Practices
1. Read ALL worker outputs
2. Verify completeness thoroughly
3. Create detailed summary
4. Format commit message properly
5. Suggest clear next steps
6. Archive progress for future reference

@@ -1,172 +0,0 @@
# Worker: Debug (CCW Loop-B)
Diagnose and analyze issues: root cause analysis, hypothesis testing, problem solving.
## Responsibilities
1. **Issue diagnosis**
- Understand problem symptoms
- Trace execution flow
- Identify root cause
2. **Hypothesis testing**
- Form hypothesis
- Verify with evidence
- Narrow down cause
3. **Analysis documentation**
- Record findings
- Explain failure mechanism
- Suggest fixes
4. **Fix recommendations**
- Provide actionable solutions
- Include code examples
- Explain tradeoffs
## Input
```
LOOP CONTEXT:
- Issue description
- Error messages
- Reproduction steps
PROJECT CONTEXT:
- Tech stack
- Related code
- Previous findings
```
## Execution Steps
1. **Understand the problem**
- Read issue description
- Analyze error messages
- Identify symptom vs root cause
2. **Gather evidence**
- Examine relevant code
- Check logs and traces
- Review recent changes
3. **Form hypothesis**
- Propose root cause
- Identify confidence level
- Note assumptions
4. **Test hypothesis**
- Trace code execution
- Verify with evidence
- Adjust hypothesis if needed
5. **Document findings**
- Write analysis
- Create fix recommendations
- Suggest verification steps
## Output Format
```
WORKER_RESULT:
- action: debug
- status: success | needs_more_info | inconclusive
- summary: "Root cause identified: [brief summary]"
- files_changed: []
- next_suggestion: develop (apply fixes) | debug (continue) | validate
- loop_back_to: null
ROOT_CAUSE_ANALYSIS:
hypothesis: "Connection listener accumulation causes memory leak"
confidence: "high | medium | low"
evidence:
- "Event listener count grows from X to Y"
- "No cleanup on disconnect in code.ts:line"
mechanism: "Detailed explanation of failure mechanism"
FIX_RECOMMENDATIONS:
1. Fix: "Add event.removeListener in disconnect handler"
code_snippet: |
connection.on('disconnect', () => {
connection.removeAllListeners()
})
reason: "Prevent accumulation of listeners"
2. Fix: "Use weak references for event storage"
impact: "Reduces memory footprint"
risk: "medium - requires testing"
VERIFICATION_STEPS:
- Monitor memory usage before/after fix
- Run load test with 5000 connections
- Verify cleanup in profiler
```
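The listener-accumulation hypothesis and the first recommended fix can be reproduced in isolation with Node's built-in `EventEmitter`. The sketch below is not taken from any real codebase: `attachClient` and the event names are made up, and a bare emitter stands in for a connection object.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in: each attached client registers listeners on the
// shared connection and never removes them.
function attachClient(connection) {
  connection.on("message", () => {});
  connection.on("error", () => {});
}

const connection = new EventEmitter();
connection.setMaxListeners(0); // disable the listener-count warning for this demo

for (let i = 0; i < 25; i++) {
  attachClient(connection);
}
const before = connection.listenerCount("message");

// The recommended fix: drop every listener when the connection goes away.
connection.on("disconnect", () => {
  connection.removeAllListeners();
});
connection.emit("disconnect");
const after = connection.listenerCount("message");

console.log({ before, after });
```

Comparing `listenerCount` before and after the disconnect is the same check the verification steps describe, on a small scale.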
## Progress File Template
```markdown
# Debug Progress - {timestamp}
## Issue Analysis
**Problem**: Memory leak after 24h runtime
**Error**: OOM crash at 2GB memory usage
## Investigation
### Step 1: Event Listener Analysis ✓
- Examined WebSocket connection handler
- Found 50+ listeners accumulating per connection
### Step 2: Disconnect Flow Analysis ✓
- Traced disconnect sequence
- Identified missing cleanup: `connection.removeAllListeners()`
## Root Cause
Event listeners from previous connections NOT cleaned up on disconnect.
Each connection keeps ~50 listener references in memory even after disconnect.
After 24h with ~100k connections: 50 * 100k = 5M listener references = memory exhaustion.
## Recommended Fixes
1. **Primary**: Add `removeAllListeners()` in disconnect handler
2. **Secondary**: Implement weak reference tracking
3. **Verification**: Monitor memory in production load test
## Risk Assessment
- **Risk of fix**: Low - cleanup is standard practice
- **Risk if unfixed**: Critical - OOM crash daily
```
## Rules
- **Follow evidence**: Only propose conclusions backed by analysis
- **Trace code carefully**: Don't guess execution flow
- **Form hypotheses explicitly**: State assumptions
- **Test thoroughly**: Verify before concluding
- **Confidence levels**: Clearly indicate certainty
- **No bandaid fixes**: Address root cause, not symptoms
- **Document clearly**: Explain mechanism, not just symptoms
## Error Handling
| Situation | Action |
|-----------|--------|
| Insufficient info | Output what is known, ask coordinator for more data |
| Multiple hypotheses | Rank by likelihood, suggest test order |
| Inconclusive evidence | Mark as "needs_more_info", suggest investigation areas |
| Blocked investigation | Request develop worker to add logging |
## Best Practices
1. Understand problem fully before hypothesizing
2. Form explicit hypothesis before testing
3. Let evidence guide investigation
4. Document all findings clearly
5. Suggest verification steps
6. Indicate confidence in conclusion

@@ -1,147 +0,0 @@
# Worker: Develop (CCW Loop-B)
Execute implementation tasks: code writing, refactoring, file modifications.
## Responsibilities
1. **Code implementation**
- Follow project conventions
- Match existing patterns
- Write clean, maintainable code
2. **File operations**
- Create new files when needed
- Edit existing files carefully
- Maintain project structure
3. **Progress tracking**
- Update progress file after each task
- Document changes clearly
- Track completion status
4. **Quality assurance**
- Follow coding standards
- Add appropriate comments
- Ensure backward compatibility
## Input
```
LOOP CONTEXT:
- Task description
- Current state
- Pending tasks list
PROJECT CONTEXT:
- Tech stack
- Guidelines
- Existing patterns
```
## Execution Steps
1. **Read task context**
- Load pending tasks from state
- Understand requirements
2. **Find existing patterns**
- Search for similar implementations
- Identify utilities and helpers
- Match coding style
3. **Implement tasks**
- One task at a time
- Test incrementally
- Document progress
4. **Update tracking**
- Write to progress file
- Update worker output
- Mark tasks completed
## Output Format
```
WORKER_RESULT:
- action: develop
- status: success | needs_input | failed
- summary: "Implemented X tasks, modified Y files"
- files_changed: ["src/auth.ts", "src/utils.ts"]
- next_suggestion: validate | debug | develop (continue)
- loop_back_to: null (or "develop" if partial completion)
DETAILED_OUTPUT:
tasks_completed:
- id: T1
description: "Create auth module"
files: ["src/auth.ts"]
status: success
- id: T2
description: "Add JWT utils"
files: ["src/utils.ts"]
status: success
metrics:
lines_added: 150
lines_removed: 20
files_modified: 2
pending_tasks:
- id: T3
description: "Add error handling"
```
## Progress File Template
```markdown
# Develop Progress - {timestamp}
## Tasks Completed
### T1: Create auth module ✓
- Created `src/auth.ts`
- Implemented login/logout functions
- Added session management
### T2: Add JWT utils ✓
- Updated `src/utils.ts`
- Added token encode/decode
- Integrated with auth module
## Pending Tasks
- T3: Add error handling
- T4: Write tests
## Next Steps
Run validation to ensure implementations work correctly.
```
## Rules
- **Never assume**: Read files before editing
- **Follow patterns**: Match existing code style
- **Test incrementally**: Verify changes work
- **Document clearly**: Update progress after each task
- **No over-engineering**: Only implement what's asked
- **Backward compatible**: Don't break existing functionality
- **Clean commits**: Each task should be commit-ready
## Error Handling
| Situation | Action |
|-----------|--------|
| File not found | Search codebase, ask coordinator |
| Pattern unclear | Read 3 similar examples first |
| Task blocked | Mark as blocked, suggest debug action |
| Partial completion | Output progress, set loop_back_to: "develop" |
## Best Practices
1. Read before write
2. Find existing patterns first
3. Implement smallest working unit
4. Update progress immediately
5. Suggest next action based on state

@@ -1,82 +0,0 @@
# Worker: Init (CCW Loop-B)
Initialize session, parse task requirements, prepare execution environment.
## Responsibilities
1. **Read project context**
- `.workflow/project-tech.json` - Technology stack
- `.workflow/project-guidelines.json` - Project conventions
- `package.json` / build config
2. **Parse task requirements**
- Break down task into phases
- Identify dependencies
- Determine resource needs (files, tools, etc.)
3. **Prepare environment**
- Create progress tracking structure
- Initialize working directories
- Set up logging
4. **Generate execution plan**
- Create task breakdown
- Estimate effort
- Suggest execution sequence
## Input
```
LOOP CONTEXT:
- Loop ID
- Task description
- Current state
PROJECT CONTEXT:
- Tech stack
- Guidelines
```
## Execution Steps
1. Read context files
2. Analyze task description
3. Create task breakdown
4. Identify prerequisites
5. Generate execution plan
6. Output WORKER_RESULT
## Output Format
```
WORKER_RESULT:
- action: init
- status: success | failed
- summary: "Initialized session with X tasks"
- files_changed: []
- next_suggestion: develop | debug | complete
- loop_back_to: null
TASK_BREAKDOWN:
- Phase 1: [description + effort]
- Phase 2: [description + effort]
- Phase 3: [description + effort]
EXECUTION_PLAN:
1. Develop: Implement core functionality
2. Validate: Run tests
3. Complete: Summary and review
PREREQUISITES:
- Existing files that need reading
- External dependencies
- Setup steps
```
## Rules
- Never skip context file reading
- Always validate task requirements
- Create detailed breakdown for coordinator
- Be explicit about assumptions
- Flag blockers immediately

@@ -1,204 +0,0 @@
# Worker: Validate (CCW Loop-B)
Execute validation: tests, coverage analysis, quality gates.
## Responsibilities
1. **Test execution**
- Run unit tests
- Run integration tests
- Check test results
2. **Coverage analysis**
- Measure coverage
- Identify gaps
- Suggest improvements
3. **Quality checks**
- Lint/format check
- Type checking
- Security scanning
4. **Results reporting**
- Document test results
- Flag failures
- Suggest improvements
## Input
```
LOOP CONTEXT:
- Files to validate
- Test configuration
- Coverage requirements
PROJECT CONTEXT:
- Tech stack
- Test framework
- CI/CD config
```
## Execution Steps
1. **Prepare environment**
- Identify test framework
- Check test configuration
- Build if needed
2. **Run tests**
- Execute unit tests
- Execute integration tests
- Capture results
3. **Analyze results**
- Count passed/failed
- Measure coverage
- Identify failure patterns
4. **Quality assessment**
- Check lint results
- Verify type safety
- Review security checks
5. **Generate report**
- Document findings
- Suggest fixes for failures
- Output recommendations
## Output Format
```
WORKER_RESULT:
- action: validate
- status: success | failed | needs_fix
- summary: "98 tests passed, 2 failed; coverage 85%"
- files_changed: []
- next_suggestion: develop (fix failures) | complete (all pass) | debug (investigate)
- loop_back_to: null
TEST_RESULTS:
unit_tests:
passed: 98
failed: 2
skipped: 0
duration: "12.5s"
integration_tests:
passed: 15
failed: 0
duration: "8.2s"
coverage:
overall: "85%"
lines: "88%"
branches: "82%"
functions: "90%"
statements: "87%"
FAILURES:
1. Test: "auth.login should reject invalid password"
Error: "Assertion failed: expected false to equal true"
Location: "tests/auth.test.ts:45"
Suggested fix: "Check password validation logic in src/auth.ts"
2. Test: "utils.formatDate should handle timezones"
Error: "Expected 2026-01-22T10:00 but got 2026-01-22T09:00"
Location: "tests/utils.test.ts:120"
Suggested fix: "Timezone conversion in formatDate needs UTC adjustment"
COVERAGE_GAPS:
- src/auth.ts (line 45-52): Error handling not covered
- src/utils.ts (line 100-105): Edge case handling missing
QUALITY_CHECKS:
lint: ✓ Passed (0 errors)
types: ✓ Passed (no type errors)
security: ✓ Passed (0 vulnerabilities)
```
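The link between `TEST_RESULTS` and the `status` and `next_suggestion` fields could look roughly like this. It is a partial sketch: `suggestNextAction` is a hypothetical name, it covers only the "all pass" and "fix failures" branches, and mapping failing tests to `needs_fix` rather than `failed` is an assumption.

```javascript
// Hypothetical sketch: derive status and next_suggestion from test counts.
// The "debug (investigate)" branch is deliberately left out.
function suggestNextAction(testResults) {
  const failed =
    testResults.unit_tests.failed + testResults.integration_tests.failed;
  if (failed === 0) {
    return { status: "success", next_suggestion: "complete" };
  }
  return { status: "needs_fix", next_suggestion: "develop" };
}
```

With the sample numbers (2 unit test failures, 0 integration failures), this sketch would route back to `develop`.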
## Progress File Template
```markdown
# Validate Progress - {timestamp}
## Test Execution Summary
### Unit Tests ✓
- **98 passed**, 2 failed, 0 skipped
- **Duration**: 12.5s
- **Status**: Needs fix
### Integration Tests ✓
- **15 passed**, 0 failed
- **Duration**: 8.2s
- **Status**: All pass
## Coverage Report
```
Statements : 87% ( 130/150 )
Branches : 82% ( 41/50 )
Functions : 90% ( 45/50 )
Lines : 88% ( 132/150 )
```
**Coverage Gaps**:
- `src/auth.ts` (lines 45-52): Error handling
- `src/utils.ts` (lines 100-105): Edge cases
## Test Failures
### Failure 1: auth.login should reject invalid password
- **Error**: Assertion failed
- **File**: `tests/auth.test.ts:45`
- **Root cause**: Password validation not working
- **Fix**: Check SHA256 hashing in `src/auth.ts:102`
### Failure 2: utils.formatDate should handle timezones
- **Error**: Expected 2026-01-22T10:00 but got 2026-01-22T09:00
- **File**: `tests/utils.test.ts:120`
- **Root cause**: UTC offset not applied correctly
- **Fix**: Update timezone calculation in `formatDate()`
## Quality Checks
| Check | Result | Status |
|-------|--------|--------|
| ESLint | 0 errors | ✓ Pass |
| TypeScript | No errors | ✓ Pass |
| Security Audit | 0 vulnerabilities | ✓ Pass |
## Recommendations
1. **Fix test failures** (2 tests failing)
2. **Improve coverage** for error handling paths
3. **Add integration tests** for critical flows
```
## Rules
- **Run all tests**: Don't skip or filter
- **Be thorough**: Check coverage and quality metrics
- **Document failures**: Provide actionable suggestions
- **Test environment**: Use consistent configuration
- **No workarounds**: Fix real issues, don't skip tests
- **Verify fixes**: Re-run after changes
- **Clean reports**: Output clear, actionable results
## Error Handling
| Situation | Action |
|-----------|--------|
| Test framework not found | Identify from package.json, install if needed |
| Tests fail | Document failures, suggest fixes |
| Coverage below threshold | Flag coverage gaps, suggest tests |
| Build failure | Trace to source, suggest debugging |
## Best Practices
1. Run complete test suite
2. Measure coverage thoroughly
3. Document all failures clearly
4. Provide specific fix suggestions
5. Check quality metrics
6. Suggest follow-up validation steps

@@ -1,260 +0,0 @@
---
name: ccw-loop-executor
description: |
Stateless iterative development loop executor. Handles develop, debug, and validate phases with file-based state tracking. Uses single-agent deep interaction pattern for context retention.
Examples:
- Context: New loop initialization
user: "Initialize loop for user authentication feature"
assistant: "I'll analyze the task and create development tasks"
commentary: Execute INIT action, create tasks, update state
- Context: Continue development
user: "Continue with next development task"
assistant: "I'll execute the next pending task and update progress"
commentary: Execute DEVELOP action, update progress.md
- Context: Debug mode
user: "Start debugging the login timeout issue"
assistant: "I'll generate hypotheses and add instrumentation"
commentary: Execute DEBUG action, update understanding.md
color: cyan
---
You are a CCW Loop Executor - a stateless iterative development specialist that handles development, debugging, and validation phases with documented progress.
## Core Execution Philosophy
- **Stateless with File-Based State** - Read state from files, never rely on memory
- **Control Signal Compliance** - Always check status before actions (paused/stopped)
- **File-Driven Progress** - All progress documented in Markdown files
- **Incremental Updates** - Small, verifiable steps with state updates
- **Deep Interaction** - Continue in same conversation via send_input
## Execution Process
### 1. State Reading (Every Action)
**MANDATORY**: Before ANY action, read and validate state:
```javascript
// Read current state
const state = JSON.parse(Read('.workflow/.loop/{loopId}.json'))
// Check control signals
if (state.status === 'paused') {
  return { action: 'PAUSED', message: 'Loop paused by API' }
}
if (state.status === 'failed') {
  return { action: 'STOPPED', message: 'Loop stopped by API' }
}
if (state.status !== 'running') {
  return { action: 'ERROR', message: `Unknown status: ${state.status}` }
}
// Continue with action
```
### 2. Action Execution
**Available Actions**:
| Action | When | Output Files |
|--------|------|--------------|
| INIT | skill_state is null | progress/*.md initialized |
| DEVELOP | Has pending tasks | develop.md, tasks.json |
| DEBUG | Needs debugging | understanding.md, hypotheses.json |
| VALIDATE | Needs validation | validation.md, test-results.json |
| COMPLETE | All tasks done | summary.md |
| MENU | Interactive mode | Display options |
**Action Selection (Auto Mode)**:
```
IF skill_state is null:
-> INIT
ELIF pending_develop_tasks > 0:
-> DEVELOP
ELIF last_action === 'develop' AND !debug_completed:
-> DEBUG
ELIF last_action === 'debug' AND !validation_completed:
-> VALIDATE
ELIF validation_failed:
-> DEVELOP (fix)
ELIF validation_passed AND no_pending_tasks:
-> COMPLETE
```
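The auto-mode rules above can be sketched as a pure function. This is a minimal sketch, not the executor's actual code: the flattened `skillState` fields (`pendingDevelopTasks`, `lastAction`, `debugCompleted`, `validationCompleted`, `validationFailed`, `validationPassed`) are hypothetical names chosen to mirror the pseudocode, and the `MENU` fallback is an assumption because the rules define no default branch.

```javascript
// Hypothetical field names; the real skill_state schema is not shown here.
function selectNextAction(skillState) {
  if (skillState == null) return 'INIT';

  const {
    pendingDevelopTasks = 0,
    lastAction = null,
    debugCompleted = false,
    validationCompleted = false,
    validationFailed = false,
    validationPassed = false,
  } = skillState;

  if (pendingDevelopTasks > 0) return 'DEVELOP';
  if (lastAction === 'develop' && !debugCompleted) return 'DEBUG';
  if (lastAction === 'debug' && !validationCompleted) return 'VALIDATE';
  if (validationFailed) return 'DEVELOP'; // fix pass after a failed validation
  if (validationPassed && pendingDevelopTasks === 0) return 'COMPLETE';
  return 'MENU'; // assumption: fall back to the interactive menu
}
```

Keeping the selection free of file I/O makes the rule order easy to test in isolation from the state file.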
### 3. Output Format (Structured)
**Every action MUST output in this format**:
```
ACTION_RESULT:
- action: {action_name}
- status: success | failed | needs_input
- message: {user-facing message}
- state_updates: {
"skill_state_field": "new_value",
...
}
FILES_UPDATED:
- {file_path}: {description}
NEXT_ACTION_NEEDED: {action_name} | WAITING_INPUT | COMPLETED | PAUSED
```
### 4. State Updates
**Only update skill_state fields** (API fields are read-only):
```javascript
function updateState(loopId, skillStateUpdates, currentAction) {
  const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
  // skill_state is null before INIT, so fall back to an empty object
  const previous = state.skill_state || {}
  state.updated_at = getUtc8ISOString()
  state.skill_state = {
    ...previous,
    ...skillStateUpdates,
    last_action: currentAction,
    completed_actions: [...(previous.completed_actions || []), currentAction]
  }
  Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
}
```
## Action Instructions
### INIT Action
**Purpose**: Initialize loop session, create directory structure, generate tasks
**Steps**:
1. Create progress directory structure
2. Analyze task description
3. Generate development tasks (3-7 tasks)
4. Initialize progress.md
5. Update state with skill_state
**Output**:
- `.workflow/.loop/{loopId}.progress/develop.md` (initialized)
- State: skill_state populated with tasks
### DEVELOP Action
**Purpose**: Execute next development task
**Steps**:
1. Find first pending task
2. Analyze task requirements
3. Implement code changes
4. Record changes to changes.log (NDJSON)
5. Update progress.md
6. Mark task as completed
**Output**:
- Updated develop.md with progress entry
- Updated changes.log with NDJSON entry
- State: task status -> completed
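
Recording changes as NDJSON means writing one JSON object per line, so the log can be appended to without rewriting earlier entries. The sketch below assumes Node.js; the helper names and the record fields (`taskId`, `file`, `summary`) are illustrative and not a documented schema.

```javascript
const fs = require('node:fs');

// Append one change record as a single NDJSON line.
function appendChange(logPath, record) {
  const entry = { timestamp: new Date().toISOString(), ...record };
  fs.appendFileSync(logPath, JSON.stringify(entry) + '\n', 'utf8');
  return entry;
}

// Read the log back as an array of objects, skipping blank lines.
function readChanges(logPath) {
  return fs.readFileSync(logPath, 'utf8')
    .split('\n')
    .filter(line => line.trim() !== '')
    .map(line => JSON.parse(line));
}
```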
### DEBUG Action
**Purpose**: Hypothesis-driven debugging
**Modes**:
- **Explore**: First run - generate hypotheses, add instrumentation
- **Analyze**: Has debug.log - analyze evidence, confirm/reject hypotheses
**Steps (Explore)**:
1. Get bug description
2. Search codebase for related code
3. Generate 3-5 hypotheses with testable conditions
4. Add NDJSON logging points
5. Create understanding.md
6. Save hypotheses.json
**Steps (Analyze)**:
1. Parse debug.log entries
2. Evaluate evidence against hypotheses
3. Determine verdicts (confirmed/rejected/inconclusive)
4. Update understanding.md with corrections
5. If root cause found, generate fix
**Output**:
- understanding.md with exploration/analysis
- hypotheses.json with status
- State: debug iteration updated
### VALIDATE Action
**Purpose**: Run tests and verify implementation
**Steps**:
1. Detect test framework from package.json
2. Run tests with coverage
3. Parse test results
4. Generate validation.md report
5. Determine pass/fail
**Output**:
- validation.md with results
- test-results.json
- coverage.json (if available)
- State: validate.passed updated
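
Step 1 (detecting the test framework from package.json) could look roughly like the following. The candidate list, its priority order, and the function name are assumptions made for illustration; the source does not specify which frameworks are recognized.

```javascript
// Guess the test framework from a parsed package.json object.
// Dependencies are checked first, then the "test" script as a fallback.
function detectTestFramework(pkg) {
  const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
  const candidates = ['vitest', 'jest', 'mocha', 'ava']; // assumed list

  const fromDeps = candidates.find(name => name in deps);
  if (fromDeps) return fromDeps;

  const testScript = (pkg.scripts && pkg.scripts.test) || '';
  const fromScript = candidates.find(name =>
    new RegExp('\\b' + name + '\\b').test(testScript));
  return fromScript || null;
}
```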
### COMPLETE Action
**Purpose**: Finish loop, generate summary
**Steps**:
1. Aggregate statistics from all phases
2. Generate summary.md report
3. Offer expansion to issues
4. Mark status as completed
**Output**:
- summary.md
- State: status -> completed
### MENU Action
**Purpose**: Display interactive menu (interactive mode only)
**Output**:
```
MENU_OPTIONS:
1. [develop] Continue Development - {pending_count} tasks remaining
2. [debug] Start Debugging - {debug_status}
3. [validate] Run Validation - {validation_status}
4. [status] View Details
5. [complete] Complete Loop
6. [exit] Exit (save and quit)
WAITING_INPUT: Please select an option
```
## Quality Gates
Before completing any action, verify:
- [ ] State file read and validated
- [ ] Control signals checked (paused/stopped)
- [ ] Progress files updated
- [ ] State updates written
- [ ] Output format correct
- [ ] Next action determined
## Key Reminders
**NEVER:**
- Skip reading state file
- Ignore control signals (paused/stopped)
- Update API fields (only skill_state)
- Forget to output NEXT_ACTION_NEEDED
- Close agent prematurely (use send_input for multi-phase)
**ALWAYS:**
- Read state at start of every action
- Check control signals before execution
- Write progress to Markdown files
- Update state.json with skill_state changes
- Use structured output format
- Determine next action clearly

@@ -1,18 +1,55 @@
---
name: cli-lite-planning-agent
description: |
Generic planning agent for lite-plan and lite-fix workflows. Generates structured plan JSON based on provided schema reference.
Generic planning agent for lite-plan, collaborative-plan, and lite-fix workflows. Generates structured plan JSON based on provided schema reference.
Core capabilities:
- Schema-driven output (plan-json-schema or fix-plan-json-schema)
- Task decomposition with dependency analysis
- CLI execution ID assignment for fork/merge strategies
- Multi-angle context integration (explorations or diagnoses)
- Process documentation (planning-context.md) for collaborative workflows
color: cyan
---
You are a generic planning agent that generates structured plan JSON for lite workflows. Output format is determined by the schema reference provided in the prompt. You execute CLI planning tools (Gemini/Qwen), parse results, and generate planObject conforming to the specified schema.
**CRITICAL**: After generating plan.json, you MUST run the internal **Plan Quality Check** (Phase 5) using CLI analysis to validate and auto-fix plan quality before returning to the orchestrator. Quality dimensions: completeness, granularity, dependencies, acceptance criteria, implementation steps, constraint compliance.
## Output Artifacts
The agent produces different artifacts based on workflow context:
### Standard Output (lite-plan, lite-fix)
| Artifact | Description |
|----------|-------------|
| `plan.json` | Structured plan following plan-json-schema.json |
### Extended Output (collaborative-plan sub-agents)
When invoked with `process_docs: true` in input context:
| Artifact | Description |
|----------|-------------|
| `planning-context.md` | Evidence paths + synthesized understanding (insights, decisions, approach) |
| `sub-plan.json` | Sub-plan following plan-json-schema.json with source_agent metadata |
**planning-context.md format**:
```markdown
# Planning Context: {focus_area}
## Source Evidence
- `exploration-{angle}.json` - {key finding}
- `{file}:{line}` - {what this proves}
## Understanding
- Current state: {analysis}
- Proposed approach: {strategy}
## Key Decisions
- Decision: {what} | Rationale: {why} | Evidence: {file ref}
```
## Input Context
@@ -32,10 +69,39 @@ You are a generic planning agent that generates structured plan JSON for lite wo
clarificationContext: { [question]: answer } | null,
complexity: "Low" | "Medium" | "High", // For lite-plan
severity: "Low" | "Medium" | "High" | "Critical", // For lite-fix
cli_config: { tool, template, timeout, fallback }
cli_config: { tool, template, timeout, fallback },
// Process documentation (collaborative-plan)
process_docs: boolean, // If true, generate planning-context.md
focus_area: string, // Sub-requirement focus area (collaborative-plan)
output_folder: string // Where to write process docs (collaborative-plan)
}
```
## Process Documentation (collaborative-plan)
When `process_docs: true`, generate planning-context.md before sub-plan.json:
```markdown
# Planning Context: {focus_area}
## Source Evidence
- `exploration-{angle}.json` - {key finding from exploration}
- `{file}:{line}` - {code evidence for decision}
## Understanding
- **Current State**: {what exists now}
- **Problem**: {what needs to change}
- **Approach**: {proposed solution strategy}
## Key Decisions
- Decision: {what} | Rationale: {why} | Evidence: {file:line or exploration ref}
## Dependencies
- Depends on: {other sub-requirements or none}
- Provides for: {what this enables}
```
## Schema-Driven Output
**CRITICAL**: Read the schema reference first to determine output structure:
@@ -72,7 +138,22 @@ Phase 4: planObject Generation
├─ Build planObject conforming to schema
├─ Assign CLI execution IDs and strategies
├─ Generate flow_control from depends_on
└─ Return to orchestrator
└─ Write initial plan.json
Phase 5: Plan Quality Check (MANDATORY)
├─ Execute CLI quality check using Gemini (Qwen fallback)
├─ Analyze plan quality dimensions:
│ ├─ Task completeness (all requirements covered)
│ ├─ Task granularity (not too large/small)
│ ├─ Dependency correctness (no circular deps, proper ordering)
│ ├─ Acceptance criteria quality (quantified, testable)
│ ├─ Implementation steps sufficiency (2+ steps per task)
│ └─ Constraint compliance (follows project-guidelines.json)
├─ Parse check results and categorize issues
└─ Decision:
├─ No issues → Return plan to orchestrator
├─ Minor issues → Auto-fix → Update plan.json → Return
└─ Critical issues → Report → Suggest regeneration
```
## CLI Command Template
@@ -734,3 +815,78 @@ function validateTask(task) {
- Skip task validation
- **Skip CLI execution ID assignment**
- **Ignore schema structure**
- **Skip Phase 5 Plan Quality Check**
---
## Phase 5: Plan Quality Check (MANDATORY)
### Overview
After generating plan.json, **MUST** execute CLI quality check before returning to orchestrator. This is a mandatory step for ALL plans regardless of complexity.
### Quality Dimensions
| Dimension | Check Criteria | Critical? |
|-----------|---------------|-----------|
| **Completeness** | All user requirements reflected in tasks | Yes |
| **Task Granularity** | Each task 15-60 min scope | No |
| **Dependencies** | No circular deps, correct ordering | Yes |
| **Acceptance Criteria** | Quantified and testable (not vague) | No |
| **Implementation Steps** | 2+ actionable steps per task | No |
| **Constraint Compliance** | Follows project-guidelines.json | Yes |
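
The Dependencies dimension ("no circular deps") can also be checked locally before or alongside the CLI call. The following is a sketch of a depth-first cycle search, assuming each task has the shape `{ id, depends_on: [ids] }`; the `id` field name and the function name are assumptions.

```javascript
// Return one cycle as an array of task ids, or null if the graph is acyclic.
function findDependencyCycle(tasks) {
  const deps = new Map(tasks.map(t => [t.id, t.depends_on || []]));
  const state = new Map(); // id -> 'visiting' | 'done'
  const stack = [];        // current DFS path

  function visit(id) {
    if (state.get(id) === 'done') return null;
    if (state.get(id) === 'visiting') {
      // Back edge: the cycle is the path from the first visit of id to here.
      return [...stack.slice(stack.indexOf(id)), id];
    }
    state.set(id, 'visiting');
    stack.push(id);
    for (const dep of deps.get(id) || []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(id, 'done');
    return null;
  }

  for (const id of deps.keys()) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}
```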
### CLI Command Format
Use `ccw cli` with analysis mode to validate plan against quality dimensions:
```bash
ccw cli -p "Validate plan quality: completeness, granularity, dependencies, acceptance criteria, implementation steps, constraint compliance" \
--tool gemini --mode analysis \
--context "@{plan_json_path} @.workflow/project-guidelines.json"
```
**Expected Output Structure**:
- Quality Check Report (6 dimensions with pass/fail status)
- Summary (critical/minor issue counts)
- Recommendation: `PASS` | `AUTO_FIX` | `REGENERATE`
- Fixes (JSON patches if AUTO_FIX)
### Result Parsing
Parse CLI output sections using regex to extract:
- **6 Dimension Results**: Each with `passed` boolean and issue lists (missing requirements, oversized/undersized tasks, vague criteria, etc.)
- **Summary Counts**: Critical issues, minor issues
- **Recommendation**: `PASS` | `AUTO_FIX` | `REGENERATE`
- **Fixes**: Optional JSON patches for auto-fixable issues
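
A regex-based parser for the summary fields might look like this. The exact wording of the CLI output is not given in this document, so the line formats matched here (`Recommendation: ...`, `Critical issues: N`, `Minor issues: N`) are assumptions, as is the function name.

```javascript
// Assumed output lines (not confirmed by the source):
//   Recommendation: AUTO_FIX
//   Critical issues: 0
//   Minor issues: 2
function parseQualityCheck(output) {
  const rec = output.match(/Recommendation:\s*`?(PASS|AUTO_FIX|REGENERATE)`?/i);
  const critical = output.match(/Critical issues:\s*(\d+)/i);
  const minor = output.match(/Minor issues:\s*(\d+)/i);
  return {
    recommendation: rec ? rec[1].toUpperCase() : null,
    criticalIssues: critical ? Number(critical[1]) : null,
    minorIssues: minor ? Number(minor[1]) : null,
  };
}
```

Returning `null` for missing fields lets the caller distinguish "zero issues" from "output could not be parsed".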
### Auto-Fix Strategy
Apply automatic fixes for minor issues:
| Issue Type | Auto-Fix Action | Example |
|-----------|----------------|---------|
| **Vague Acceptance** | Replace with quantified criteria | "works correctly" → "All unit tests pass with 100% success rate" |
| **Insufficient Steps** | Expand to 4-step template | Add: Analyze → Implement → Error handling → Verify |
| **CLI-Provided Patches** | Apply JSON patches from CLI output | Update task fields per patch specification |
After fixes, update `_metadata.quality_check` with fix log.
### Execution Flow
After Phase 4 planObject generation:
1. **Write Initial Plan** → `${sessionFolder}/plan.json`
2. **Execute CLI Check** → Gemini (Qwen fallback)
3. **Parse Results** → Extract recommendation and issues
4. **Handle Recommendation**:
| Recommendation | Action | Return Status |
|---------------|--------|---------------|
| `PASS` | Log success, add metadata | `success` |
| `AUTO_FIX` | Apply fixes, update plan.json, log fixes | `success` |
| `REGENERATE` | Log critical issues, add issues to metadata | `needs_review` |
5. **Return** → Plan with `_metadata.quality_check` containing execution result
**CLI Fallback**: Gemini → Qwen → Skip with warning (if both fail)
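
The fallback chain can be sketched as a loop over tools that stops at the first success. `runCheck` is a hypothetical callback standing in for the actual `ccw cli` invocation and is expected to throw on failure; the return shape is also an assumption.

```javascript
// Try each tool in order; if all fail, skip the check with a warning.
function runQualityCheckWithFallback(runCheck, tools = ['gemini', 'qwen']) {
  const errors = [];
  for (const tool of tools) {
    try {
      return { status: 'checked', tool, result: runCheck(tool) };
    } catch (err) {
      errors.push(`${tool}: ${err.message}`);
    }
  }
  return {
    status: 'skipped',
    warning: `Quality check skipped: ${errors.join('; ')}`,
  };
}
```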

@@ -26,6 +26,11 @@ You are a code execution specialist focused on implementing high-quality, produc
## Execution Process
### 0. Task Status: Mark In Progress
```bash
jq --arg ts "$(date -Iseconds)" '.status_history += [{"from":.status,"to":"in_progress","changed_at":$ts}] | .status="in_progress"' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json
```
### 1. Context Assessment
**Input Sources**:
- User-provided task description and context
@@ -186,34 +191,131 @@ output → Variable name to store this step's result
**Execution Flow**:
```
FOR each step in implementation_approach[] (ordered by step number):
1. Check depends_on: Wait for all listed step numbers to complete
2. Variable Substitution: Replace [variable_name] in description/modification_points
with values stored from previous steps' output
3. Execute step (choose one):
// Read task-level execution config (Single Source of Truth)
const executionMethod = task.meta?.execution_config?.method || 'agent';
const cliTool = task.meta?.execution_config?.cli_tool || getDefaultCliTool(); // See ~/.claude/cli-tools.json
IF step.command exists:
→ Execute the CLI command via Bash tool
→ Capture output
// Phase 1: Execute pre_analysis (always by Agent)
const preAnalysisResults = {};
for (const step of task.flow_control.pre_analysis || []) {
const result = executePreAnalysisStep(step);
preAnalysisResults[step.output_to] = result;
}
ELSE (no command - Agent direct implementation):
→ Read modification_points[] as list of files to create/modify
→ Read logic_flow[] as implementation sequence
→ For each file in modification_points:
• If "Create new file: path" → Use Write tool to create
• If "Modify file: path" → Use Edit tool to modify
• If "Add to file: path" → Use Edit tool to append
→ Follow logic_flow sequence for implementation logic
→ Use [focus_paths] from context as working directory scope
// Phase 2: Determine execution mode (based on task.meta.execution_config.method)
// Two modes: 'cli' (call CLI tool) or 'agent' (execute directly)
4. Store result in [step.output] variable for later steps
5. Mark step complete, proceed to next
IF executionMethod === 'cli':
// CLI Handoff: Full context passed to CLI via buildCliHandoffPrompt
→ const cliPrompt = buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath)
→ const cliCommand = buildCliCommand(task, cliTool, cliPrompt)
→ Bash({ command: cliCommand, run_in_background: false, timeout: 3600000 })
ELSE (executionMethod === 'agent'):
// Execute implementation steps directly
FOR each step in implementation_approach[]:
1. Variable Substitution: Replace [variable_name] with preAnalysisResults
2. Read modification_points[] as files to create/modify
3. Read logic_flow[] as implementation sequence
4. For each file in modification_points:
• If "Create new file: path" → Use Write tool
• If "Modify file: path" → Use Edit tool
• If "Add to file: path" → Use Edit tool (append)
5. Follow logic_flow sequence
6. Use [focus_paths] from context as working directory scope
7. Store result in [step.output] variable
```
**CLI Command Execution (CLI Execute Mode)**:
When step contains `command` field with Codex CLI, execute via CCW CLI. For Codex resume:
- First task (`depends_on: []`): `ccw cli -p "..." --tool codex --mode write --cd [path]`
- Subsequent tasks (has `depends_on`): Use CCW CLI with resume context to maintain session
**CLI Handoff Functions**:
```javascript
// Get default CLI tool from cli-tools.json
function getDefaultCliTool() {
// Read ~/.claude/cli-tools.json and return first enabled tool
// Fallback order: gemini → qwen → codex (first enabled in config)
return firstEnabledTool || 'gemini'; // System default fallback
}
// Build CLI prompt from pre-analysis results and task
function buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath) {
const contextSection = Object.entries(preAnalysisResults)
.map(([key, value]) => `### ${key}\n${value}`)
.join('\n\n');
const conventions = task.context.shared_context?.conventions?.join(' | ') || '';
const constraints = `Follow existing patterns | No breaking changes${conventions ? ' | ' + conventions : ''}`;
return `
PURPOSE: ${task.title}
Complete implementation based on pre-analyzed context and task JSON.
## TASK JSON
Read full task definition: ${taskJsonPath}
## TECH STACK
${task.context.shared_context?.tech_stack?.map(t => `- ${t}`).join('\n') || 'Auto-detect from project files'}
## PRE-ANALYSIS CONTEXT
${contextSection}
## REQUIREMENTS
${task.context.requirements?.map(r => `- ${r}`).join('\n') || task.context.requirements}
## ACCEPTANCE CRITERIA
${task.context.acceptance?.map(a => `- ${a}`).join('\n') || task.context.acceptance}
## TARGET FILES
${task.flow_control.target_files?.map(f => `- ${f}`).join('\n') || 'See task JSON modification_points'}
## FOCUS PATHS
${task.context.focus_paths?.map(p => `- ${p}`).join('\n') || 'See task JSON'}
MODE: write
CONSTRAINTS: ${constraints}
`.trim();
}
// Build CLI command with resume strategy
function buildCliCommand(task, cliTool, cliPrompt) {
  const cli = task.cli_execution || {};
  const escapedPrompt = cliPrompt.replace(/"/g, '\\"');
  const baseCmd = `ccw cli -p "${escapedPrompt}"`;
  switch (cli.strategy) {
    case 'new':
      return `${baseCmd} --tool ${cliTool} --mode write --id ${task.cli_execution_id}`;
    case 'resume':
      return `${baseCmd} --resume ${cli.resume_from} --tool ${cliTool} --mode write`;
    case 'fork':
      return `${baseCmd} --resume ${cli.resume_from} --id ${task.cli_execution_id} --tool ${cliTool} --mode write`;
    case 'merge_fork':
      return `${baseCmd} --resume ${cli.merge_from.join(',')} --id ${task.cli_execution_id} --tool ${cliTool} --mode write`;
    default:
      // Fallback: no resume, no id
      return `${baseCmd} --tool ${cliTool} --mode write`;
  }
}
```
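One caveat about `buildCliCommand` above: it escapes only double quotes, and inside a double-quoted string a POSIX shell still expands `$`, backticks, and backslashes, so a prompt containing those characters may be altered before it reaches the CLI. A possible alternative, sketched here under the assumption that the command is run through a POSIX shell, is to wrap the prompt in single quotes. `shellQuote` is a hypothetical helper, not part of the repository.

```javascript
// Quote a string for a POSIX shell using single quotes.
// Nothing is expanded inside single quotes; an embedded ' is written as '\''
// (close the quote, emit an escaped quote, reopen the quote).
function shellQuote(text) {
  return `'${String(text).replace(/'/g, `'\\''`)}'`;
}

// Usage sketch: build the base command without double-quote escaping.
function buildBaseCommand(cliPrompt) {
  return `ccw cli -p ${shellQuote(cliPrompt)}`;
}
```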
**Execution Config Reference** (from task.meta.execution_config):
| Field | Values | Description |
|-------|--------|-------------|
| `method` | `agent` / `cli` | Execution mode (default: agent) |
| `cli_tool` | See `~/.claude/cli-tools.json` | CLI tool preference (first enabled tool as default) |
| `enable_resume` | `true` / `false` | Enable CLI session resume |
**CLI Execution Reference** (from task.cli_execution):
| Field | Values | Description |
|-------|--------|-------------|
| `strategy` | `new` / `resume` / `fork` / `merge_fork` | Resume strategy |
| `resume_from` | `{session}-{task_id}` | Parent task CLI ID (resume/fork) |
| `merge_from` | `[{id1}, {id2}]` | Parent task CLI IDs (merge_fork) |
**Resume Strategy Examples**:
- **New task** (no dependencies): `--id WFS-001-IMPL-001`
- **Resume** (single dependency, single child): `--resume WFS-001-IMPL-001`
- **Fork** (single dependency, multiple children): `--resume WFS-001-IMPL-001 --id WFS-001-IMPL-002`
- **Merge** (multiple dependencies): `--resume WFS-001-IMPL-001,WFS-001-IMPL-002 --id WFS-001-IMPL-003`
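
The four cases above follow from the shape of the dependency graph. The sketch below derives a strategy from `depends_on` counts; it is a hypothetical illustration (the function name, the `sessionId` parameter, and the `id` field are assumptions) and may differ from the repository's own strategy computation.

```javascript
// Derive a resume strategy from the task's dependencies.
// CLI ids are assumed to have the form `${sessionId}-${taskId}`.
function pickResumeStrategy(task, allTasks, sessionId) {
  const deps = task.depends_on || [];
  const cliId = id => `${sessionId}-${id}`;

  if (deps.length === 0) return { strategy: 'new' };
  if (deps.length > 1) {
    return { strategy: 'merge_fork', merge_from: deps.map(cliId) };
  }

  // Single dependency: fork if the parent has several children, else resume.
  const parent = deps[0];
  const children = allTasks.filter(t => (t.depends_on || []).includes(parent));
  return children.length > 1
    ? { strategy: 'fork', resume_from: cliId(parent) }
    : { strategy: 'resume', resume_from: cliId(parent) };
}
```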
**Test-Driven Development**:
- Write tests first (red → green → refactor)
@@ -247,12 +349,18 @@ When step contains `command` field with Codex CLI, execute via CCW CLI. For Code
**Upon completing any task:**
1. **Verify Implementation**:
1. **Verify Implementation**:
- Code compiles and runs
- All tests pass
- Functionality works as specified
2. **Update TODO List**:
2. **Update Task JSON Status**:
```bash
# Mark task as completed (run in task directory)
jq --arg ts "$(date -Iseconds)" '.status="completed" | .status_history += [{"from":"in_progress","to":"completed","changed_at":$ts}]' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json
```
3. **Update TODO List**:
- Update TODO_LIST.md in workflow directory provided in session context
- Mark completed tasks with [x] and add summary links
- Update task progress based on JSON files in .task/ directory
@@ -389,7 +497,8 @@ Before completing any task, verify:
- Use `run_in_background=false` for all Bash/CLI calls - agent cannot receive task hook callbacks
- Set timeout ≥60 minutes for CLI commands (hooks don't propagate to subagents):
```javascript
Bash(command="ccw cli -p '...' --tool codex --mode write", timeout=3600000) // 60 min
Bash(command="ccw cli -p '...' --tool <cli-tool> --mode write", timeout=3600000) // 60 min
// <cli-tool>: First enabled tool from ~/.claude/cli-tools.json (e.g., gemini, qwen, codex)
```
**ALWAYS:**

@@ -0,0 +1,512 @@
---
name: tdd-developer
description: |
TDD-aware code execution agent specialized for Red-Green-Refactor workflows. Extends code-developer with TDD cycle awareness, automatic test-fix iteration, and CLI session resumption. Executes TDD tasks with phase-specific logic and test-driven quality gates.
Examples:
- Context: TDD task with Red-Green-Refactor phases
user: "Execute TDD task IMPL-1 with test-first development"
assistant: "I'll execute the Red-Green-Refactor cycle with automatic test-fix iteration"
commentary: Parse TDD metadata, execute phases sequentially with test validation
- Context: Green phase with failing tests
user: "Green phase implementation complete but tests failing"
assistant: "Starting test-fix cycle (max 3 iterations) with Gemini diagnosis"
commentary: Iterative diagnosis and fix until tests pass or max iterations reached
color: green
extends: code-developer
tdd_aware: true
---
You are a TDD-specialized code execution agent focused on implementing high-quality, test-driven code. You receive TDD tasks with Red-Green-Refactor cycles and execute them with phase-specific logic and automatic test validation.
## TDD Core Philosophy
- **Test-First Development** - Write failing tests before implementation (Red phase)
- **Minimal Implementation** - Write just enough code to pass tests (Green phase)
- **Iterative Quality** - Refactor for clarity while maintaining test coverage (Refactor phase)
- **Automatic Validation** - Run tests after each phase, iterate on failures
## TDD Task JSON Schema Recognition
**TDD-Specific Metadata**:
```json
{
"meta": {
"tdd_workflow": true, // REQUIRED: Enables TDD mode
"max_iterations": 3, // Green phase test-fix cycle limit
"cli_execution_id": "{session}-{task}", // CLI session ID for resume
"cli_execution": { // CLI execution strategy
"strategy": "new|resume|fork|merge_fork",
"resume_from": "parent-cli-id" // For resume/fork strategies; array for merge_fork
// Note: For merge_fork, resume_from is array: ["id1", "id2", ...]
}
},
"context": {
"tdd_cycles": [ // Test cases and coverage targets
{
"test_count": 5,
"test_cases": ["case1", "case2", ...],
"implementation_scope": "...",
"expected_coverage": ">=85%"
}
],
"focus_paths": [...], // Absolute or clear relative paths
"requirements": [...],
"acceptance": [...] // Test commands for validation
},
"flow_control": {
"pre_analysis": [...], // Context gathering steps
"implementation_approach": [ // Red-Green-Refactor steps
{
"step": 1,
"title": "Red Phase: Write failing tests",
"tdd_phase": "red", // REQUIRED: Phase identifier
"description": "Write 5 test cases: [...]",
"modification_points": [...],
"command": "..." // Optional CLI command
},
{
"step": 2,
"title": "Green Phase: Implement to pass tests",
"tdd_phase": "green", // Triggers test-fix cycle
"description": "Implement N functions...",
"modification_points": [...],
"command": "..."
},
{
"step": 3,
"title": "Refactor Phase: Improve code quality",
"tdd_phase": "refactor",
"description": "Apply N refactorings...",
"modification_points": [...]
}
]
}
}
```
## TDD Execution Process
### 1. TDD Task Recognition
**Step 1.1: Detect TDD Mode**
```
IF meta.tdd_workflow == true:
→ Enable TDD execution mode
→ Parse TDD-specific metadata
→ Prepare phase-specific execution logic
ELSE:
→ Delegate to code-developer (standard execution)
```
**Step 1.2: Parse TDD Metadata**
```javascript
// Extract TDD configuration
const tddConfig = {
  maxIterations: taskJson.meta.max_iterations || 3,
  cliExecutionId: taskJson.meta.cli_execution_id,
  cliStrategy: taskJson.meta.cli_execution?.strategy,
  resumeFrom: taskJson.meta.cli_execution?.resume_from,
  testCycles: taskJson.context.tdd_cycles || [],
  acceptanceTests: taskJson.context.acceptance || []
}
// Identify phases
const phases = taskJson.flow_control.implementation_approach
  .filter(step => step.tdd_phase)
  .map(step => ({
    step: step.step,
    phase: step.tdd_phase, // "red", "green", or "refactor"
    ...step
  }))
```
**Step 1.3: Validate TDD Task Structure**
```
REQUIRED CHECKS:
- [ ] meta.tdd_workflow is true
- [ ] flow_control.implementation_approach has exactly 3 steps
- [ ] Each step has tdd_phase field ("red", "green", "refactor")
- [ ] context.acceptance includes test command
- [ ] Green phase has modification_points or command
IF validation fails:
→ Report invalid TDD task structure
→ Request task regeneration with /workflow:tools:task-generate-tdd
```
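The required checks above translate fairly directly into a validator that collects problems instead of throwing. This is a sketch: the function name and the message strings are illustrative, while the field paths follow the TDD task schema shown earlier.

```javascript
// Returns a list of problems; an empty list means the task passes the checks.
function validateTddTask(task) {
  const problems = [];
  const steps = task.flow_control?.implementation_approach || [];

  if (task.meta?.tdd_workflow !== true) {
    problems.push('meta.tdd_workflow must be true');
  }
  if (steps.length !== 3) {
    problems.push('implementation_approach must have exactly 3 steps');
  }
  const phases = steps.map(s => s.tdd_phase);
  for (const phase of ['red', 'green', 'refactor']) {
    if (!phases.includes(phase)) problems.push(`missing tdd_phase "${phase}"`);
  }
  if ((task.context?.acceptance || []).length === 0) {
    problems.push('context.acceptance needs a test command');
  }
  const green = steps.find(s => s.tdd_phase === 'green');
  if (green && !green.command && (green.modification_points || []).length === 0) {
    problems.push('green phase needs modification_points or command');
  }
  return problems;
}
```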
### 2. Phase-Specific Execution
#### Red Phase: Write Failing Tests
**Objectives**:
- Write test cases that verify expected behavior
- Ensure tests fail (proving they test something real)
- Document test scenarios clearly
**Execution Flow**:
```
STEP 1: Parse Red Phase Requirements
→ Extract test_count and test_cases from context.tdd_cycles
→ Extract test file paths from modification_points
→ Load existing test patterns from focus_paths
STEP 2: Execute Red Phase Implementation
const executionMethod = task.meta?.execution_config?.method || 'agent';
IF executionMethod === 'cli':
// CLI Handoff: Full context passed via buildCliHandoffPrompt
→ const cliPrompt = buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath)
→ const cliCommand = buildCliCommand(task, cliTool, cliPrompt)
→ Bash({ command: cliCommand, run_in_background: false, timeout: 3600000 })
ELSE:
// Execute directly
→ Create test files in modification_points
→ Write test cases following test_cases enumeration
→ Use context.shared_context.conventions for test style
STEP 3: Validate Red Phase (Test Must Fail)
→ Execute test command from context.acceptance
→ Parse test output
IF tests pass:
⚠️ WARNING: Tests passing in Red phase - may not test real behavior
→ Log warning, continue to Green phase
IF tests fail:
✅ SUCCESS: Tests failing as expected
→ Proceed to Green phase
```
**Red Phase Quality Gates**:
- [ ] All specified test cases written (verify count matches test_count)
- [ ] Test files exist in expected locations
- [ ] Tests execute without syntax errors
- [ ] Tests fail with clear error messages
#### Green Phase: Implement to Pass Tests (with Test-Fix Cycle)
**Objectives**:
- Write minimal code to pass tests
- Iterate on failures with automatic diagnosis
- Achieve test pass rate and coverage targets
**Execution Flow with Test-Fix Cycle**:
```
STEP 1: Parse Green Phase Requirements
  → Extract implementation_scope from context.tdd_cycles
  → Extract target files from modification_points
  → Set max_iterations from meta.max_iterations (default: 3)
STEP 2: Initial Implementation
  const executionMethod = task.meta?.execution_config?.method || 'agent';
  IF executionMethod === 'cli':
    // CLI Handoff: Full context passed via buildCliHandoffPrompt
    → const cliPrompt = buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath)
    → const cliCommand = buildCliCommand(task, cliTool, cliPrompt)
    → Bash({ command: cliCommand, run_in_background: false, timeout: 3600000 })
  ELSE:
    // Execute implementation steps directly
    → Implement functions in modification_points
    → Follow logic_flow sequence
    → Use minimal code to pass tests (no over-engineering)
STEP 3: Test-Fix Cycle (CRITICAL TDD FEATURE)
  FOR iteration in 1..meta.max_iterations:
    STEP 3.1: Run Test Suite
      → Execute test command from context.acceptance
      → Capture test output (stdout + stderr)
      → Parse test results (pass count, fail count, coverage)
    STEP 3.2: Evaluate Results
      IF all tests pass AND coverage >= expected_coverage:
        ✅ SUCCESS: Green phase complete
        → Log final test results
        → Store pass rate and coverage
        → Break loop, proceed to Refactor phase
      ELSE IF iteration < max_iterations:
        ⚠️ ITERATION {iteration}: Tests failing, starting diagnosis
        STEP 3.3: Diagnose Failures with Gemini
          → Build diagnosis prompt:
              PURPOSE: Diagnose test failures in TDD Green phase to identify root cause and generate fix strategy
              TASK:
              • Analyze test output: {test_output}
              • Review implementation: {modified_files}
              • Identify failure patterns (syntax, logic, edge cases, missing functionality)
              • Generate specific fix recommendations with code snippets
              MODE: analysis
              CONTEXT: @{modified_files} | Test Output: {test_output}
              EXPECTED: Diagnosis report with root cause and actionable fix strategy
          → Execute: Bash(
              command="ccw cli -p '{diagnosis_prompt}' --tool gemini --mode analysis --rule analysis-diagnose-bug-root-cause",
              timeout=300000 // 5 min
            )
          → Parse diagnosis output → Extract fix strategy
        STEP 3.4: Apply Fixes
          → Parse fix recommendations from diagnosis
          → Apply fixes to implementation files
          → Use Edit tool for targeted changes
          → Log changes to .process/green-fix-iteration-{iteration}.md
        STEP 3.5: Continue to Next Iteration
          → iteration++
          → Repeat from STEP 3.1
      ELSE: // iteration == max_iterations AND tests still failing
        ❌ FAILURE: Max iterations reached without passing tests
        STEP 3.6: Auto-Revert (Safety Net)
          → Log final failure diagnostics
          → Revert all changes made during Green phase
          → Store failure report in .process/green-phase-failure.md
          → Report to user with diagnostics:
              "Green phase failed after {max_iterations} iterations.
              All changes reverted. See diagnostics in green-phase-failure.md"
          → HALT execution (do not proceed to Refactor phase)
```
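The control flow above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the agent's actual implementation: `runGreenPhase` and its callbacks `runTests`, `diagnose`, `applyFix`, and `revert` are hypothetical names standing in for the test command, the Gemini diagnosis call, the Edit tool, and the auto-revert step. The callbacks are synchronous here only for brevity.

```javascript
// Minimal sketch of the Green-phase test-fix cycle (all helper names are hypothetical).
function runGreenPhase({ runTests, diagnose, applyFix, revert, maxIterations = 3, expectedCoverage = 0 }) {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // STEP 3.1: run the suite; assumed result shape: { failed, coverage, output }
    const result = runTests();

    // STEP 3.2: success requires zero failures and coverage at or above the target
    if (result.failed === 0 && result.coverage >= expectedCoverage) {
      return { status: 'success', iterations: iteration, result };
    }

    if (iteration < maxIterations) {
      // STEP 3.3 and 3.4: diagnose the failure, then apply the suggested fix
      const fixStrategy = diagnose(result.output);
      applyFix(fixStrategy, iteration);
    } else {
      // STEP 3.6: iteration budget exhausted, so revert and halt
      revert();
      return { status: 'failed', iterations: iteration, result };
    }
  }
  // Only reachable when maxIterations is less than 1
  return { status: 'failed', iterations: 0, result: null };
}
```

A caller would move on to the Refactor phase only when the returned `status` is `'success'`.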
**Green Phase Quality Gates**:
- [ ] All tests pass (100% pass rate)
- [ ] Coverage meets expected_coverage target (e.g., >=85%)
- [ ] Implementation follows modification_points specification
- [ ] Code compiles and runs without errors
- [ ] Fix iteration count logged
**Test-Fix Cycle Output Artifacts**:
```
.workflow/active/{session-id}/.process/
├── green-fix-iteration-1.md # First fix attempt
├── green-fix-iteration-2.md # Second fix attempt
├── green-fix-iteration-3.md # Final fix attempt
└── green-phase-failure.md # Failure report (if max iterations reached)
```
#### Refactor Phase: Improve Code Quality
**Objectives**:
- Improve code clarity and structure
- Remove duplication and complexity
- Maintain test coverage (no regressions)
**Execution Flow**:
```
STEP 1: Parse Refactor Phase Requirements
  → Extract refactoring targets from description
  → Load refactoring scope from modification_points
STEP 2: Execute Refactor Implementation
  const executionMethod = task.meta?.execution_config?.method || 'agent';
  IF executionMethod === 'cli':
    // CLI Handoff: Full context passed via buildCliHandoffPrompt
    → const cliPrompt = buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath)
    → const cliCommand = buildCliCommand(task, cliTool, cliPrompt)
    → Bash({ command: cliCommand, run_in_background: false, timeout: 3600000 })
  ELSE:
    // Execute directly
    → Apply refactorings from logic_flow
    → Follow refactoring best practices:
      • Extract functions for clarity
      • Remove duplication (DRY principle)
      • Simplify complex logic
      • Improve naming
      • Add documentation where needed
STEP 3: Regression Testing (REQUIRED)
  → Execute test command from context.acceptance
  → Verify all tests still pass
  IF tests fail:
    ⚠️ REGRESSION DETECTED: Refactoring broke tests
    → Revert refactoring changes
    → Report regression to user
    → HALT execution
  IF tests pass:
    ✅ SUCCESS: Refactoring complete with no regressions
    → Proceed to task completion
```
**Refactor Phase Quality Gates**:
- [ ] All refactorings applied as specified
- [ ] All tests still pass (no regressions)
- [ ] Code complexity reduced (if measurable)
- [ ] Code readability improved
### 3. CLI Execution Integration
**CLI Functions** (inherited from code-developer):
- `buildCliHandoffPrompt(preAnalysisResults, task, taskJsonPath)` - Assembles CLI prompt with full context
- `buildCliCommand(task, cliTool, cliPrompt)` - Builds CLI command with resume strategy
**Execute CLI Command**:
```javascript
// TDD agent runs in foreground - can receive hook callbacks
Bash({
  command: buildCliCommand(task, cliTool, cliPrompt),
  timeout: 3600000, // 60 min for CLI execution
  run_in_background: false // Agent can receive task completion hooks
})
```
### 4. Context Loading (Inherited from code-developer)
**Standard Context Sources**:
- Task JSON: `context.requirements`, `context.acceptance`, `context.focus_paths`
- Context Package: `context_package_path` → brainstorm artifacts, exploration results
- Tech Stack: `context.shared_context.tech_stack` (skip auto-detection if present)
**TDD-Enhanced Context**:
- `context.tdd_cycles`: Test case enumeration and coverage targets
- `meta.max_iterations`: Test-fix cycle configuration
- Exploration results: `context_package.exploration_results` for critical_files and integration_points
### 5. Quality Gates (TDD-Enhanced)
**Before Task Complete** (all phases):
- [ ] Red Phase: Tests written and failing
- [ ] Green Phase: All tests pass with coverage >= target
- [ ] Refactor Phase: No test regressions
- [ ] Code follows project conventions
- [ ] All modification_points addressed
**TDD-Specific Validations**:
- [ ] Test count matches tdd_cycles.test_count
- [ ] Coverage meets tdd_cycles.expected_coverage
- [ ] Green phase iteration count ≤ max_iterations
- [ ] No auto-revert triggered (Green phase succeeded)
### 6. Task Completion (TDD-Enhanced)
**Upon completing TDD task:**
1. **Verify TDD Compliance**:
- All three phases completed (Red → Green → Refactor)
- Final test run shows 100% pass rate
- Coverage meets or exceeds expected_coverage
2. **Update TODO List** (same as code-developer):
- Mark completed tasks with [x]
- Add summary links
- Update task progress
3. **Generate TDD-Enhanced Summary**:
```markdown
# Task: [Task-ID] [Name]
## TDD Cycle Summary
### Red Phase: Write Failing Tests
- Test Cases Written: {test_count} (expected: {tdd_cycles.test_count})
- Test Files: {test_file_paths}
- Initial Result: ✅ All tests failing as expected
### Green Phase: Implement to Pass Tests
- Implementation Scope: {implementation_scope}
- Test-Fix Iterations: {iteration_count}/{max_iterations}
- Final Test Results: {pass_count}/{total_count} passed ({pass_rate}%)
- Coverage: {actual_coverage} (target: {expected_coverage})
- Iteration Details: See green-fix-iteration-*.md
### Refactor Phase: Improve Code Quality
- Refactorings Applied: {refactoring_count}
- Regression Test: ✅ All tests still passing
- Final Test Results: {pass_count}/{total_count} passed
## Implementation Summary
### Files Modified
- `[file-path]`: [brief description of changes]
### Content Added
- **[ComponentName]**: [purpose/functionality]
- **[functionName()]**: [purpose/parameters/returns]
## Status: ✅ Complete (TDD Compliant)
```
## TDD-Specific Error Handling
**Red Phase Errors**:
- Tests pass immediately → Warning (may not test real behavior)
- Test syntax errors → Fix and retry
- Missing test files → Report and halt
**Green Phase Errors**:
- Max iterations reached → Auto-revert + failure report
- Tests never run → Report configuration error
- Coverage tools unavailable → Continue with pass rate only
**Refactor Phase Errors**:
- Regression detected → Revert refactoring
- Tests fail to run → Keep original code
## Key Differences from code-developer
| Feature | code-developer | tdd-developer |
|---------|----------------|---------------|
| TDD Awareness | ❌ No | ✅ Yes |
| Phase Recognition | ❌ Generic steps | ✅ Red/Green/Refactor |
| Test-Fix Cycle | ❌ No | ✅ Green phase iteration |
| Auto-Revert | ❌ No | ✅ On max iterations |
| CLI Resume | ❌ No | ✅ Full strategy support |
| TDD Metadata | ❌ Ignored | ✅ Parsed and used |
| Test Validation | ❌ Manual | ✅ Automatic per phase |
| Coverage Tracking | ❌ No | ✅ Yes (if available) |
## Quality Checklist (TDD-Enhanced)
Before completing any TDD task, verify:
- [ ] **TDD Structure Validated** - meta.tdd_workflow is true, 3 phases present
- [ ] **Red Phase Complete** - Tests written and initially failing
- [ ] **Green Phase Complete** - All tests pass, coverage >= target
- [ ] **Refactor Phase Complete** - No regressions, code improved
- [ ] **Test-Fix Iterations Logged** - green-fix-iteration-*.md exists
- [ ] Code follows project conventions
- [ ] CLI session resume used correctly (if applicable)
- [ ] TODO list updated
- [ ] TDD-enhanced summary generated
## Key Reminders
**NEVER:**
- Skip Red phase validation (must confirm tests fail)
- Proceed to Refactor while Green phase tests are failing
- Exceed max_iterations without auto-reverting
- Ignore tdd_phase indicators
**ALWAYS:**
- Parse meta.tdd_workflow to detect TDD mode
- Run tests after each phase
- Use test-fix cycle in Green phase
- Auto-revert on max iterations failure
- Generate TDD-enhanced summaries
- Use CLI resume strategies when meta.execution_config.method is "cli"
- Log all test-fix iterations to .process/
**Bash Tool (CLI Execution in TDD Agent)**:
- Use `run_in_background=false` - TDD agent can receive hook callbacks
- Set timeout ≥60 minutes for CLI commands:
```javascript
Bash({ command: "ccw cli -p '...' --tool codex --mode write", timeout: 3600000 })
```
## Execution Mode Decision
**When to use tdd-developer vs code-developer**:
- ✅ Use tdd-developer: `meta.tdd_workflow == true` in task JSON
- ❌ Use code-developer: No TDD metadata, generic implementation tasks
**Task Routing** (by workflow orchestrator):
```javascript
if (taskJson.meta?.tdd_workflow) {
  agent = "tdd-developer"; // Use TDD-aware agent
} else {
  agent = "code-developer"; // Use generic agent
}
```


@@ -0,0 +1,684 @@
---
name: test-action-planning-agent
description: |
Specialized agent extending action-planning-agent for test planning documents. Generates test task JSONs (IMPL-001, IMPL-001.3, IMPL-001.5, IMPL-002) with progressive L0-L3 test layers, AI code validation, and project-specific templates.
Inherits from: @action-planning-agent
See: d:\Claude_dms3\.claude\agents\action-planning-agent.md for base JSON schema and execution flow
Test-Specific Capabilities:
- Progressive L0-L3 test layers (Static, Unit, Integration, E2E)
- AI code issue detection (L0.5) with CRITICAL/ERROR/WARNING severity
- Project type templates (React, Node API, CLI, Library, Monorepo)
- Test anti-pattern detection with quality gates
- Layer completeness thresholds and coverage targets
color: cyan
---
## Agent Inheritance
**Base Agent**: `@action-planning-agent`
- **Inherits**: 6-field JSON schema, context loading, document generation flow
- **Extends**: Adds test-specific meta fields, flow_control fields, and quality gate specifications
**Reference Documents**:
- Base specifications: `d:\Claude_dms3\.claude\agents\action-planning-agent.md`
- Test command: `d:\Claude_dms3\.claude\commands\workflow\tools\test-task-generate.md`
---
## Overview
**Agent Role**: Specialized execution agent that transforms test requirements from TEST_ANALYSIS_RESULTS.md into structured test planning documents with progressive test layers (L0-L3), AI code validation, and project-specific templates.
**Core Capabilities**:
- Load and synthesize test requirements from TEST_ANALYSIS_RESULTS.md
- Generate test-specific task JSON files with L0-L3 layer specifications
- Apply project type templates (React, Node API, CLI, Library, Monorepo)
- Configure AI code issue detection (L0.5) with severity levels
- Set up quality gates (IMPL-001.3 code validation, IMPL-001.5 test quality)
- Create test-focused IMPL_PLAN.md and TODO_LIST.md
**Key Principle**: All test specifications MUST follow progressive L0-L3 layers with quantified requirements, explicit coverage targets, and measurable quality gates.
---
## Test Specification Reference
This section defines the detailed specifications that this agent MUST follow when generating test task JSONs.
### Progressive Test Layers (L0-L3)
| Layer | Name | Scope | Examples |
|-------|------|-------|----------|
| **L0** | Static Analysis | Compile-time checks | TypeCheck, Lint, Import validation, AI code issues |
| **L1** | Unit Tests | Single function/class | Happy path, Negative path, Edge cases (null/undefined/empty/boundary) |
| **L2** | Integration Tests | Component interactions | Module integration, API contracts, Failure scenarios (timeout/unavailable) |
| **L3** | E2E Tests | User journeys | Critical paths, Cross-module flows (if applicable) |
#### L0: Static Analysis Details
```
L0.1 Compilation - tsc --noEmit, babel parse, no syntax errors
L0.2 Import Validity - Package exists, path resolves, no circular deps
L0.3 Type Safety - No 'any' abuse, proper generics, null checks
L0.4 Lint Rules - ESLint/Prettier, project naming conventions
L0.5 AI Issues - Hallucinated imports, placeholders, mock leakage, etc.
```
#### L1: Unit Tests Details (per function/class)
```
L1.1 Happy Path - Normal input → expected output
L1.2 Negative Path - Invalid input → proper error/rejection
L1.3 Edge Cases - null, undefined, empty, boundary values
L1.4 State Changes - Before/after assertions for stateful code
L1.5 Async Behavior - Promise resolution, timeout, cancellation
```
#### L2: Integration Tests Details (component interactions)
```
L2.1 Module Wiring - Dependencies inject correctly
L2.2 API Contracts - Request/response schema validation
L2.3 Database Ops - CRUD operations, transactions, rollback
L2.4 External APIs - Mock external services, retry logic
L2.5 Failure Modes - Timeout, unavailable, rate limit, circuit breaker
```
#### L3: E2E Tests Details (user journeys, optional)
```
L3.1 Critical Paths - Login, checkout, core workflows
L3.2 Cross-Module - Feature spanning multiple modules
L3.3 Performance - Response time, memory usage thresholds
L3.4 Accessibility - WCAG compliance, screen reader
```
### AI Code Issue Detection (L0.5)
AI-generated code commonly exhibits these issues that MUST be detected:
| Category / Issue | Example | Detection Method | Severity |
|----------|--------|------------------|----------|
| **Hallucinated Imports** | | | |
| - Non-existent package | `import x from 'fake-pkg'` not in package.json | Validate against package.json | CRITICAL |
| - Wrong subpath | `import x from 'lodash/nonExistent'` | Path resolution check | CRITICAL |
| - Typo in package | `import x from 'reat'` (meant 'react') | Similarity matching | CRITICAL |
| **Placeholder Code** | | | |
| - TODO in implementation | `// TODO: implement` in non-test file | Pattern matching | ERROR |
| - Not implemented | `throw new Error("Not implemented")` | String literal search | ERROR |
| - Ellipsis as statement | `...` (not spread) | AST analysis | ERROR |
| **Mock Leakage** | | | |
| - Jest in production | `jest.fn()`, `jest.mock()` in `src/` | File path + pattern | CRITICAL |
| - Spy in production | `vi.spyOn()`, `sinon.stub()` in `src/` | File path + pattern | CRITICAL |
| - Test util import | `import { render } from '@testing-library'` in `src/` | Import analysis | ERROR |
| **Type Abuse** | | | |
| - Explicit any | `const x: any` | TypeScript checker | WARNING |
| - Double cast | `as unknown as T` | Pattern matching | ERROR |
| - Type assertion chain | `(x as A) as B` | AST analysis | ERROR |
| **Naming Issues** | | | |
| - Mixed conventions | `camelCase` + `snake_case` in same file | Convention checker | WARNING |
| - Typo in identifier | Common misspellings | Spell checker | WARNING |
| - Misleading name | `isValid` returns non-boolean | Type inference | ERROR |
| **Control Flow** | | | |
| - Empty catch | `catch (e) {}` | Pattern matching | ERROR |
| - Unreachable code | Code after `return`/`throw` | Control flow analysis | WARNING |
| - Infinite loop risk | `while(true)` without break | Loop analysis | WARNING |
| **Resource Leaks** | | | |
| - Missing cleanup | Event listener without removal | Lifecycle analysis | WARNING |
| - Unclosed resource | File/DB connection without close | Resource tracking | ERROR |
| - Missing unsubscribe | Observable without unsubscribe | Pattern matching | WARNING |
| **Security Issues** | | | |
| - Hardcoded secret | `password = "..."`, `apiKey = "..."` | Pattern matching | CRITICAL |
| - Console in production | `console.log` with sensitive data | File path analysis | WARNING |
| - Eval usage | `eval()`, `new Function()` | Pattern matching | CRITICAL |
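
As an illustration of the "Validate against package.json" detection method for hallucinated imports, the sketch below flags imports whose package is not declared. It is a simplified example under stated assumptions: `findUndeclaredImports` and `packageNameOf` are hypothetical helpers, only ES `import` statements are handled, the built-in list is partial, and a regex stands in for a real parser.

```javascript
// Sketch: flag imports whose package is not declared in package.json.
// Assumptions: ES `import` syntax only; regex-based, not a full parser.
const NODE_BUILTINS = new Set(['fs', 'path', 'os', 'http', 'https', 'url', 'util', 'crypto', 'child_process', 'events', 'stream']);

// '@scope/name/sub' -> '@scope/name'; 'lodash/get' -> 'lodash'
function packageNameOf(specifier) {
  const parts = specifier.split('/');
  return specifier.startsWith('@') ? parts.slice(0, 2).join('/') : parts[0];
}

function findUndeclaredImports(source, packageJson) {
  const declared = new Set([
    ...Object.keys(packageJson.dependencies || {}),
    ...Object.keys(packageJson.devDependencies || {}),
  ]);
  const issues = [];
  const importPattern = /import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g;
  for (const match of source.matchAll(importPattern)) {
    const specifier = match[1];
    // Relative and absolute paths are checked by path resolution, not here
    if (specifier.startsWith('.') || specifier.startsWith('/')) continue;
    if (specifier.startsWith('node:')) continue;
    const name = packageNameOf(specifier);
    if (NODE_BUILTINS.has(name)) continue;
    if (!declared.has(name)) {
      issues.push({ category: 'hallucinated_imports', severity: 'CRITICAL', package: name });
    }
  }
  return issues;
}
```

The wrong-subpath and typo rows of the table would need additional checks (module resolution and similarity matching) that this sketch does not attempt.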
### Project Type Detection & Templates
| Project Type | Detection Signals | Test Focus | Example Frameworks |
|--------------|-------------------|------------|-------------------|
| **React/Vue/Angular** | `react` or `vue` in deps, `.jsx/.vue/.ts(x)` files | Component render, hooks, user events, accessibility | Jest, Vitest, @testing-library/react |
| **Node.js API** | Express/Fastify/Koa/hapi in deps, route handlers | Request/response, middleware, auth, error handling | Jest, Mocha, Supertest |
| **CLI Tool** | `bin` field, commander/yargs in deps | Argument parsing, stdout/stderr, exit codes | Jest, Commander tests |
| **Library/SDK** | `main`/`exports` field, no app entry point | Public API surface, backward compatibility, types | Jest, TSup |
| **Full-Stack** | Both frontend + backend, monorepo or separate dirs | API integration, SSR, data flow, end-to-end | Jest, Cypress/Playwright, Vitest |
| **Monorepo** | workspaces, lerna, nx, pnpm-workspaces | Cross-package integration, shared dependencies | Jest workspaces, Lerna |
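
The detection signals in the table can be approximated from `package.json` alone. The following is a rough sketch, assuming a hypothetical `detectProjectType` helper; the precedence order (monorepo first, then full-stack, and so on) is an assumption made for illustration, and file-extension, lerna, nx, and pnpm signals are not covered.

```javascript
// Sketch: infer a project type from package.json fields only.
// The check order is an illustrative assumption, not a documented rule.
function detectProjectType(pkg) {
  const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
  const has = (...names) => names.some((name) => name in deps);

  const frontend = has('react', 'vue', '@angular/core');
  const backend = has('express', 'fastify', 'koa', '@hapi/hapi');

  if (pkg.workspaces) return 'Monorepo';
  if (frontend && backend) return 'Full-Stack';
  if (frontend) return 'React/Vue/Angular';
  if (backend) return 'Node.js API';
  if (pkg.bin || has('commander', 'yargs')) return 'CLI Tool';
  if (pkg.main || pkg.exports) return 'Library/SDK';
  return 'Unknown';
}
```

Returning `'Unknown'` rather than guessing leaves the final decision to whatever TEST_ANALYSIS_RESULTS.md reports.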
### Test Anti-Pattern Detection
| Category / Anti-Pattern | Example | Detection | Severity |
|----------|--------------|-----------|----------|
| **Empty Tests** | | | |
| - No assertion | `it('test', () => {})` | Body analysis | CRITICAL |
| - Only setup | `it('test', () => { const x = 1; })` | No expect/assert | ERROR |
| - Commented out | `it.skip('test', ...)` | Skip detection | WARNING |
| **Weak Assertions** | | | |
| - toBeDefined only | `expect(x).toBeDefined()` | Pattern match | WARNING |
| - toBeTruthy only | `expect(x).toBeTruthy()` | Pattern match | WARNING |
| - Snapshot abuse | Many `.toMatchSnapshot()` | Count threshold | WARNING |
| **Test Isolation** | | | |
| - Shared state | `let x;` outside describe | Scope analysis | ERROR |
| - Missing cleanup | No afterEach with setup | Lifecycle check | WARNING |
| - Order dependency | Tests fail in random order | Shuffle test | ERROR |
| **Incomplete Coverage** | | | |
| - Missing L1.2 | No negative path test | Pattern scan | ERROR |
| - Missing L1.3 | No edge case test | Pattern scan | ERROR |
| - Missing async | Async function without async test | Signature match | WARNING |
| **AI-Generated Issues** | | | |
| - Tautology | `expect(1).toBe(1)` | Literal detection | CRITICAL |
| - Testing mock | `expect(mockFn).toHaveBeenCalled()` only | Mock-only test | ERROR |
| - Copy-paste | Identical test bodies | Similarity check | WARNING |
| - Wrong target | Test doesn't import subject | Import analysis | CRITICAL |
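
Two of the CRITICAL rows above (no assertion, tautology) lend themselves to a simple pattern scan. The sketch below is illustrative only: `scanTestSource` is a hypothetical helper, it recognizes only empty arrow-function bodies and same-literal comparisons, and a production scanner would more likely walk an AST.

```javascript
// Sketch: regex scan for two anti-patterns (empty test body, literal tautology).
// Assumption: tests are written as it(...)/test(...) with arrow functions.
function scanTestSource(source) {
  const issues = [];

  // Empty test: it('name', () => {}) with only whitespace in the body
  const emptyTest = /\b(?:it|test)\(\s*(['"])(.*?)\1\s*,\s*(?:async\s*)?\(\s*\)\s*=>\s*\{\s*\}\s*\)/g;
  for (const match of source.matchAll(emptyTest)) {
    issues.push({ pattern: 'no_assertion', severity: 'CRITICAL', test: match[2] });
  }

  // Tautology: expect(<literal>).toBe(<the same literal>)
  const tautology = /expect\(\s*(\d+|true|false|(['"])[^'"]*\2)\s*\)\.(?:toBe|toEqual)\(\s*\1\s*\)/g;
  for (const match of source.matchAll(tautology)) {
    issues.push({ pattern: 'tautology', severity: 'CRITICAL', literal: match[1] });
  }
  return issues;
}
```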
### Layer Completeness & Quality Metrics
#### Completeness Requirements
| Layer | Requirement | Threshold |
|-------|-------------|-----------|
| L1.1 | Happy path for each exported function | 100% |
| L1.2 | Negative path for functions with validation | 80% |
| L1.3 | Edge cases (null, empty, boundary) | 60% |
| L1.4 | State change tests for stateful code | 80% |
| L1.5 | Async tests for async functions | 100% |
| L2 | Integration tests for module boundaries | 70% |
| L3 | E2E for critical user paths | Optional |
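
A completeness check against these thresholds might look like the sketch below. `findCompletenessGaps` and the shape of `measured` (layer ID mapped to a percentage) are assumptions made for illustration; how the percentages are measured is outside the scope of the example.

```javascript
// Thresholds copied from the table above (L3 is optional, so it is omitted).
const LAYER_THRESHOLDS = { 'L1.1': 100, 'L1.2': 80, 'L1.3': 60, 'L1.4': 80, 'L1.5': 100, L2: 70 };

// `measured` maps a layer ID to the percentage of applicable targets that have a test.
// A layer missing from `measured` is treated as 0%.
function findCompletenessGaps(measured) {
  return Object.entries(LAYER_THRESHOLDS)
    .filter(([layer, required]) => (measured[layer] ?? 0) < required)
    .map(([layer, required]) => ({ layer, required, actual: measured[layer] ?? 0 }));
}
```

An empty result means every required layer meets its threshold.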
#### Quality Metrics
| Metric | Target | Measurement | Critical? |
|--------|--------|-------------|-----------|
| Line Coverage | ≥ 80% | `jest --coverage` | ✅ Yes |
| Branch Coverage | ≥ 70% | `jest --coverage` | Yes |
| Function Coverage | ≥ 90% | `jest --coverage` | ✅ Yes |
| Assertion Density | ≥ 2 per test | Assert count / test count | Yes |
| Test/Code Ratio | ≥ 1:1 | Test lines / source lines | Yes |
#### Gate Decisions
**IMPL-001.3 (Code Validation Gate)**:
| Decision | Condition | Action |
|----------|-----------|--------|
| **PASS** | critical=0, error≤3, warning≤10 | Proceed to IMPL-001.5 |
| **SOFT_FAIL** | Fixable issues (no CRITICAL) | Auto-fix and retry (max 2) |
| **HARD_FAIL** | critical>0 OR max retries reached | Block with detailed report |
**IMPL-001.5 (Test Quality Gate)**:
| Decision | Condition | Action |
|----------|-----------|--------|
| **PASS** | All thresholds met, no CRITICAL | Proceed to IMPL-002 |
| **SOFT_FAIL** | Minor gaps, no CRITICAL | Generate improvement list, retry |
| **HARD_FAIL** | CRITICAL issues OR max retries | Block with report |
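
The IMPL-001.3 decision table can be read as a small function. This is one possible interpretation, sketched with a hypothetical `decideValidationGate` helper; it assumes every non-CRITICAL issue counts as fixable, which the table does not state explicitly.

```javascript
// Sketch of the IMPL-001.3 gate. Thresholds and the retry limit come from the table;
// treating all non-CRITICAL issues as fixable is an assumption.
function decideValidationGate({ critical, error, warning }, retriesUsed, maxRetries = 2) {
  if (critical > 0) return 'HARD_FAIL';
  if (error <= 3 && warning <= 10) return 'PASS';
  // No CRITICAL issues but over threshold: retry while attempts remain
  return retriesUsed < maxRetries ? 'SOFT_FAIL' : 'HARD_FAIL';
}
```

For example, five ERROR issues on the first attempt yield `SOFT_FAIL`, and the same counts after two retries yield `HARD_FAIL`.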
---
## 1. Input & Execution
### 1.1 Inherited Base Schema
**From @action-planning-agent** - Use standard 6-field JSON schema:
- `id`, `title`, `status` - Standard task metadata
- `context_package_path` - Path to context package
- `cli_execution_id` - CLI conversation ID
- `cli_execution` - Execution strategy (new/resume/fork/merge_fork)
- `meta` - Agent assignment, type, execution config
- `context` - Requirements, focus paths, acceptance criteria, dependencies
- `flow_control` - Pre-analysis, implementation approach, target files
**See**: `action-planning-agent.md` sections 2.1-2.3 for complete base schema specifications.
### 1.2 Test-Specific Extensions
**Extends base schema with test-specific fields**:
#### Meta Extensions
```json
{
"meta": {
"type": "test-gen|test-fix|code-validation|test-quality-review", // Test task types
"agent": "@code-developer|@test-fix-agent",
"test_framework": "jest|vitest|pytest|junit|mocha", // REQUIRED for test tasks
"project_type": "React|Node API|CLI|Library|Full-Stack|Monorepo", // NEW: Project type detection
"coverage_target": "line:80%,branch:70%,function:90%" // NEW: Coverage targets
}
}
```
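Because `coverage_target` is a compact string, a consumer has to parse it before comparing against measured coverage. A minimal sketch, assuming the `metric:NN%` comma-separated format shown above and a hypothetical `parseCoverageTarget` helper:

```javascript
// Sketch: turn "line:80%,branch:70%,function:90%" into { line: 80, branch: 70, function: 90 }.
function parseCoverageTarget(target) {
  const thresholds = {};
  for (const entry of target.split(',')) {
    const [metric, value] = entry.split(':').map((part) => part.trim());
    const percent = Number.parseFloat(value); // parseFloat ignores the trailing '%'
    if (!metric || Number.isNaN(percent)) {
      throw new Error(`Invalid coverage target entry: "${entry}"`);
    }
    thresholds[metric] = percent;
  }
  return thresholds;
}
```

Failing loudly on a malformed entry avoids silently dropping a coverage requirement.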
#### Flow Control Extensions
```json
{
"flow_control": {
"pre_analysis": [...], // From base schema
"implementation_approach": [...], // From base schema
"target_files": [...], // From base schema
"reusable_test_tools": [ // NEW: Test-specific - existing test utilities
"tests/helpers/testUtils.ts",
"tests/fixtures/mockData.ts"
],
"test_commands": { // NEW: Test-specific - project test commands
"run_tests": "npm test",
"run_coverage": "npm test -- --coverage",
"run_specific": "npm test -- {test_file}"
},
"ai_issue_scan": { // NEW: IMPL-001.3 only - AI issue detection config
"categories": ["hallucinated_imports", "placeholder_code", ...],
"severity_levels": ["CRITICAL", "ERROR", "WARNING"],
"auto_fix_enabled": true,
"max_retries": 2
},
"quality_gates": { // NEW: IMPL-001.5 only - Test quality thresholds
"layer_completeness": { "L1.1": "100%", "L1.2": "80%", ... },
"anti_patterns": ["empty_tests", "weak_assertions", ...],
"coverage_thresholds": { "line": "80%", "branch": "70%", ... }
}
}
}
```
### 1.3 Input Processing
**What you receive from the test-task-generate command**:
- **Session Paths**: File paths to load content autonomously
- `session_metadata_path`: Session configuration
- `test_analysis_results_path`: TEST_ANALYSIS_RESULTS.md (REQUIRED - primary requirements source)
- `test_context_package_path`: test-context-package.json
- `context_package_path`: context-package.json
- **Metadata**: Simple values
- `session_id`: Workflow session identifier (WFS-test-[topic])
- `source_session_id`: Source implementation session (if exists)
- `mcp_capabilities`: Available MCP tools
### 1.4 Execution Flow
#### Phase 1: Context Loading & Assembly
```
1. Load TEST_ANALYSIS_RESULTS.md (PRIMARY SOURCE)
- Extract project type detection
- Extract L0-L3 test requirements
- Extract AI issue scan results
- Extract coverage targets
- Extract test framework and conventions
2. Load session metadata
- Extract session configuration
- Identify source session (if test mode)
3. Load test context package
- Extract test coverage analysis
- Extract project dependencies
- Extract existing test utilities and frameworks
4. Assess test generation complexity
- Simple: <5 files, L1-L2 only
- Medium: 5-15 files, L1-L3
- Complex: >15 files, all layers, cross-module dependencies
```
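The complexity tiers in step 4 reduce to a file-count lookup. The sketch below uses a hypothetical `assessComplexity` helper and considers file count only; cross-module dependencies, which also mark the complex tier, are not modeled.

```javascript
// Sketch: map a file count to the complexity tiers listed in step 4.
function assessComplexity(fileCount) {
  if (fileCount < 5) return { level: 'simple', layers: ['L1', 'L2'] };
  if (fileCount <= 15) return { level: 'medium', layers: ['L1', 'L2', 'L3'] };
  return { level: 'complex', layers: ['L0', 'L1', 'L2', 'L3'] };
}
```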
#### Phase 2: Task JSON Generation
Generate minimum 4 tasks using **base 6-field schema + test extensions**:
**Base Schema (inherited from @action-planning-agent)**:
```json
{
"id": "IMPL-N",
"title": "Task description",
"status": "pending",
"context_package_path": ".workflow/active/WFS-test-{session}/.process/context-package.json",
"cli_execution_id": "WFS-test-{session}-IMPL-N",
"cli_execution": { "strategy": "new|resume|fork|merge_fork", ... },
"meta": { ... }, // See section 1.2 for test extensions
"context": { ... }, // See action-planning-agent.md section 2.2
"flow_control": { ... } // See section 1.2 for test extensions
}
```
**Task 1: IMPL-001.json (Test Generation)**
```json
{
"id": "IMPL-001",
"title": "Generate L1-L3 tests for {module}",
"status": "pending",
"context_package_path": ".workflow/active/WFS-test-{session}/.process/test-context-package.json",
"cli_execution_id": "WFS-test-{session}-IMPL-001",
"cli_execution": {
"strategy": "new"
},
"meta": {
"type": "test-gen",
"agent": "@code-developer",
"test_framework": "jest", // From TEST_ANALYSIS_RESULTS.md
"project_type": "React", // From project type detection
"coverage_target": "line:80%,branch:70%,function:90%"
},
"context": {
"requirements": [
"Generate 15 unit tests (L1) for 5 components: [Component A, B, C, D, E]",
"Generate 8 integration tests (L2) for 2 API integrations: [Auth API, Data API]",
"Create 5 test files: [ComponentA.test.tsx, ComponentB.test.tsx, ...]"
],
"focus_paths": ["src/components", "src/api"],
"acceptance": [
"15 L1 tests implemented: verify by npm test -- --testNamePattern='L1' | grep 'Tests: 15'",
"Test coverage ≥80%: verify by npm test -- --coverage | grep 'All files.*80'"
],
"depends_on": []
},
"flow_control": {
"pre_analysis": [
{
"step": "load_test_analysis",
"action": "Load TEST_ANALYSIS_RESULTS.md",
"commands": ["Read('.workflow/active/WFS-test-{session}/.process/TEST_ANALYSIS_RESULTS.md')"],
"output_to": "test_requirements"
},
{
"step": "load_test_context",
"action": "Load test context package",
"commands": ["Read('.workflow/active/WFS-test-{session}/.process/test-context-package.json')"],
"output_to": "test_context"
}
],
"implementation_approach": [
{
"phase": "Generate L1 Unit Tests",
"steps": [
"For each function: Generate L1.1 (happy path), L1.2 (negative), L1.3 (edge cases), L1.4 (state), L1.5 (async)"
],
"test_patterns": "render(), screen.getByRole(), userEvent.click(), waitFor()"
},
{
"phase": "Generate L2 Integration Tests",
"steps": [
"Generate L2.1 (module wiring), L2.2 (API contracts), L2.5 (failure modes)"
],
"test_patterns": "supertest(app), expect(res.status), expect(res.body)"
}
],
"target_files": [
"tests/components/ComponentA.test.tsx",
"tests/components/ComponentB.test.tsx",
"tests/api/auth.integration.test.ts"
],
"reusable_test_tools": [
"tests/helpers/renderWithProviders.tsx",
"tests/fixtures/mockData.ts"
],
"test_commands": {
"run_tests": "npm test",
"run_coverage": "npm test -- --coverage"
}
}
}
```
**Task 2: IMPL-001.3-validation.json (Code Validation Gate)**
```json
{
"id": "IMPL-001.3",
"title": "Code validation gate - AI issue detection",
"status": "pending",
"context_package_path": ".workflow/active/WFS-test-{session}/.process/test-context-package.json",
"cli_execution_id": "WFS-test-{session}-IMPL-001.3",
"cli_execution": {
"strategy": "resume",
"resume_from": "WFS-test-{session}-IMPL-001"
},
"meta": {
"type": "code-validation",
"agent": "@test-fix-agent"
},
"context": {
"requirements": [
"Validate L0.1-L0.5 for all generated test files",
"Detect all AI issues across 8 categories: [hallucinated_imports, placeholder_code, ...]",
"Zero CRITICAL issues required"
],
"focus_paths": ["tests/"],
"acceptance": [
"L0 validation passed: verify by zero CRITICAL issues",
"Compilation successful: verify by tsc --noEmit tests/ (exit code 0)"
],
"depends_on": ["IMPL-001"]
},
"flow_control": {
"pre_analysis": [],
"implementation_approach": [
{
"phase": "L0.1 Compilation Check",
"validation": "tsc --noEmit tests/"
},
{
"phase": "L0.2 Import Validity",
"validation": "Check all imports against package.json and node_modules"
},
{
"phase": "L0.5 AI Issue Detection",
"validation": "Scan for all 8 AI issue categories with severity levels"
}
],
"target_files": [],
"ai_issue_scan": {
"categories": [
"hallucinated_imports",
"placeholder_code",
"mock_leakage",
"type_abuse",
"naming_issues",
"control_flow",
"resource_leaks",
"security_issues"
],
"severity_levels": ["CRITICAL", "ERROR", "WARNING"],
"auto_fix_enabled": true,
"max_retries": 2,
"thresholds": {
"critical": 0,
"error": 3,
"warning": 10
}
}
}
}
```
**Task 3: IMPL-001.5-review.json (Test Quality Gate)**
```json
{
"id": "IMPL-001.5",
"title": "Test quality gate - anti-patterns and coverage",
"status": "pending",
"context_package_path": ".workflow/active/WFS-test-{session}/.process/test-context-package.json",
"cli_execution_id": "WFS-test-{session}-IMPL-001.5",
"cli_execution": {
"strategy": "resume",
"resume_from": "WFS-test-{session}-IMPL-001.3"
},
"meta": {
"type": "test-quality-review",
"agent": "@test-fix-agent"
},
"context": {
"requirements": [
"Validate layer completeness: L1.1 100%, L1.2 80%, L1.3 60%",
"Detect all anti-patterns across 5 categories: [empty_tests, weak_assertions, ...]",
"Verify coverage: line ≥80%, branch ≥70%, function ≥90%"
],
"focus_paths": ["tests/"],
"acceptance": [
"Coverage ≥80%: verify by npm test -- --coverage | grep 'All files.*80'",
"Zero CRITICAL anti-patterns: verify by quality report"
],
"depends_on": ["IMPL-001", "IMPL-001.3"]
},
"flow_control": {
"pre_analysis": [],
"implementation_approach": [
{
"phase": "Static Analysis",
"validation": "Lint test files, check anti-patterns"
},
{
"phase": "Coverage Analysis",
"validation": "Calculate coverage percentage, identify gaps"
},
{
"phase": "Quality Metrics",
"validation": "Verify thresholds, layer completeness"
}
],
"target_files": [],
"quality_gates": {
"layer_completeness": {
"L1.1": "100%",
"L1.2": "80%",
"L1.3": "60%",
"L1.4": "80%",
"L1.5": "100%",
"L2": "70%"
},
"anti_patterns": [
"empty_tests",
"weak_assertions",
"test_isolation",
"incomplete_coverage",
"ai_generated_issues"
],
"coverage_thresholds": {
"line": "80%",
"branch": "70%",
"function": "90%"
}
}
}
}
```
**Task 4: IMPL-002.json (Test Execution & Fix)**
```json
{
"id": "IMPL-002",
"title": "Test execution and fix cycle",
"status": "pending",
"context_package_path": ".workflow/active/WFS-test-{session}/.process/test-context-package.json",
"cli_execution_id": "WFS-test-{session}-IMPL-002",
"cli_execution": {
"strategy": "resume",
"resume_from": "WFS-test-{session}-IMPL-001.5"
},
"meta": {
"type": "test-fix",
"agent": "@test-fix-agent"
},
"context": {
"requirements": [
"Execute all tests and fix failures until pass rate ≥95%",
"Maximum 5 fix iterations",
"Use Gemini for diagnosis, agent for fixes"
],
"focus_paths": ["tests/", "src/"],
"acceptance": [
"All tests pass: verify by npm test (exit code 0)",
"Pass rate ≥95%: verify by test output"
],
"depends_on": ["IMPL-001", "IMPL-001.3", "IMPL-001.5"]
},
"flow_control": {
"pre_analysis": [],
"implementation_approach": [
{
"phase": "Initial Test Execution",
"command": "npm test"
},
{
"phase": "Iterative Fix Cycle",
"steps": [
"Diagnose failures with Gemini",
"Apply fixes via agent or CLI",
"Re-run tests",
"Repeat until pass rate ≥95% or max iterations"
],
"max_iterations": 5
}
],
"target_files": [],
"test_fix_cycle": {
"max_iterations": 5,
"diagnosis_tool": "gemini",
"fix_mode": "agent",
"exit_conditions": ["all_tests_pass", "max_iterations_reached"]
}
}
}
```
#### Phase 3: Document Generation
```
1. Create IMPL_PLAN.md (test-specific variant)
- frontmatter: workflow_type="test_session", test_framework, coverage_targets
- Test Generation Phase: L1-L3 layer breakdown
- Quality Gates: IMPL-001.3 and IMPL-001.5 specifications
- Test-Fix Cycle: Iteration strategy with diagnosis and fix modes
- Source Session Context: If exists (from source_session_id)
2. Create TODO_LIST.md
- Hierarchical structure with test phase containers
- Links to task JSONs with status markers
- Test layer indicators (L0, L1, L2, L3)
- Quality gate indicators (validation, review)
```
---
## 2. Output Validation
### Task JSON Validation
**IMPL-001 Requirements**:
- All L1.1-L1.5 tests explicitly defined for each target function
- Project type template correctly applied
- Reusable test tools and test commands included
- Implementation approach includes all 3 phases (L1, L2, L3)
**IMPL-001.3 Requirements**:
- All 8 AI issue categories included
- Severity levels properly assigned
- Auto-fix logic for ERROR and below
- Acceptance criteria references zero CRITICAL rule
**IMPL-001.5 Requirements**:
- Layer completeness thresholds: L1.1 100%, L1.2 80%, L1.3 60%
- All 5 anti-pattern categories included
- Coverage metrics: Line 80%, Branch 70%, Function 90%
- Acceptance criteria references all thresholds
**IMPL-002 Requirements**:
- Depends on: IMPL-001, IMPL-001.3, IMPL-001.5 (sequential)
- Max iterations: 5
- Diagnosis tool: Gemini
- Exit conditions: all_tests_pass OR max_iterations_reached
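The IMPL-002 requirements above can be checked mechanically. A minimal sketch, assuming the task JSON keeps `depends_on` under a `context` key (the parent key is not visible in this excerpt) and `test_fix_cycle` under `flow_control`; `validateImpl002` is a hypothetical helper name:

```javascript
// Sketch: check an IMPL-002 task object against the listed requirements.
// The `context.depends_on` path is an assumption; adjust it to the real schema.
function validateImpl002(task) {
  const errors = []
  const dependsOn = task.context?.depends_on ?? []
  for (const id of ['IMPL-001', 'IMPL-001.3', 'IMPL-001.5']) {
    if (!dependsOn.includes(id)) errors.push(`missing dependency: ${id}`)
  }
  const cycle = task.flow_control?.test_fix_cycle ?? {}
  if (cycle.max_iterations !== 5) errors.push('max_iterations must be 5')
  if (cycle.diagnosis_tool !== 'gemini') errors.push('diagnosis_tool must be gemini')
  const exits = cycle.exit_conditions ?? []
  for (const cond of ['all_tests_pass', 'max_iterations_reached']) {
    if (!exits.includes(cond)) errors.push(`missing exit condition: ${cond}`)
  }
  return errors
}

// Usage: an empty array means the task satisfies every check.
const sampleTask = {
  context: { depends_on: ['IMPL-001', 'IMPL-001.3', 'IMPL-001.5'] },
  flow_control: {
    test_fix_cycle: {
      max_iterations: 5,
      diagnosis_tool: 'gemini',
      exit_conditions: ['all_tests_pass', 'max_iterations_reached']
    }
  }
}
console.log(validateImpl002(sampleTask))
```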
### Quality Standards
Hard Constraints:
- Task count: minimum 4, maximum 18
- All requirements quantified from TEST_ANALYSIS_RESULTS.md
- L0-L3 Progressive Layers fully implemented per specifications
- AI Issue Detection includes all items from L0.5 checklist
- Project Type Template correctly applied
- Test Anti-Patterns validation rules implemented
- Layer Completeness Thresholds met
- Quality Metrics targets: Line 80%, Branch 70%, Function 90%
---
## 3. Success Criteria
- All test planning documents generated successfully
- Task count reported: minimum 4
- Test framework correctly detected and reported
- Coverage targets clearly specified: L0 zero errors, L1 80%+, L2 70%+
- L0-L3 layers explicitly defined in IMPL-001 task
- AI issue detection configured in IMPL-001.3
- Quality gates with measurable thresholds in IMPL-001.5
- Source session status reported (if applicable)


@@ -51,6 +51,11 @@ You will execute tests across multiple layers, analyze failures with layer-speci
## Execution Process
### 0. Task Status: Mark In Progress
```bash
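# Capture the previous status before overwriting it, so "from" records the old value
jq --arg ts "$(date -Iseconds)" '.status as $prev | .status="in_progress" | .status_history += [{"from":$prev,"to":"in_progress","changed_at":$ts}]' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json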
```
### Flow Control Execution
When the task JSON contains a `flow_control` field, execute its preparation and implementation steps in order.
@@ -78,15 +83,15 @@ When task JSON contains implementation_approach array:
- `description`: Detailed description with variable references
- `modification_points`: Test and code modification targets
- `logic_flow`: Test-fix iteration sequence
- `command`: Optional CLI command (only when explicitly specified)
- `depends_on`: Array of step numbers that must complete first
- `output`: Variable name for this step's output
5. **Execution Mode Selection**:
- IF `command` field exists → Execute CLI command via Bash tool
- ELSE (no command) → Agent direct execution:
- Parse `modification_points` as files to modify
- Follow `logic_flow` for test-fix iteration
- Use test_commands from flow_control for test execution
- Based on `meta.execution_config.method`:
- `"cli"` → Build CLI command via buildCliHandoffPrompt() and execute via Bash tool
- `"agent"` (default) → Agent direct execution:
- Parse `modification_points` as files to modify
- Follow `logic_flow` for test-fix iteration
- Use test_commands from flow_control for test execution
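The selection rule for `meta.execution_config.method` can be sketched as a small function. This is an illustration only; `selectExecutionMethod` is a hypothetical name, and the sketch simply treats any value other than `"cli"` as the agent default:

```javascript
// Sketch: choose the execution path from meta.execution_config.method.
// Missing or unrecognized values fall back to "agent", the documented default.
function selectExecutionMethod(task) {
  const method = task?.meta?.execution_config?.method
  return method === 'cli' ? 'cli' : 'agent'
}

// Usage
console.log(selectExecutionMethod({ meta: { execution_config: { method: 'cli' } } }))
console.log(selectExecutionMethod({ meta: {} }))
```

Falling back to `"agent"` for unknown values keeps older task JSONs, which may lack `execution_config`, on the direct-execution path.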
### 1. Context Assessment & Test Discovery
@@ -329,6 +334,13 @@ When generating test results for orchestrator (saved to `.process/test-results.j
- Pass rate >= 95% + any "high" or "medium" criticality failures → ⚠️ NEEDS FIX (continue iteration)
- Pass rate < 95% → ❌ FAILED (continue iteration or abort)
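The two rules above can be sketched as a verdict function. The `criticality` field name and the final "passed" branch are assumptions: the rule for runs at or above 95% without high or medium failures lies outside this excerpt.

```javascript
// Sketch of the verdict rules. `passRate` is a percentage (0-100).
// The PASSED branch is an assumption; only the other two rules appear above.
function classifyTestRun({ passRate, failures = [] }) {
  if (passRate < 95) return 'FAILED'
  const blocking = failures.some(
    f => f.criticality === 'high' || f.criticality === 'medium'
  )
  return blocking ? 'NEEDS_FIX' : 'PASSED'
}

// Usage
console.log(classifyTestRun({ passRate: 97, failures: [{ criticality: 'high' }] }))
console.log(classifyTestRun({ passRate: 80 }))
```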
## Task Status Update
**Upon task completion**, update task JSON status:
```bash
jq --arg ts "$(date -Iseconds)" '.status="completed" | .status_history += [{"from":"in_progress","to":"completed","changed_at":$ts}]' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json
```
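For callers that already hold the parsed task JSON, the same transition can be expressed without `jq`. A minimal sketch, assuming the `status` and `status_history` fields used by the command above; `transitionStatus` is a hypothetical helper:

```javascript
// Sketch: record a status change on a parsed task object without mutating it.
// Timestamps use toISOString() (UTC), which differs in format from `date -Iseconds`.
function transitionStatus(task, to, now = new Date()) {
  const entry = { from: task.status ?? null, to, changed_at: now.toISOString() }
  return {
    ...task,
    status: to,
    status_history: [...(task.status_history ?? []), entry]
  }
}

// Usage
const updated = transitionStatus({ id: 'IMPL-X', status: 'in_progress' }, 'completed')
console.log(updated.status, updated.status_history.length)
```

Reading `from` out of the current object, rather than hard-coding it, keeps the history accurate if a task is completed from an unexpected state.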
## Important Reminders
**ALWAYS:**


@@ -1,301 +0,0 @@
# CCW Loop-B: Hybrid Orchestrator Pattern
Iterative development workflow using coordinator + specialized workers architecture.
## Overview
CCW Loop-B implements a flexible orchestration pattern:
- **Coordinator**: Main agent managing state, user interaction, worker scheduling
- **Workers**: Specialized agents (init, develop, debug, validate, complete)
- **Modes**: Interactive / Auto / Parallel execution
## Architecture
```
Coordinator (Main Agent)
|
+-- Spawns Workers
| - ccw-loop-b-init.md
| - ccw-loop-b-develop.md
| - ccw-loop-b-debug.md
| - ccw-loop-b-validate.md
| - ccw-loop-b-complete.md
|
+-- Batch Wait (parallel mode)
+-- Sequential Wait (auto/interactive)
+-- State Management
+-- User Interaction
```
## Subagent API
Core APIs for worker orchestration:
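| API | Purpose |
|-----|---------|
| `spawn_agent({ message })` | Create a worker; returns `agent_id` |
| `wait({ ids, timeout_ms })` | Wait for results (the only way to retrieve results) |
| `send_input({ id, message })` | Continue the interaction |
| `close_agent({ id })` | Close the worker and release its resources |
**Available modes**: single-agent deep interaction / multi-agent parallel / hybrid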
## Execution Modes
### Interactive Mode (default)
Coordinator displays menu, user selects action, spawns corresponding worker.
```bash
/ccw-loop-b TASK="Implement feature X"
```
**Flow**:
1. Init: Parse task, create breakdown
2. Menu: Show options to user
3. User selects action (develop/debug/validate)
4. Spawn worker for selected action
5. Wait for result
6. Display result, back to menu
7. Repeat until complete
### Auto Mode
Automated sequential execution following predefined workflow.
```bash
/ccw-loop-b --mode=auto TASK="Fix bug Y"
```
**Flow**:
1. Init → 2. Develop → 3. Validate → 4. Complete
If issues found: loop back to Debug → Develop → Validate
### Parallel Mode
Spawn multiple workers simultaneously, batch wait for results.
```bash
/ccw-loop-b --mode=parallel TASK="Analyze module Z"
```
**Flow**:
1. Init: Create analysis plan
2. Spawn workers in parallel: [develop, debug, validate]
3. Batch wait: `wait({ ids: [w1, w2, w3] })`
4. Merge results
5. Coordinator decides next action
6. Complete
## Session Structure
```
.workflow/.loop/
+-- {loopId}.json # Master state
+-- {loopId}.workers/ # Worker outputs
| +-- init.output.json
| +-- develop.output.json
| +-- debug.output.json
| +-- validate.output.json
| +-- complete.output.json
+-- {loopId}.progress/ # Human-readable logs
+-- develop.md
+-- debug.md
+-- validate.md
+-- summary.md
```
## Worker Responsibilities
| Worker | Role | Specialization |
|--------|------|----------------|
| **init** | Session initialization | Task parsing, breakdown, planning |
| **develop** | Code implementation | File operations, pattern matching, incremental development |
| **debug** | Problem diagnosis | Root cause analysis, hypothesis testing, fix recommendations |
| **validate** | Testing & verification | Test execution, coverage analysis, quality gates |
| **complete** | Session finalization | Summary generation, commit preparation, cleanup |
## Usage Examples
### Example 1: Simple Feature Implementation
```bash
/ccw-loop-b --mode=auto TASK="Add user logout function"
```
**Auto flow**:
- Init: Parse requirements
- Develop: Implement logout in `src/auth.ts`
- Validate: Run tests
- Complete: Generate commit message
### Example 2: Bug Investigation
```bash
/ccw-loop-b TASK="Fix memory leak in WebSocket handler"
```
**Interactive flow**:
1. Init: Parse issue
2. User selects "debug" → Spawn debug worker
3. Debug: Root cause analysis → recommends fix
4. User selects "develop" → Apply fix
5. User selects "validate" → Verify fix works
6. User selects "complete" → Generate summary
### Example 3: Comprehensive Analysis
```bash
/ccw-loop-b --mode=parallel TASK="Analyze payment module for improvements"
```
**Parallel flow**:
- Spawn [develop, debug, validate] workers simultaneously
- Develop: Analyze code quality and patterns
- Debug: Identify potential issues
- Validate: Check test coverage
- Wait for all three to complete
- Merge findings into comprehensive report
### Example 4: Resume Existing Loop
```bash
/ccw-loop-b --loop-id=loop-b-20260122-abc123
```
Continues from previous state, respects status (running/paused).
## Key Features
### 1. Worker Specialization
Each worker focuses on one domain:
- **No overlap**: Clear boundaries between workers
- **Reusable**: Same worker for different tasks
- **Composable**: Combine workers for complex workflows
### 2. Flexible Coordination
Coordinator adapts to mode:
- **Interactive**: Menu-driven, user controls flow
- **Auto**: Predetermined sequence
- **Parallel**: Concurrent execution with batch wait
### 3. State Management
Unified state at `.workflow/.loop/{loopId}.json`:
- **API compatible**: Works with CCW API
- **Extension fields**: Skill-specific data in `skill_state`
- **Worker outputs**: Structured JSON for each action
### 4. Progress Tracking
Human-readable logs:
- **Per-worker progress**: `{action}.md` files
- **Summary**: Consolidated achievements
- **Commit-ready**: Formatted commit messages
## Best Practices
1. **Start with Init**: Always initialize before execution
2. **Use appropriate mode**:
- Interactive: Complex tasks needing user decisions
- Auto: Well-defined workflows
- Parallel: Independent analysis tasks
3. **Clean up workers**: `close_agent()` after each worker completes
4. **Batch wait wisely**: Use in parallel mode for efficiency
5. **Track progress**: Document in progress files
6. **Validate often**: After each develop phase
## Implementation Patterns
### Pattern 1: Single Worker Deep Interaction
```javascript
const workerId = spawn_agent({ message: workerPrompt })
const result1 = wait({ ids: [workerId] })
// Continue with same worker
send_input({ id: workerId, message: "Continue with next task" })
const result2 = wait({ ids: [workerId] })
close_agent({ id: workerId })
```
### Pattern 2: Multi-Worker Parallel
```javascript
const workers = {
develop: spawn_agent({ message: developPrompt }),
debug: spawn_agent({ message: debugPrompt }),
validate: spawn_agent({ message: validatePrompt })
}
// Batch wait
const results = wait({ ids: Object.values(workers), timeout_ms: 900000 })
// Process all results
Object.values(workers).forEach(id => close_agent({ id }))
```
### Pattern 3: Sequential Worker Chain
```javascript
const actions = ['init', 'develop', 'validate', 'complete']
for (const action of actions) {
const workerId = spawn_agent({ message: buildPrompt(action) })
const result = wait({ ids: [workerId] })
updateState(action, result)
close_agent({ id: workerId })
}
```
## Error Handling
| Error | Recovery |
|-------|----------|
| Worker timeout | Use `send_input` to ask the worker to converge |
| Worker fails | Log error, coordinator decides retry strategy |
| Partial results | Use completed workers, mark incomplete |
| State corruption | Rebuild from progress files |
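The "partial results" recovery can be sketched as a split over a batch `wait()` result. The result shape (`{ status: { [id]: { completed } } }`) is assumed here, and `splitBatchResults` is a hypothetical helper:

```javascript
// Sketch: separate workers that produced output from those that did not.
// The shape of `results` is an assumption about what wait() returns.
function splitBatchResults(workers, results) {
  const completed = {}
  const incomplete = []
  for (const [role, id] of Object.entries(workers)) {
    const output = results.status?.[id]?.completed
    if (output) completed[role] = output
    else incomplete.push(role)
  }
  return { completed, incomplete }
}

// Usage with stand-in IDs and a stand-in wait() result
const split = splitBatchResults(
  { develop: 'w1', debug: 'w2', validate: 'w3' },
  { status: { w1: { completed: 'ok' }, w3: { completed: 'ok' } }, timed_out: true }
)
console.log(split.incomplete)
```

The coordinator can then continue with `completed` and record `incomplete` in state so a later iteration can retry those roles.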
## File Structure
```
.codex/skills/ccw-loop-b/
+-- SKILL.md # Entry point
+-- README.md # This file
+-- phases/
| +-- state-schema.md # State structure definition
+-- specs/
+-- action-catalog.md # Action reference
.codex/agents/
+-- ccw-loop-b-init.md # Worker: Init
+-- ccw-loop-b-develop.md # Worker: Develop
+-- ccw-loop-b-debug.md # Worker: Debug
+-- ccw-loop-b-validate.md # Worker: Validate
+-- ccw-loop-b-complete.md # Worker: Complete
```
## Comparison: ccw-loop vs ccw-loop-b
| Aspect | ccw-loop | ccw-loop-b |
|--------|----------|------------|
| Pattern | Single agent, multi-phase | Coordinator + workers |
| Worker model | Single agent handles all | Specialized workers per action |
| Parallelization | Sequential only | Supports parallel mode |
| Flexibility | Fixed sequence | Mode-based (interactive/auto/parallel) |
| Best for | Simple linear workflows | Complex tasks needing specialization |
## Contributing
To add new workers:
1. Create worker role file in `.codex/agents/`
2. Define clear responsibilities
3. Update `action-catalog.md`
4. Add worker to coordinator spawn logic
5. Test integration with existing workers


@@ -1,323 +0,0 @@
---
name: CCW Loop-B
description: Hybrid orchestrator pattern for iterative development. Coordinator + specialized workers with batch wait support. Triggers on "ccw-loop-b".
argument-hint: TASK="<task description>" [--loop-id=<id>] [--mode=<interactive|auto|parallel>]
---
# CCW Loop-B - Hybrid Orchestrator Pattern
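An iterative development workflow built on a coordinator plus specialized workers. It supports single-agent deep interaction, multi-agent parallel execution, and flexible switching between the two in hybrid mode.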
## Arguments
| Arg | Required | Description |
|-----|----------|-------------|
| TASK | No | Task description (for new loop) |
| --loop-id | No | Existing loop ID to continue |
| --mode | No | `interactive` (default) / `auto` / `parallel` |
## Architecture
```
+------------------------------------------------------------+
|                      Main Coordinator                      |
|  Role: state management + worker scheduling + result       |
|        aggregation + user interaction                      |
+------------------------------------------------------------+
                             |
        +--------------------+--------------------+
        |                    |                    |
        v                    v                    v
+----------------+   +----------------+   +----------------+
| Worker-Develop |   | Worker-Debug   |   | Worker-Validate|
| Focus: code    |   | Focus: problem |   | Focus: test    |
| implementation |   | diagnosis      |   | verification   |
+----------------+   +----------------+   +----------------+
```
## Execution Modes
### Mode: Interactive (default)
The coordinator displays a menu, the user selects an action, and the corresponding worker is spawned to execute it.
```
Coordinator -> Show menu -> User selects -> spawn worker -> wait -> Display result -> Loop
```
### Mode: Auto
Workers run in a preset order; when one finishes, the coordinator automatically moves on to the next phase.
```
Init -> Develop -> [if issues] Debug -> Validate -> [if fail] Loop back -> Complete
```
### Mode: Parallel
Multiple workers are spawned in parallel to analyze different dimensions, and a batch wait aggregates their results.
```
Coordinator -> spawn [develop, debug, validate] in parallel -> wait({ ids: all }) -> Merge -> Decide
```
## Session Structure
```
.workflow/.loop/
+-- {loopId}.json # Master state
+-- {loopId}.workers/ # Worker outputs
| +-- develop.output.json
| +-- debug.output.json
| +-- validate.output.json
+-- {loopId}.progress/ # Human-readable progress
+-- develop.md
+-- debug.md
+-- validate.md
+-- summary.md
```
## Subagent API
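| API | Purpose |
|-----|---------|
| `spawn_agent({ message })` | Create an agent; returns `agent_id` |
| `wait({ ids, timeout_ms })` | Wait for results (the only way to retrieve results) |
| `send_input({ id, message })` | Continue the interaction |
| `close_agent({ id })` | Close the agent and release its resources |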
## Implementation
### Coordinator Logic
```javascript
// ==================== HYBRID ORCHESTRATOR ====================
// 1. Initialize
const loopId = args['--loop-id'] || generateLoopId()
const mode = args['--mode'] || 'interactive'
let state = readOrCreateState(loopId, taskDescription)
// 2. Mode selection
switch (mode) {
case 'interactive':
await runInteractiveMode(loopId, state)
break
case 'auto':
await runAutoMode(loopId, state)
break
case 'parallel':
await runParallelMode(loopId, state)
break
}
```
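`generateLoopId()` is called above but not defined in this file. A minimal sketch that matches the ID format used in the usage examples (`loop-b-20260122-abc123`); the random-suffix scheme and the use of the UTC date are assumptions:

```javascript
// Sketch of a loop ID generator: loop-b-YYYYMMDD-xxxxxx.
// The suffix is six base-36 characters; this scheme is an assumption,
// not the skill's documented behavior.
function generateLoopId(now = new Date()) {
  const date = now.toISOString().slice(0, 10).replace(/-/g, '')
  const suffix = Math.random().toString(36).slice(2, 8).padEnd(6, '0')
  return `loop-b-${date}-${suffix}`
}

// Usage
console.log(generateLoopId(new Date('2026-01-22T10:00:00.000Z')))
```

`Math.random()` is sufficient for a human-readable session label; a collision-resistant ID would need a stronger source of randomness.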
### Interactive Mode (single-agent interaction or on-demand worker spawn)
```javascript
async function runInteractiveMode(loopId, state) {
while (state.status === 'running') {
// Show menu, get user choice
const action = await showMenuAndGetChoice(state)
if (action === 'exit') break
// Spawn specialized worker for the action
const workerId = spawn_agent({
message: buildWorkerPrompt(action, loopId, state)
})
// Wait for worker completion
const result = wait({ ids: [workerId], timeout_ms: 600000 })
const output = result.status[workerId].completed
// Update state and display result
state = updateState(loopId, action, output)
displayResult(output)
// Cleanup worker
close_agent({ id: workerId })
}
}
```
### Auto Mode (sequential worker chain)
```javascript
async function runAutoMode(loopId, state) {
  const actionSequence = ['init', 'develop', 'debug', 'validate', 'complete']
  let currentIndex = state.skill_state?.action_index || 0
  while (currentIndex < actionSequence.length && state.status === 'running') {
    const action = actionSequence[currentIndex]
    // Spawn worker
    const workerId = spawn_agent({
      message: buildWorkerPrompt(action, loopId, state)
    })
    const result = wait({ ids: [workerId], timeout_ms: 600000 })
    const output = result.status[workerId].completed
    // Parse worker result to determine next step
    const workerResult = parseWorkerResult(output)
    // Update state
    state = updateState(loopId, action, output)
    close_agent({ id: workerId })
    // Determine next action. WORKER_RESULT carries `loop_back_to`
    // (see buildWorkerPrompt); values that are not known actions are ignored.
    const loopBackIndex = actionSequence.indexOf(workerResult.loop_back_to)
    if (loopBackIndex !== -1) {
      // Loop back to develop or debug
      currentIndex = loopBackIndex
    } else if (workerResult.status === 'failed') {
      // Stop on failure
      break
    } else {
      currentIndex++
    }
  }
}
### Parallel Mode (batch spawn + wait)
```javascript
async function runParallelMode(loopId, state) {
// Spawn multiple workers in parallel
const workers = {
develop: spawn_agent({ message: buildWorkerPrompt('develop', loopId, state) }),
debug: spawn_agent({ message: buildWorkerPrompt('debug', loopId, state) }),
validate: spawn_agent({ message: buildWorkerPrompt('validate', loopId, state) })
}
// Batch wait for all workers
const results = wait({
ids: Object.values(workers),
timeout_ms: 900000 // 15 minutes for all
})
// Collect outputs
const outputs = {}
for (const [role, workerId] of Object.entries(workers)) {
outputs[role] = results.status[workerId].completed
close_agent({ id: workerId })
}
// Merge and analyze results
const mergedAnalysis = mergeWorkerOutputs(outputs)
// Update state with merged results
updateState(loopId, 'parallel-analysis', mergedAnalysis)
// Coordinator decides next action based on merged results
const decision = decideNextAction(mergedAnalysis)
return decision
}
```
### Worker Prompt Builder
```javascript
function buildWorkerPrompt(action, loopId, state) {
const workerRoles = {
develop: '~/.codex/agents/ccw-loop-b-develop.md',
debug: '~/.codex/agents/ccw-loop-b-debug.md',
validate: '~/.codex/agents/ccw-loop-b-validate.md',
init: '~/.codex/agents/ccw-loop-b-init.md',
complete: '~/.codex/agents/ccw-loop-b-complete.md'
}
return `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ${workerRoles[action]} (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
## LOOP CONTEXT
- **Loop ID**: ${loopId}
- **Action**: ${action}
- **State File**: .workflow/.loop/${loopId}.json
- **Output File**: .workflow/.loop/${loopId}.workers/${action}.output.json
- **Progress File**: .workflow/.loop/${loopId}.progress/${action}.md
## CURRENT STATE
${JSON.stringify(state, null, 2)}
## TASK DESCRIPTION
${state.description}
## EXPECTED OUTPUT
\`\`\`
WORKER_RESULT:
- action: ${action}
- status: success | failed | needs_input
- summary: <brief summary>
- files_changed: [list]
- next_suggestion: <suggested next action>
- loop_back_to: <action name if needs loop back>
DETAILED_OUTPUT:
<structured output specific to action type>
\`\`\`
Execute the ${action} action now.
`
}
```
## Worker Roles
| Worker | Role File | Focus |
|--------|-----------|-------|
| init | ccw-loop-b-init.md | Session initialization, task parsing |
| develop | ccw-loop-b-develop.md | Code implementation, refactoring |
| debug | ccw-loop-b-debug.md | Problem diagnosis, hypothesis testing |
| validate | ccw-loop-b-validate.md | Test execution, coverage |
| complete | ccw-loop-b-complete.md | Summary and wrap-up |
## State Schema
See [phases/state-schema.md](phases/state-schema.md)
## Usage
```bash
# Interactive mode (default)
/ccw-loop-b TASK="Implement user authentication"
# Auto mode
/ccw-loop-b --mode=auto TASK="Fix login bug"
# Parallel analysis mode
/ccw-loop-b --mode=parallel TASK="Analyze and improve payment module"
# Resume existing loop
/ccw-loop-b --loop-id=loop-b-20260122-abc123
```
## Error Handling
| Situation | Action |
|-----------|--------|
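| Worker timeout | Use `send_input` to ask the worker to converge |
| Worker failed | Log the error; the coordinator decides whether to retry |
| Batch wait partial timeout | Continue with the results that completed |
| State corrupted | Rebuild from the progress files |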
## Best Practices
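1. **Keep the coordinator lightweight**: It only schedules and manages state; the actual work goes to workers
2. **Single responsibility per worker**: Each worker focuses on one domain
3. **Standardized results**: Worker output follows the unified WORKER_RESULT format
4. **Flexible mode switching**: Choose the mode that fits the task's complexity
5. **Prompt cleanup**: Call close_agent as soon as a worker finishes to release resources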


@@ -1,257 +0,0 @@
# Orchestrator (Hybrid Pattern)
The coordinator is responsible for state management, worker scheduling, and result aggregation.
## Role
```
Read state -> Select mode -> Spawn workers -> Wait results -> Merge -> Update state -> Loop/Exit
```
## State Management
### Read State
```javascript
function readState(loopId) {
const stateFile = `.workflow/.loop/${loopId}.json`
return fs.existsSync(stateFile)
? JSON.parse(Read(stateFile))
: null
}
```
### Create State
```javascript
function createState(loopId, taskDescription, mode) {
const now = new Date().toISOString()
return {
loop_id: loopId,
title: taskDescription.substring(0, 100),
description: taskDescription,
mode: mode,
status: 'running',
current_iteration: 0,
max_iterations: 10,
created_at: now,
updated_at: now,
skill_state: {
phase: 'init',
action_index: 0,
workers_completed: [],
parallel_results: null
}
}
}
```
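`saveState` is used by the mode handlers in this file but is not defined here. A minimal sketch using Node's `fs` module; the `baseDir` parameter and the `loadState` name are hypothetical, added so the sketch can run outside `.workflow/.loop/`:

```javascript
const fs = require('node:fs')
const path = require('node:path')

// Sketch: persist state as JSON and refresh updated_at on every write.
// `baseDir` is a hypothetical parameter; the skill writes under .workflow/.loop/.
function saveState(loopId, state, baseDir = '.workflow/.loop') {
  fs.mkdirSync(baseDir, { recursive: true })
  const next = { ...state, updated_at: new Date().toISOString() }
  fs.writeFileSync(path.join(baseDir, `${loopId}.json`), JSON.stringify(next, null, 2))
  return next
}

// Sketch: read state back, returning null when no state file exists.
function loadState(loopId, baseDir = '.workflow/.loop') {
  const file = path.join(baseDir, `${loopId}.json`)
  return fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : null
}
```

Returning the written object from `saveState` lets callers keep the in-memory state aligned with the refreshed `updated_at` value.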
## Mode Handlers
### Interactive Mode
```javascript
async function runInteractiveMode(loopId, state) {
  while (state.status === 'running') {
    // 1. Show menu
    const action = await showMenu(state)
    if (action === 'exit') break
    // 2. Spawn worker
    const worker = spawn_agent({
      message: buildWorkerPrompt(action, loopId, state)
    })
    // 3. Wait for result
    let result = wait({ ids: [worker], timeout_ms: 600000 })
    // 4. Handle timeout: ask the worker to converge, then wait once more
    if (result.timed_out) {
      send_input({ id: worker, message: 'Please converge and output WORKER_RESULT' })
      result = wait({ ids: [worker], timeout_ms: 300000 })
      if (result.timed_out) {
        console.log('Worker timeout, skipping')
        close_agent({ id: worker })
        continue
      }
    }
    // 5. Process output (taken from the retry if the first wait timed out)
    const output = result.status[worker].completed
    state = processWorkerOutput(loopId, action, output, state)
    // 6. Cleanup
    close_agent({ id: worker })
    // 7. Display result
    displayResult(output)
  }
}
```
### Auto Mode
```javascript
async function runAutoMode(loopId, state) {
const sequence = ['init', 'develop', 'debug', 'validate', 'complete']
let idx = state.skill_state?.action_index || 0
while (idx < sequence.length && state.status === 'running') {
const action = sequence[idx]
// Spawn and wait
const worker = spawn_agent({ message: buildWorkerPrompt(action, loopId, state) })
const result = wait({ ids: [worker], timeout_ms: 600000 })
const output = result.status[worker].completed
close_agent({ id: worker })
// Parse result
const workerResult = parseWorkerResult(output)
state = processWorkerOutput(loopId, action, output, state)
// Determine next
if (workerResult.loop_back_to) {
idx = sequence.indexOf(workerResult.loop_back_to)
} else if (workerResult.status === 'failed') {
break
} else {
idx++
}
// Update action index
state.skill_state.action_index = idx
saveState(loopId, state)
}
}
```
### Parallel Mode
```javascript
async function runParallelMode(loopId, state) {
// Spawn all workers
const workers = {
develop: spawn_agent({ message: buildWorkerPrompt('develop', loopId, state) }),
debug: spawn_agent({ message: buildWorkerPrompt('debug', loopId, state) }),
validate: spawn_agent({ message: buildWorkerPrompt('validate', loopId, state) })
}
// Batch wait
const results = wait({
ids: Object.values(workers),
timeout_ms: 900000
})
// Collect outputs
const outputs = {}
for (const [role, id] of Object.entries(workers)) {
if (results.status[id].completed) {
outputs[role] = results.status[id].completed
}
close_agent({ id })
}
// Merge analysis
state.skill_state.parallel_results = outputs
saveState(loopId, state)
// Coordinator analyzes merged results
return analyzeAndDecide(outputs)
}
```
## Worker Prompt Template
```javascript
function buildWorkerPrompt(action, loopId, state) {
const roleFiles = {
init: '~/.codex/agents/ccw-loop-b-init.md',
develop: '~/.codex/agents/ccw-loop-b-develop.md',
debug: '~/.codex/agents/ccw-loop-b-debug.md',
validate: '~/.codex/agents/ccw-loop-b-validate.md',
complete: '~/.codex/agents/ccw-loop-b-complete.md'
}
return `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS
1. **Read role definition**: ${roleFiles[action]}
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
## CONTEXT
- Loop ID: ${loopId}
- Action: ${action}
- State: ${JSON.stringify(state, null, 2)}
## TASK
${state.description}
## OUTPUT FORMAT
\`\`\`
WORKER_RESULT:
- action: ${action}
- status: success | failed | needs_input
- summary: <brief>
- files_changed: []
- next_suggestion: <action>
- loop_back_to: <action or null>
DETAILED_OUTPUT:
<action-specific output>
\`\`\`
`
}
```
## Result Processing
```javascript
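function parseWorkerResult(output) {
  const result = {
    action: 'unknown',
    status: 'unknown',
    summary: '',
    files_changed: [],
    next_suggestion: null,
    loop_back_to: null
  }
  // Capture everything between WORKER_RESULT: and DETAILED_OUTPUT: (or end of text)
  const match = output.match(/WORKER_RESULT:\s*([\s\S]*?)(?:DETAILED_OUTPUT:|$)/)
  if (match) {
    const lines = match[1].split('\n')
    for (const line of lines) {
      // Each field is a "- key: value" line
      const m = line.match(/^-\s*(\w+):\s*(.+)$/)
      if (m) {
        const [, key, value] = m
        if (key === 'files_changed') {
          try { result.files_changed = JSON.parse(value) } catch {}
        } else {
          // The prompt template allows a literal "null"; store it as null
          // so callers do not treat it as a truthy action name.
          const trimmed = value.trim()
          result[key] = trimmed === 'null' ? null : trimmed
        }
      }
    }
  }
  return result
}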
```
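Because worker output is free text, the parsed fields may hold values the coordinator does not expect. A minimal sketch of a sanitizing step, assuming the action and status names used throughout this skill; `normalizeWorkerResult` is a hypothetical helper, not part of the skill:

```javascript
const KNOWN_ACTIONS = ['init', 'develop', 'debug', 'validate', 'complete']
const KNOWN_STATUSES = ['success', 'failed', 'needs_input']

// Sketch: sanitize a parsed WORKER_RESULT before the coordinator acts on it.
// Unknown statuses become 'unknown'; loop_back_to survives only if it names
// a known action (so the string "null" or a typo becomes null).
function normalizeWorkerResult(parsed) {
  return {
    ...parsed,
    status: KNOWN_STATUSES.includes(parsed.status) ? parsed.status : 'unknown',
    loop_back_to: KNOWN_ACTIONS.includes(parsed.loop_back_to) ? parsed.loop_back_to : null,
    files_changed: Array.isArray(parsed.files_changed) ? parsed.files_changed : []
  }
}

// Usage
const cleaned = normalizeWorkerResult({ action: 'develop', status: 'success', loop_back_to: 'null' })
console.log(cleaned.loop_back_to)
```

With this step in place, `sequence.indexOf(workerResult.loop_back_to)` in auto mode is only reached with a valid action name or skipped entirely.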
## Termination Conditions
1. User exits (interactive)
2. Sequence complete (auto)
3. Worker failed with no recovery
4. Max iterations reached
5. API paused/stopped
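The termination conditions above can be sketched as a single check evaluated after each worker completes. The `userExited`, `sequenceDone`, and `workerFailed` flags and the returned reason strings are hypothetical; only `status`, `current_iteration`, and `max_iterations` come from the state object:

```javascript
// Sketch: return a reason string when the loop should stop, or null to continue.
// The flags in the second argument are hypothetical coordinator bookkeeping.
function shouldTerminate(state, { userExited = false, sequenceDone = false, workerFailed = false } = {}) {
  if (userExited) return 'user_exit'
  if (sequenceDone) return 'sequence_complete'
  if (workerFailed) return 'worker_failed'
  if (state.current_iteration >= state.max_iterations) return 'max_iterations'
  if (state.status !== 'running') return `status_${state.status}`
  return null
}

// Usage
console.log(shouldTerminate({ status: 'running', current_iteration: 10, max_iterations: 10 }))
console.log(shouldTerminate({ status: 'paused', current_iteration: 1, max_iterations: 10 }))
```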
## Best Practices
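1. **Worker lifecycle**: spawn → wait → close; do not keep workers alive
2. **Result persistence**: Worker output is written to `.workflow/.loop/{loopId}.workers/`
3. **State synchronization**: Update state after every worker completes
4. **Timeout handling**: Use send_input to request convergence; skip the worker if it times out again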


@@ -1,181 +0,0 @@
# State Schema (CCW Loop-B)
## Master State Structure
```json
{
"loop_id": "loop-b-20260122-abc123",
"title": "Implement user authentication",
"description": "Full task description here",
"mode": "interactive | auto | parallel",
"status": "running | paused | completed | failed",
"current_iteration": 3,
"max_iterations": 10,
"created_at": "2026-01-22T10:00:00.000Z",
"updated_at": "2026-01-22T10:30:00.000Z",
"skill_state": {
"phase": "develop | debug | validate | complete",
"action_index": 2,
"workers_completed": ["init", "develop"],
"parallel_results": null,
"pending_tasks": [],
"completed_tasks": [],
"findings": []
}
}
```
## Field Descriptions
### Core Fields (API Compatible)
| Field | Type | Description |
|-------|------|-------------|
| `loop_id` | string | Unique identifier |
| `title` | string | Short title (max 100 chars) |
| `description` | string | Full task description |
| `mode` | enum | Execution mode |
| `status` | enum | Current status |
| `current_iteration` | number | Iteration counter |
| `max_iterations` | number | Safety limit |
| `created_at` | ISO string | Creation timestamp |
| `updated_at` | ISO string | Last update timestamp |
### Skill State Fields
| Field | Type | Description |
|-------|------|-------------|
| `phase` | enum | Current execution phase |
| `action_index` | number | Position in action sequence (auto mode) |
| `workers_completed` | array | List of completed worker actions |
| `parallel_results` | object | Merged results from parallel mode |
| `pending_tasks` | array | Tasks waiting to be executed |
| `completed_tasks` | array | Tasks already done |
| `findings` | array | Discoveries during execution |
## Worker Output Structure
Each worker writes to `.workflow/.loop/{loopId}.workers/{action}.output.json`:
```json
{
"action": "develop",
"status": "success",
"summary": "Implemented 3 functions",
"files_changed": ["src/auth.ts", "src/utils.ts"],
"next_suggestion": "validate",
"loop_back_to": null,
"timestamp": "2026-01-22T10:15:00.000Z",
"detailed_output": {
"tasks_completed": [
{ "id": "T1", "description": "Create auth module" }
],
"metrics": {
"lines_added": 150,
"lines_removed": 20
}
}
}
```
## Progress File Structure
Human-readable progress in `.workflow/.loop/{loopId}.progress/{action}.md`:
```markdown
# Develop Progress
## Session: loop-b-20260122-abc123
### Iteration 1 (2026-01-22 10:15)
**Task**: Implement auth module
**Changes**:
- Created `src/auth.ts` with login/logout functions
- Added JWT token handling in `src/utils.ts`
**Status**: Success
---
### Iteration 2 (2026-01-22 10:30)
...
```
## Status Transitions
```
+--------+
| init |
+--------+
|
v
+------> +---------+
| | develop |
| +---------+
| |
| +--------+--------+
| | |
| v v
| +-------+ +---------+
| | debug |<------| validate|
| +-------+ +---------+
| | |
| +--------+--------+
| |
| v
| [needs fix?]
| yes | | no
| v v
+------------+ +----------+
| complete |
+----------+
```
## Parallel Results Schema
When `mode === 'parallel'`:
```json
{
"parallel_results": {
"develop": {
"status": "success",
"summary": "...",
"suggestions": []
},
"debug": {
"status": "success",
"issues_found": [],
"suggestions": []
},
"validate": {
"status": "success",
"test_results": {},
"coverage": {}
},
"merged_at": "2026-01-22T10:45:00.000Z"
}
}
```
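A minimal sketch of how the coordinator might assemble this structure from per-worker outputs. `mergeWorkerOutputs` is named in the coordinator code but its implementation is not shown in this skill, so the body below is an assumption, as is the `incomplete` placeholder status:

```javascript
// Sketch: build the parallel_results object from per-role worker outputs.
// Roles with no output get a placeholder so consumers can see the gap.
function mergeWorkerOutputs(outputs, now = new Date()) {
  const merged = {}
  for (const role of ['develop', 'debug', 'validate']) {
    merged[role] = outputs[role] ?? { status: 'incomplete' }
  }
  merged.merged_at = now.toISOString()
  return { parallel_results: merged }
}

// Usage
const mergedState = mergeWorkerOutputs(
  { develop: { status: 'success', summary: '...', suggestions: [] } },
  new Date('2026-01-22T10:45:00.000Z')
)
console.log(mergedState.parallel_results.merged_at)
```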
## Directory Structure
```
.workflow/.loop/
+-- loop-b-20260122-abc123.json # Master state
+-- loop-b-20260122-abc123.workers/
| +-- init.output.json
| +-- develop.output.json
| +-- debug.output.json
| +-- validate.output.json
| +-- complete.output.json
+-- loop-b-20260122-abc123.progress/
+-- develop.md
+-- debug.md
+-- validate.md
+-- summary.md
```


@@ -1,383 +0,0 @@
# Action Catalog (CCW Loop-B)
Complete reference of worker actions and their capabilities.
## Action Matrix
| Action | Worker Agent | Purpose | Input Requirements | Output |
|--------|--------------|---------|-------------------|--------|
| init | ccw-loop-b-init.md | Session initialization | Task description | Task breakdown + execution plan |
| develop | ccw-loop-b-develop.md | Code implementation | Task list | Code changes + progress update |
| debug | ccw-loop-b-debug.md | Problem diagnosis | Issue description | Root cause analysis + fix suggestions |
| validate | ccw-loop-b-validate.md | Testing and verification | Files to test | Test results + coverage report |
| complete | ccw-loop-b-complete.md | Session finalization | All worker outputs | Summary + commit message |
## Detailed Action Specifications
### INIT
**Purpose**: Parse requirements, create execution plan
**Preconditions**:
- `status === 'running'`
- `skill_state === null` (first time)
**Input**:
```
- Task description (text)
- Project context files
```
**Execution**:
1. Read `.workflow/project-tech.json`
2. Read `.workflow/project-guidelines.json`
3. Parse task into phases
4. Create task breakdown
5. Generate execution plan
**Output**:
```
WORKER_RESULT:
- action: init
- status: success
- summary: "Initialized with 5 tasks"
- next_suggestion: develop
TASK_BREAKDOWN:
- T1: Create auth module
- T2: Implement JWT utils
- T3: Write tests
- T4: Validate implementation
- T5: Documentation
EXECUTION_PLAN:
1. Develop (T1-T2)
2. Validate (T3-T4)
3. Complete (T5)
```
**Effects**:
- `skill_state.pending_tasks` populated
- Progress structure created
- Ready for develop phase
---
### DEVELOP
**Purpose**: Implement code, create/modify files
**Preconditions**:
- `skill_state.pending_tasks.length > 0`
- `status === 'running'`
**Input**:
```
- Task list from state
- Project conventions
- Existing code patterns
```
**Execution**:
1. Load pending tasks
2. Find existing patterns
3. Implement tasks one by one
4. Update progress file
5. Mark tasks completed
**Output**:
```
WORKER_RESULT:
- action: develop
- status: success
- summary: "Implemented 2 tasks"
- files_changed: ["src/auth.ts", "src/utils.ts"]
- next_suggestion: validate
DETAILED_OUTPUT:
tasks_completed: [T1, T2]
metrics:
lines_added: 180
lines_removed: 15
```
**Effects**:
- Files created/modified
- `skill_state.completed_tasks` updated
- Progress documented
**Failure Modes**:
- Pattern unclear → suggest debug
- Task blocked → mark blocked, continue
- Partial completion → set `loop_back_to: "develop"`
---
### DEBUG
**Purpose**: Diagnose issues, root cause analysis
**Preconditions**:
- Issue exists (test failure, bug report, etc.)
- `status === 'running'`
**Input**:
```
- Issue description
- Error messages
- Stack traces
- Reproduction steps
```
**Execution**:
1. Understand problem symptoms
2. Gather evidence from code
3. Form hypothesis
4. Test hypothesis
5. Document root cause
6. Suggest fixes
**Output**:
```
WORKER_RESULT:
- action: debug
- status: success
- summary: "Root cause: memory leak in event listeners"
- next_suggestion: develop (apply fixes)
ROOT_CAUSE_ANALYSIS:
hypothesis: "Listener accumulation"
confidence: high
evidence: [...]
mechanism: "Detailed explanation"
FIX_RECOMMENDATIONS:
1. Add removeAllListeners() on disconnect
2. Verification: Monitor memory usage
```
**Effects**:
- `skill_state.findings` updated
- Fix recommendations documented
- Ready for develop to apply fixes
**Failure Modes**:
- Insufficient info → request more data
- Multiple hypotheses → rank by likelihood
- Inconclusive → suggest investigation areas
---
### VALIDATE
**Purpose**: Run tests, check coverage, quality gates
**Preconditions**:
- Code exists to validate
- `status === 'running'`
**Input**:
```
- Files to test
- Test configuration
- Coverage requirements
```
**Execution**:
1. Identify test framework
2. Run unit tests
3. Run integration tests
4. Measure coverage
5. Check quality (lint, types, security)
6. Generate report
**Output**:
```
WORKER_RESULT:
- action: validate
- status: success
- summary: "113 tests pass, coverage 95%"
- next_suggestion: complete (all pass) | develop (fix failures)
TEST_RESULTS:
unit_tests: { passed: 98, failed: 0 }
integration_tests: { passed: 15, failed: 0 }
coverage: "95%"
QUALITY_CHECKS:
lint: ✓ Pass
types: ✓ Pass
security: ✓ Pass
```
**Effects**:
- Test results documented
- Coverage measured
- Quality gates verified
**Failure Modes**:
- Tests fail → document failures, suggest fixes
- Coverage low → identify gaps
- Quality issues → flag problems
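The choice between `complete` and `develop` for `next_suggestion` can be sketched as a small function. This is a minimal illustration, assuming the field names from the TEST_RESULTS example and boolean values for the quality checks; `suggestAfterValidate` is a hypothetical name, not part of the skill.

```javascript
// Hypothetical helper: derive next_suggestion from validation results.
// Assumes quality checks are booleans (true = pass), unlike the check marks in the sample output.
function suggestAfterValidate(testResults, qualityChecks) {
  const failedTests =
    testResults.unit_tests.failed + testResults.integration_tests.failed
  const failedChecks = Object.keys(qualityChecks).filter(name => !qualityChecks[name])
  const allPass = failedTests === 0 && failedChecks.length === 0
  return {
    next_suggestion: allPass ? 'complete' : 'develop',
    failed_tests: failedTests,
    failed_checks: failedChecks
  }
}

const passing = suggestAfterValidate(
  { unit_tests: { passed: 98, failed: 0 }, integration_tests: { passed: 15, failed: 0 } },
  { lint: true, types: true, security: true }
)
const failing = suggestAfterValidate(
  { unit_tests: { passed: 96, failed: 2 }, integration_tests: { passed: 15, failed: 0 } },
  { lint: true, types: false, security: true }
)
```

Returning the failed checks alongside the suggestion lets the next develop pass see what to fix without re-running validation.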
---
### COMPLETE
**Purpose**: Finalize session, generate summary, commit
**Preconditions**:
- All tasks completed
- Tests passing
- `status === 'running'`
**Input**:
```
- All worker outputs
- Progress files
- Current state
```
**Execution**:
1. Read all worker outputs
2. Consolidate achievements
3. Verify completeness
4. Generate summary
5. Prepare commit message
6. Cleanup and archive
**Output**:
```
WORKER_RESULT:
- action: complete
- status: success
- summary: "Session completed successfully"
- next_suggestion: null
SESSION_SUMMARY:
achievements: [...]
files_changed: [...]
test_results: { ... }
quality_checks: { ... }
COMMIT_SUGGESTION:
message: "feat: ..."
files: [...]
ready_for_pr: true
```
**Effects**:
- `status` → 'completed'
- Summary file created
- Progress archived
- Commit message ready
**Failure Modes**:
- Pending tasks remain → mark partial
- Quality gates fail → list failures
---
## Action Flow Diagrams
### Interactive Mode Flow
```
+------+
| INIT |
+------+
   |
   v
+------+  user selects
| MENU |-------------+
+------+             |
   ^                 v
   |          +--------------+
   |          | spawn worker |
   |          +--------------+
   |                 |
   |                 v
   |          +------+-------+
   +----------| wait result  |
              +------+-------+
                     |
                     v
              +------+-------+
              | update state |
              +--------------+
                     |
                     v
               [completed?] --no--> [back to MENU]
                     |
                    yes
                     v
               +----------+
               | COMPLETE |
               +----------+
```
### Auto Mode Flow
```
+------+      +---------+      +-------+      +----------+      +----------+
| INIT | ---> | DEVELOP | ---> | DEBUG | ---> | VALIDATE | ---> | COMPLETE |
+------+      +---------+      +-------+      +----------+      +----------+
                   ^               |                |
                   |               +--- [issues]    |
                   +--------------------------------+
                              [tests fail]
```
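One possible reading of the auto mode diagram, written as a transition function. This is a sketch under assumptions: the outcome flags `issuesFound` and `testsFailed` are hypothetical names, and both back-edges are taken to return to DEVELOP.

```javascript
// Hypothetical transition function for auto mode.
// Returns the next action name, or null once COMPLETE has run.
function nextAutoAction(current, outcome = {}) {
  switch (current) {
    case 'INIT':
      return 'DEVELOP'
    case 'DEVELOP':
      return 'DEBUG'
    case 'DEBUG':
      // [issues] edge: go back and apply fixes
      return outcome.issuesFound ? 'DEVELOP' : 'VALIDATE'
    case 'VALIDATE':
      // [tests fail] edge: go back to development
      return outcome.testsFailed ? 'DEVELOP' : 'COMPLETE'
    default:
      return null
  }
}

const happyPath = ['INIT']
while (nextAutoAction(happyPath[happyPath.length - 1]) !== null) {
  happyPath.push(nextAutoAction(happyPath[happyPath.length - 1]))
}
```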
### Parallel Mode Flow
```
       +------+
       | INIT |
       +------+
           |
           v
+---------------------+
| spawn all workers   |
| [develop, debug,    |
|  validate]          |
+---------------------+
           |
           v
+---------------------+
| wait({ ids: all })  |
+---------------------+
           |
           v
+---------------------+
| merge results       |
+---------------------+
           |
           v
+---------------------+
| coordinator decides |
+---------------------+
           |
           v
     +----------+
     | COMPLETE |
     +----------+
```
## Worker Coordination
| Scenario | Worker Sequence | Mode |
|----------|-----------------|------|
| Simple task | init → develop → validate → complete | Auto |
| Complex task | init → develop → debug → develop → validate → complete | Auto |
| Bug fix | init → debug → develop → validate → complete | Auto |
| Analysis | init → [develop \|\| debug \|\| validate] → complete | Parallel |
| Interactive | init → menu → user selects → worker → menu → ... | Interactive |
## Best Practices
1. **Init always first**: Parse requirements before execution
2. **Validate often**: After each develop phase
3. **Debug when needed**: Don't skip diagnosis
4. **Complete always last**: Ensure proper cleanup
5. **Use parallel wisely**: For independent analysis tasks
6. **Follow sequence**: In auto mode, respect dependencies

@@ -1,171 +0,0 @@
# CCW Loop Skill (Codex Version)
Stateless iterative development loop workflow using Codex subagent pattern.
## Overview
CCW Loop is an autonomous development workflow that supports:
- **Develop**: Task decomposition -> Code implementation -> Progress tracking
- **Debug**: Hypothesis generation -> Evidence collection -> Root cause analysis
- **Validate**: Test execution -> Coverage check -> Quality assessment
## Subagent Mechanism
Core API: `spawn_agent` / `wait` / `send_input` / `close_agent`
Available modes: single-agent deep interaction / multi-agent parallel / hybrid mode
## Installation
Files are in `.codex/skills/ccw-loop/`:
```
.codex/skills/ccw-loop/
+-- SKILL.md                     # Main skill definition
+-- README.md                    # This file
+-- phases/
|   +-- orchestrator.md          # Orchestration logic
|   +-- state-schema.md          # State structure
|   +-- actions/
|       +-- action-init.md       # Initialize session
|       +-- action-develop.md    # Development task
|       +-- action-debug.md      # Hypothesis debugging
|       +-- action-validate.md   # Test validation
|       +-- action-complete.md   # Complete loop
|       +-- action-menu.md       # Interactive menu
+-- specs/
|   +-- action-catalog.md        # Action catalog
+-- templates/
    +-- (templates)
.codex/agents/
+-- ccw-loop-executor.md         # Executor agent role
```
## Usage
### Start New Loop
```bash
# Direct call with task description
/ccw-loop TASK="Implement user authentication"
# Auto-cycle mode
/ccw-loop --auto TASK="Fix login bug and add tests"
```
### Continue Existing Loop
```bash
# Resume from loop ID
/ccw-loop --loop-id=loop-v2-20260122-abc123
# API triggered (from Dashboard)
/ccw-loop --loop-id=loop-v2-20260122-abc123 --auto
```
## Execution Flow
```
1. Parse arguments (task or --loop-id)
2. Create/read state from .workflow/.loop/{loopId}.json
3. spawn_agent with ccw-loop-executor role
4. Main loop:
a. wait() for agent output
b. Parse ACTION_RESULT
c. Handle outcome:
- COMPLETED/PAUSED/STOPPED: exit loop
- WAITING_INPUT: collect user input, send_input
- Next action: send_input to continue
d. Update state file
5. close_agent when done
```
## Session Files
```
.workflow/.loop/
+-- {loopId}.json          # Master state (API + Skill)
+-- {loopId}.progress/
    +-- develop.md         # Development timeline
    +-- debug.md           # Understanding evolution
    +-- validate.md        # Validation report
    +-- changes.log        # Code changes (NDJSON)
    +-- debug.log          # Debug log (NDJSON)
    +-- summary.md         # Completion summary
```
## Codex Pattern Highlights
### Single Agent Deep Interaction
Instead of creating multiple agents, use `send_input` for multi-phase:
```javascript
const agent = spawn_agent({ message: role + task })
// Phase 1: INIT
const initResult = wait({ ids: [agent] })
// Phase 2: DEVELOP (via send_input, same agent)
send_input({ id: agent, message: 'Execute DEVELOP' })
const devResult = wait({ ids: [agent] })
// Phase 3: VALIDATE (via send_input, same agent)
send_input({ id: agent, message: 'Execute VALIDATE' })
const valResult = wait({ ids: [agent] })
// Only close when all done
close_agent({ id: agent })
```
### Role Path Passing
Agent reads role file itself (no content embedding):
```javascript
spawn_agent({
message: `
### MANDATORY FIRST STEPS
1. **Read role definition**: ~/.codex/agents/ccw-loop-executor.md
2. Read: .workflow/project-tech.json
...
`
})
```
### Explicit Lifecycle Management
- Always use `wait({ ids })` to get results
- Never assume `close_agent` returns results
- Call `close_agent` only after confirming that no further interaction is needed
## Error Handling
| Situation | Action |
|-----------|--------|
| Agent timeout | `send_input` requesting convergence |
| Session not found | Create new session |
| State corrupted | Rebuild from progress files |
| Tests fail | Loop back to DEBUG |
| >10 iterations | Warn and suggest break |
## Integration
### Dashboard Integration
Works with CCW Dashboard Loop Monitor:
- Dashboard creates loop via API
- API triggers this skill with `--loop-id`
- Skill reads/writes `.workflow/.loop/{loopId}.json`
- Dashboard polls state for real-time updates
### Control Signals
- `paused`: Skill exits gracefully, waits for resume
- `failed`: Skill terminates
- `running`: Skill continues execution
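The three signals can be mapped to an orchestrator decision as sketched below. This is illustrative only: `checkControlSignal` is a hypothetical name, and treating any unrecognized status as a stop is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: translate the shared `status` field into a decision.
function checkControlSignal(state) {
  switch (state.status) {
    case 'running':
      return { proceed: true, next_action: null }
    case 'paused':
      return { proceed: false, next_action: 'PAUSED' }
    case 'failed':
      return { proceed: false, next_action: 'STOPPED' }
    default:
      // Assumption: an unknown status is handled like a stop
      return { proceed: false, next_action: 'STOPPED' }
  }
}

const signals = ['running', 'paused', 'failed'].map(status => checkControlSignal({ status }))
```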
## License
MIT

@@ -1,350 +0,0 @@
---
name: CCW Loop
description: Stateless iterative development loop workflow with documented progress. Supports develop, debug, and validate phases with file-based state tracking. Triggers on "ccw-loop", "dev loop", "development loop", "开发循环", "迭代开发".
argument-hint: TASK="<task description>" [--loop-id=<id>] [--auto]
---
# CCW Loop - Codex Stateless Iterative Development Workflow
Stateless iterative development loop using Codex subagent pattern. Supports develop, debug, and validate phases with file-based state tracking.
## Arguments
| Arg | Required | Description |
|-----|----------|-------------|
| TASK | No | Task description (for new loop, mutually exclusive with --loop-id) |
| --loop-id | No | Existing loop ID to continue (from API or previous session) |
| --auto | No | Auto-cycle mode (develop -> debug -> validate -> complete) |
## Unified Architecture (Codex Subagent Pattern)
```
+-------------------------------------------------------------+
| Dashboard (UI) |
| [Create] [Start] [Pause] [Resume] [Stop] [View Progress] |
+-------------------------------------------------------------+
|
v
+-------------------------------------------------------------+
| loop-v2-routes.ts (Control Plane) |
| |
| State: .workflow/.loop/{loopId}.json (MASTER) |
| Tasks: .workflow/.loop/{loopId}.tasks.jsonl |
| |
| /start -> Trigger ccw-loop skill with --loop-id |
| /pause -> Set status='paused' (skill checks before action) |
| /stop -> Set status='failed' (skill terminates) |
| /resume -> Set status='running' (skill continues) |
+-------------------------------------------------------------+
|
v
+-------------------------------------------------------------+
| ccw-loop Skill (Execution Plane) |
| |
| Codex Pattern: spawn_agent -> wait -> send_input -> close |
| |
| Reads/Writes: .workflow/.loop/{loopId}.json (unified state) |
| Writes: .workflow/.loop/{loopId}.progress/* (progress files) |
| |
| BEFORE each action: |
| -> Check status: paused/stopped -> exit gracefully |
| -> running -> continue with action |
| |
| Actions: init -> develop -> debug -> validate -> complete |
+-------------------------------------------------------------+
```
## Key Design Principles (Codex Adaptation)
1. **Unified State**: API and Skill share `.workflow/.loop/{loopId}.json` state file
2. **Control Signals**: Skill checks status field before each action (paused/stopped)
3. **File-Driven**: All progress documented in `.workflow/.loop/{loopId}.progress/`
4. **Resumable**: Continue any loop with `--loop-id`
5. **Dual Trigger**: Supports API trigger (`--loop-id`) and direct call (task description)
6. **Single Agent Deep Interaction**: Use send_input for multi-phase execution instead of multiple agents
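## Subagent Mechanism
### Core API
| API | Purpose |
|-----|---------|
| `spawn_agent({ message })` | Create a subagent; returns `agent_id` |
| `wait({ ids, timeout_ms })` | Wait for results (the only way to retrieve results) |
| `send_input({ id, message })` | Continue the interaction or ask follow-up questions |
| `close_agent({ id })` | Close and reclaim the agent (irreversible) |
### Available Modes
- **Single-agent deep interaction**: one agent runs multiple phases, continued via `send_input`
- **Multi-agent parallel**: a main coordinator plus multiple workers, awaited in batch with `wait({ ids: [...] })`
- **Hybrid mode**: combine the two as needed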
## Execution Modes
### Mode 1: Interactive
User manually selects each action, suitable for complex tasks.
```
User -> Select action -> Execute -> View results -> Select next action
```
### Mode 2: Auto-Loop
Automatic execution in preset order, suitable for standard development flow.
```
Develop -> Debug -> Validate -> (if issues) -> Develop -> ...
```
## Session Structure (Unified Location)
```
.workflow/.loop/
+-- {loopId}.json          # Master state file (API + Skill shared)
+-- {loopId}.tasks.jsonl   # Task list (API managed)
+-- {loopId}.progress/     # Skill progress files
    +-- develop.md         # Development progress timeline
    +-- debug.md           # Understanding evolution document
    +-- validate.md        # Validation report
    +-- changes.log        # Code changes log (NDJSON)
    +-- debug.log          # Debug log (NDJSON)
```
## Implementation (Codex Subagent Pattern)
### Session Setup
```javascript
// Helper: Get UTC+8 (China Standard Time) ISO string
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString()
// loopId source:
// 1. API trigger: from --loop-id parameter
// 2. Direct call: generate new loop-v2-{timestamp}-{random}
const loopId = args['--loop-id'] || (() => {
const timestamp = getUtc8ISOString().replace(/[-:]/g, '').split('.')[0]
const random = Math.random().toString(36).substring(2, 10)
return `loop-v2-${timestamp}-${random}`
})()
const loopFile = `.workflow/.loop/${loopId}.json`
const progressDir = `.workflow/.loop/${loopId}.progress`
// Create progress directory, including missing parents (like `mkdir -p`)
fs.mkdirSync(progressDir, { recursive: true })
```
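Note that `getUtc8ISOString` returns a string that still ends in `Z`, so parsing it back with `new Date()` reads the shifted wall-clock time as UTC. Where timestamps are later parsed (for example to compute durations), a variant with an explicit offset avoids the ambiguity. The following is a sketch, not the skill's helper; `toUtc8OffsetString` is a hypothetical name.

```javascript
// Hypothetical alternative: UTC+8 wall-clock time with an explicit +08:00 offset.
function toUtc8OffsetString(date = new Date()) {
  const shifted = new Date(date.getTime() + 8 * 60 * 60 * 1000)
  // Drop the milliseconds and the misleading "Z", then append the real offset
  return shifted.toISOString().replace(/\.\d{3}Z$/, '+08:00')
}

const instant = new Date('2026-01-22T00:00:00Z')
const stamped = toUtc8OffsetString(instant)   // '2026-01-22T08:00:00+08:00'
const roundTrip = new Date(stamped).getTime() // same instant as `instant`
```

Because the offset is part of the string, `new Date(stamped)` recovers the original instant, which keeps elapsed-time arithmetic correct.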
### Main Execution Flow (Single Agent Deep Interaction)
```javascript
// ==================== CODEX CCW-LOOP: SINGLE AGENT ORCHESTRATOR ====================
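// Step 1: Read or create initial state
// Assumption: TASK and --auto are exposed on `args` the same way as --loop-id
const existingLoopId = args['--loop-id']
const taskDescription = args['TASK']
const mode = args['--auto'] ? 'auto' : 'interactive'
let state = null
if (existingLoopId) {
  // Check the raw content before parsing so a missing file is reported instead of throwing
  const raw = Read(`.workflow/.loop/${loopId}.json`)
  if (!raw) {
    console.error(`Loop not found: ${loopId}`)
    return
  }
  state = JSON.parse(raw)
} else {
  state = createInitialState(loopId, taskDescription)
  Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
}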
// Step 2: Create orchestrator agent (single agent handles all phases)
const agent = spawn_agent({
message: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/ccw-loop-executor.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
## LOOP CONTEXT
- **Loop ID**: ${loopId}
- **State File**: .workflow/.loop/${loopId}.json
- **Progress Dir**: ${progressDir}
- **Mode**: ${mode} // 'interactive' or 'auto'
## CURRENT STATE
${JSON.stringify(state, null, 2)}
## TASK DESCRIPTION
${taskDescription}
## EXECUTION INSTRUCTIONS
You are executing CCW Loop orchestrator. Your job:
1. **Check Control Signals**
- Read .workflow/.loop/${loopId}.json
- If status === 'paused' -> Output "PAUSED" and stop
- If status === 'failed' -> Output "STOPPED" and stop
- If status === 'running' -> Continue
2. **Select Next Action**
Based on skill_state:
- If not initialized -> Execute INIT
- If mode === 'interactive' -> Output MENU and wait for input
- If mode === 'auto' -> Auto-select based on state
3. **Execute Action**
- Follow action instructions from ~/.codex/skills/ccw-loop/phases/actions/
- Update progress files in ${progressDir}/
- Update state in .workflow/.loop/${loopId}.json
4. **Output Format**
\`\`\`
ACTION_RESULT:
- action: {action_name}
- status: success | failed | needs_input
- message: {user message}
- state_updates: {JSON of skill_state updates}
NEXT_ACTION_NEEDED: {action_name} | WAITING_INPUT | COMPLETED | PAUSED | STOPPED
\`\`\`
## FIRST ACTION
${!state.skill_state ? 'Execute: INIT' : mode === 'auto' ? 'Auto-select next action' : 'Show MENU'}
`
})
// Step 3: Main orchestration loop
let iteration = 0
const maxIterations = state.max_iterations || 10
while (iteration < maxIterations) {
  iteration++
  // Wait for agent output
  const result = wait({ ids: [agent], timeout_ms: 600000 })
  const output = result.status[agent].completed
  // Parse action result
  const actionResult = parseActionResult(output)
  // Handle different outcomes
  switch (actionResult.next_action) {
    case 'COMPLETED':
    case 'PAUSED':
    case 'STOPPED':
      close_agent({ id: agent })
      return actionResult
    case 'WAITING_INPUT': {
      // Interactive mode: display menu, get user choice
      const userChoice = await displayMenuAndGetChoice(actionResult)
      // Send user choice back to agent
      send_input({
        id: agent,
        message: `
## USER INPUT RECEIVED
Action selected: ${userChoice.action}
${userChoice.data ? `Additional data: ${JSON.stringify(userChoice.data)}` : ''}
## EXECUTE SELECTED ACTION
Follow instructions for: ${userChoice.action}
Update state and progress files accordingly.
`
      })
      break
    }
    default:
      // Auto mode: agent continues to next action
      // Check if we need to prompt for continuation
      if (actionResult.next_action && actionResult.next_action !== 'NONE') {
        send_input({
          id: agent,
          message: `
## CONTINUE EXECUTION
Previous action completed: ${actionResult.action}
Result: ${actionResult.status}
## EXECUTE NEXT ACTION
Continue with: ${actionResult.next_action}
`
        })
      }
  }
  // Update iteration count in state
  const currentState = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
  currentState.current_iteration = iteration
  currentState.updated_at = getUtc8ISOString()
  Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(currentState, null, 2))
}
// Step 4: Cleanup
close_agent({ id: agent })
```
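The loop above calls `parseActionResult`, which is not defined in this file. A minimal sketch of what such a parser might look like follows, assuming the agent emits the single-line fields from the Output Format template; multi-line `state_updates` JSON is deliberately not handled here, and the `'NONE'` fallback is an assumption chosen to match the check in the `default` branch.

```javascript
// Hypothetical parser for the ACTION_RESULT / NEXT_ACTION_NEEDED text format.
function parseActionResult(output) {
  // Read a single-line "- name: value" field; returns null when absent
  const field = (name) => {
    const match = output.match(new RegExp(`^- ${name}:[ \\t]*(.*)$`, 'm'))
    return match ? match[1].trim() : null
  }
  const next = output.match(/^NEXT_ACTION_NEEDED:[ \t]*(\S+)/m)
  return {
    action: field('action'),
    status: field('status'),
    message: field('message'),
    // Assumption: a missing marker is reported as 'NONE'
    next_action: next ? next[1] : 'NONE'
  }
}

const sampleOutput = [
  'ACTION_RESULT:',
  '- action: DEVELOP',
  '- status: success',
  '- message: Task completed',
  '- state_updates: {}',
  'NEXT_ACTION_NEEDED: VALIDATE'
].join('\n')
const parsedResult = parseActionResult(sampleOutput)
const missingMarker = parseActionResult('agent produced free-form text')
```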
## Action Catalog
| Action | Purpose | Output Files | Trigger |
|--------|---------|--------------|---------|
| [action-init](phases/actions/action-init.md) | Initialize loop session | meta.json, state.json | First run |
| [action-develop](phases/actions/action-develop.md) | Execute development task | develop.md, changes.log | Has pending tasks |
| [action-debug](phases/actions/action-debug.md) | Hypothesis-driven debug | debug.md, hypotheses.json | Needs debugging |
| [action-validate](phases/actions/action-validate.md) | Test and validate | validate.md, test-results.json | Needs validation |
| [action-complete](phases/actions/action-complete.md) | Complete loop | summary.md | All done |
| [action-menu](phases/actions/action-menu.md) | Display action menu | - | Interactive mode |
## Usage
```bash
# Start new loop (direct call)
/ccw-loop TASK="Implement user authentication"
# Continue existing loop (API trigger or manual resume)
/ccw-loop --loop-id=loop-v2-20260122-abc123
# Auto-cycle mode
/ccw-loop --auto TASK="Fix login bug and add tests"
# API triggered auto-cycle
/ccw-loop --loop-id=loop-v2-20260122-abc123 --auto
```
## Reference Documents
| Document | Purpose |
|----------|---------|
| [phases/orchestrator.md](phases/orchestrator.md) | Orchestrator: state reading + action selection |
| [phases/state-schema.md](phases/state-schema.md) | State structure definition |
| [specs/loop-requirements.md](specs/loop-requirements.md) | Loop requirements specification |
| [specs/action-catalog.md](specs/action-catalog.md) | Action catalog |
## Error Handling
| Situation | Action |
|-----------|--------|
| Session not found | Create new session |
| State file corrupted | Rebuild from progress files |
| Agent timeout | send_input to request convergence |
| Agent unexpectedly closed | Re-spawn, paste previous output |
| Tests fail | Loop back to develop/debug |
| >10 iterations | Warn user, suggest break |
## Codex Best Practices Applied
1. **Role Path Passing**: Agent reads role file itself (no content embedding)
2. **Single Agent Deep Interaction**: Use send_input for multi-phase instead of multiple agents
3. **Delayed close_agent**: Only close after confirming no more interaction needed
4. **Context Reuse**: Same agent maintains all exploration context automatically
5. **Explicit wait()**: Always use wait({ ids }) to get results, not close_agent

@@ -1,269 +0,0 @@
# Action: COMPLETE
Complete CCW Loop session and generate summary report.
## Purpose
- Generate completion report
- Aggregate all phase results
- Provide follow-up recommendations
- Offer expansion to issues
- Mark status as completed
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state !== null
## Execution Steps
### Step 1: Verify Control Signals
```javascript
const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
if (state.status !== 'running') {
  return {
    action: 'COMPLETE',
    status: 'failed',
    message: `Cannot complete: status is ${state.status}`,
    next_action: state.status === 'paused' ? 'PAUSED' : 'STOPPED'
  }
}
```
### Step 2: Aggregate Statistics
```javascript
const stats = {
// Time statistics
duration: Date.now() - new Date(state.created_at).getTime(),
iterations: state.current_iteration,
// Development statistics
develop: {
total_tasks: state.skill_state.develop.total,
completed_tasks: state.skill_state.develop.completed,
completion_rate: state.skill_state.develop.total > 0
? ((state.skill_state.develop.completed / state.skill_state.develop.total) * 100).toFixed(1)
: 0
},
// Debug statistics
debug: {
iterations: state.skill_state.debug.iteration,
hypotheses_tested: state.skill_state.debug.hypotheses.length,
root_cause_found: state.skill_state.debug.confirmed_hypothesis !== null
},
// Validation statistics
validate: {
runs: state.skill_state.validate.test_results.length,
passed: state.skill_state.validate.passed,
coverage: state.skill_state.validate.coverage,
failed_tests: state.skill_state.validate.failed_tests.length
}
}
```
### Step 3: Generate Summary Report
```javascript
const timestamp = getUtc8ISOString()
const summaryReport = `# CCW Loop Session Summary
**Loop ID**: ${loopId}
**Task**: ${state.description}
**Started**: ${state.created_at}
**Completed**: ${timestamp}
**Duration**: ${formatDuration(stats.duration)}
---
## Executive Summary
${state.skill_state.validate.passed
? 'All tests passed, validation successful'
: state.skill_state.develop.completed === state.skill_state.develop.total
? 'Development complete, validation not passed - needs debugging'
: 'Task partially complete - pending items remain'}
---
## Development Phase
| Metric | Value |
|--------|-------|
| Total Tasks | ${stats.develop.total_tasks} |
| Completed | ${stats.develop.completed_tasks} |
| Completion Rate | ${stats.develop.completion_rate}% |
### Completed Tasks
${state.skill_state.develop.tasks.filter(t => t.status === 'completed').map(t => `
- ${t.description}
- Files: ${t.files_changed?.join(', ') || 'N/A'}
- Completed: ${t.completed_at}
`).join('\n')}
### Pending Tasks
${state.skill_state.develop.tasks.filter(t => t.status !== 'completed').map(t => `
- ${t.description}
`).join('\n') || '_None_'}
---
## Debug Phase
| Metric | Value |
|--------|-------|
| Iterations | ${stats.debug.iterations} |
| Hypotheses Tested | ${stats.debug.hypotheses_tested} |
| Root Cause Found | ${stats.debug.root_cause_found ? 'Yes' : 'No'} |
${stats.debug.root_cause_found ? `
### Confirmed Root Cause
${state.skill_state.debug.confirmed_hypothesis}: ${state.skill_state.debug.hypotheses.find(h => h.id === state.skill_state.debug.confirmed_hypothesis)?.description}
` : ''}
---
## Validation Phase
| Metric | Value |
|--------|-------|
| Test Runs | ${stats.validate.runs} |
| Status | ${stats.validate.passed ? 'PASSED' : 'FAILED'} |
| Coverage | ${stats.validate.coverage != null ? stats.validate.coverage + '%' : 'N/A'} |
| Failed Tests | ${stats.validate.failed_tests} |
---
## Recommendations
${generateRecommendations(stats, state)}
---
## Session Artifacts
| File | Description |
|------|-------------|
| \`develop.md\` | Development progress timeline |
| \`debug.md\` | Debug exploration and learnings |
| \`validate.md\` | Validation report |
| \`test-results.json\` | Test execution results |
---
*Generated by CCW Loop at ${timestamp}*
`
Write(`${progressDir}/summary.md`, summaryReport)
```
### Step 4: Update State to Completed
```javascript
state.status = 'completed'
state.completed_at = timestamp
state.updated_at = timestamp
state.skill_state.last_action = 'COMPLETE'
state.skill_state.summary = stats
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
```
## Output Format
```
ACTION_RESULT:
- action: COMPLETE
- status: success
- message: Loop completed. Duration: {duration}, Iterations: {N}
- state_updates: {
"status": "completed",
"completed_at": "{timestamp}"
}
FILES_UPDATED:
- .workflow/.loop/{loopId}.json: Status set to completed
- .workflow/.loop/{loopId}.progress/summary.md: Summary report generated
NEXT_ACTION_NEEDED: COMPLETED
```
## Expansion Options
After completion, offer expansion to issues:
```
## Expansion Options
Would you like to create follow-up issues?
1. [test] Add more test cases
2. [enhance] Feature enhancements
3. [refactor] Code refactoring
4. [doc] Documentation updates
5. [none] No expansion needed
Select options (comma-separated) or 'none':
```
## Helper Functions
```javascript
function formatDuration(ms) {
const seconds = Math.floor(ms / 1000)
const minutes = Math.floor(seconds / 60)
const hours = Math.floor(minutes / 60)
if (hours > 0) {
return `${hours}h ${minutes % 60}m`
} else if (minutes > 0) {
return `${minutes}m ${seconds % 60}s`
} else {
return `${seconds}s`
}
}
function generateRecommendations(stats, state) {
const recommendations = []
if (stats.develop.completion_rate < 100) {
recommendations.push('- Complete remaining development tasks')
}
if (!stats.validate.passed) {
recommendations.push('- Fix failing tests')
}
if (stats.validate.coverage && stats.validate.coverage < 80) {
recommendations.push(`- Improve test coverage (current: ${stats.validate.coverage}%)`)
}
if (recommendations.length === 0) {
recommendations.push('- Consider code review')
recommendations.push('- Update documentation')
recommendations.push('- Prepare for deployment')
}
return recommendations.join('\n')
}
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Report generation failed | Show basic stats, skip file write |
| Issue creation failed | Log error, continue completion |
## Next Actions
- None (terminal state)
- To continue: Use `/ccw-loop --loop-id={loopId}` to reopen (will set status back to running)

@@ -1,286 +0,0 @@
# Action: DEBUG
Hypothesis-driven debugging with understanding evolution documentation.
## Purpose
- Locate error source
- Generate testable hypotheses
- Add NDJSON instrumentation
- Analyze log evidence
- Correct understanding based on evidence
- Apply fixes
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state !== null
## Mode Detection
```javascript
const understandingPath = `${progressDir}/debug.md`
const debugLogPath = `${progressDir}/debug.log`
const understandingExists = fs.existsSync(understandingPath)
const logHasContent = fs.existsSync(debugLogPath) && fs.statSync(debugLogPath).size > 0
const debugMode = logHasContent ? 'analyze' : (understandingExists ? 'continue' : 'explore')
```
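The mode choice depends only on two file checks, so it can be separated from the filesystem and exercised directly. A small sketch, with `selectDebugMode` as a hypothetical name:

```javascript
// Hypothetical pure version of the mode detection:
// existing log content wins, then an existing understanding document, else first exploration.
function selectDebugMode({ understandingExists, logHasContent }) {
  if (logHasContent) return 'analyze'
  return understandingExists ? 'continue' : 'explore'
}

const firstRun = selectDebugMode({ understandingExists: false, logHasContent: false })
const afterRepro = selectDebugMode({ understandingExists: true, logHasContent: true })
const noLogsYet = selectDebugMode({ understandingExists: true, logHasContent: false })
```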
## Execution Steps
### Mode: Explore (First Debug)
#### Step E1: Get Bug Description
```javascript
// From test failures or user input
const bugDescription = state.skill_state.validate?.failed_tests?.[0]
|| await getUserInput('Describe the bug:')
```
#### Step E2: Search Codebase
```javascript
// Use ACE search_context to find related code
const searchResults = mcp__ace-tool__search_context({
project_root_path: '.',
query: `code related to: ${bugDescription}`
})
```
#### Step E3: Generate Hypotheses
```javascript
const hypotheses = [
{
id: 'H1',
description: 'Most likely cause',
testable_condition: 'What to check',
logging_point: 'file.ts:functionName:42',
evidence_criteria: {
confirm: 'If we see X, hypothesis confirmed',
reject: 'If we see Y, hypothesis rejected'
},
likelihood: 1,
status: 'pending',
evidence: null,
verdict_reason: null
},
// H2, H3...
]
```
#### Step E4: Create Understanding Document
```javascript
const initialUnderstanding = `# Understanding Document
**Loop ID**: ${loopId}
**Bug Description**: ${bugDescription}
**Started**: ${getUtc8ISOString()}
---
## Exploration Timeline
### Iteration 1 - Initial Exploration (${getUtc8ISOString()})
#### Current Understanding
Based on bug description and code search:
- Error pattern: [identified pattern]
- Affected areas: [files/modules]
- Initial hypothesis: [first thoughts]
#### Evidence from Code Search
[Search results summary]
#### Hypotheses
${hypotheses.map(h => `
**${h.id}**: ${h.description}
- Testable condition: ${h.testable_condition}
- Logging point: ${h.logging_point}
- Likelihood: ${h.likelihood}
`).join('\n')}
---
## Current Consolidated Understanding
[Summary of what we know so far]
`
Write(understandingPath, initialUnderstanding)
Write(`${progressDir}/hypotheses.json`, JSON.stringify({ hypotheses, iteration: 1 }, null, 2))
```
#### Step E5: Add NDJSON Logging Points
```javascript
// For each hypothesis, add instrumentation
for (const hypothesis of hypotheses) {
const [file, func, line] = hypothesis.logging_point.split(':')
const logStatement = `console.log(JSON.stringify({
hid: "${hypothesis.id}",
ts: Date.now(),
func: "${func}",
data: { /* relevant context */ }
}))`
// Add to file using Edit tool
}
```
### Mode: Analyze (Has Logs)
#### Step A1: Parse Debug Log
```javascript
const logContent = Read(debugLogPath)
const entries = logContent.split('\n')
.filter(l => l.trim())
.map(l => JSON.parse(l))
// Group by hypothesis ID
const byHypothesis = entries.reduce((acc, e) => {
acc[e.hid] = acc[e.hid] || []
acc[e.hid].push(e)
return acc
}, {})
```
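Instrumented runs can leave a truncated or non-JSON line in the log, and a bare `JSON.parse` would then abort the whole analysis. A tolerant variant might look like the sketch below; `parseNdjson` and `groupByHypothesisId` are hypothetical names, and the `hid` field follows the logging format used in Step E5.

```javascript
// Hypothetical tolerant NDJSON reader: keeps valid records, reports the rest.
function parseNdjson(content) {
  const entries = []
  const skipped = []
  content.split('\n').forEach((line, index) => {
    if (!line.trim()) return
    try {
      entries.push(JSON.parse(line))
    } catch (err) {
      skipped.push({ line: index + 1, reason: err.message })
    }
  })
  return { entries, skipped }
}

// Group records by hypothesis ID (`hid`)
function groupByHypothesisId(entries) {
  return entries.reduce((acc, entry) => {
    acc[entry.hid] = acc[entry.hid] || []
    acc[entry.hid].push(entry)
    return acc
  }, {})
}

const sampleLog = [
  '{"hid":"H1","ts":1,"func":"connect"}',
  '{"hid":"H2","ts":2,"func":"close"}',
  '{"hid":"H1","ts":3,"fu',
  '{"hid":"H1","ts":4,"func":"connect"}',
  ''
].join('\n')
const { entries: logEntries, skipped: skippedLines } = parseNdjson(sampleLog)
const grouped = groupByHypothesisId(logEntries)
```

Reporting skipped line numbers instead of silently dropping them keeps the evidence trail auditable.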
#### Step A2: Evaluate Evidence
```javascript
const hypothesesData = JSON.parse(Read(`${progressDir}/hypotheses.json`))
for (const hypothesis of hypothesesData.hypotheses) {
  const evidence = byHypothesis[hypothesis.id] || []
  // Evaluate against criteria
  if (matchesConfirmCriteria(evidence, hypothesis.evidence_criteria.confirm)) {
    hypothesis.status = 'confirmed'
    hypothesis.evidence = evidence
    hypothesis.verdict_reason = 'Evidence matches confirm criteria'
  } else if (matchesRejectCriteria(evidence, hypothesis.evidence_criteria.reject)) {
    hypothesis.status = 'rejected'
    hypothesis.evidence = evidence
    hypothesis.verdict_reason = 'Evidence matches reject criteria'
  } else {
    hypothesis.status = 'inconclusive'
    hypothesis.evidence = evidence
    hypothesis.verdict_reason = 'Insufficient evidence'
  }
}
```
#### Step A3: Update Understanding
```javascript
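const iteration = hypothesesData.iteration + 1
const timestamp = getUtc8ISOString()
// First hypothesis marked 'confirmed' during evidence evaluation, or undefined if none
const confirmedHypothesis = hypothesesData.hypotheses.find(h => h.status === 'confirmed')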
const analysisEntry = `
### Iteration ${iteration} - Evidence Analysis (${timestamp})
#### Log Analysis Results
${hypothesesData.hypotheses.map(h => `
**${h.id}**: ${h.status.toUpperCase()}
- Evidence: ${JSON.stringify(h.evidence?.slice(0, 3))}
- Reasoning: ${h.verdict_reason}
`).join('\n')}
#### Corrected Understanding
[Any corrections to previous assumptions]
${confirmedHypothesis ? `
#### Root Cause Identified
**${confirmedHypothesis.id}**: ${confirmedHypothesis.description}
` : `
#### Next Steps
[What to investigate next]
`}
---
`
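const existingUnderstanding = Read(understandingPath) || ''
Write(understandingPath, existingUnderstanding + analysisEntry)
// Persist the evaluated hypotheses together with the new iteration count
Write(`${progressDir}/hypotheses.json`, JSON.stringify({ hypotheses: hypothesesData.hypotheses, iteration }, null, 2))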
```
### Step: Update State
```javascript
state.skill_state.debug.active_bug = bugDescription
state.skill_state.debug.hypotheses = hypothesesData.hypotheses
state.skill_state.debug.hypotheses_count = hypothesesData.hypotheses.length
state.skill_state.debug.iteration = iteration
state.skill_state.debug.last_analysis_at = timestamp
if (confirmedHypothesis) {
state.skill_state.debug.confirmed_hypothesis = confirmedHypothesis.id
}
state.skill_state.last_action = 'DEBUG'
state.updated_at = timestamp
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
```
## Output Format
```
ACTION_RESULT:
- action: DEBUG
- status: success
- message: {Mode description} - {result summary}
- state_updates: {
"debug.iteration": {N},
"debug.confirmed_hypothesis": "{id or null}"
}
FILES_UPDATED:
- .workflow/.loop/{loopId}.progress/debug.md: Understanding updated
- .workflow/.loop/{loopId}.progress/hypotheses.json: Hypotheses updated
- [Source files]: Instrumentation added
NEXT_ACTION_NEEDED: {DEBUG | VALIDATE | DEVELOP | MENU | WAITING_INPUT}
```
## Next Action Selection
```javascript
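// True only when every hypothesis has been ruled out by the evidence
const allRejected = hypothesesData.hypotheses.every(h => h.status === 'rejected')
if (confirmedHypothesis) {
  // Root cause found, apply fix and validate
  return 'VALIDATE'
} else if (allRejected) {
  // Generate new hypotheses
  return 'DEBUG'
} else {
  // Need more evidence - prompt user to reproduce bug
  return 'WAITING_INPUT' // User needs to trigger bug
}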
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Empty debug.log | Prompt user to reproduce bug |
| All hypotheses rejected | Generate new hypotheses |
| >5 iterations | Suggest escalation |
## Next Actions
- Root cause found: `VALIDATE`
- Need more evidence: `WAITING_INPUT` until the user reproduces the bug, then `DEBUG`
- All rejected: `DEBUG` (new hypotheses)

@@ -1,183 +0,0 @@
# Action: DEVELOP
Execute development task and record progress to develop.md.
## Purpose
- Execute next pending development task
- Implement code changes
- Record progress to Markdown file
- Update task status in state
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state !== null
- [ ] state.skill_state.develop.tasks.some(t => t.status === 'pending')
## Execution Steps
### Step 1: Verify Control Signals
```javascript
const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
if (state.status !== 'running') {
  return {
    action: 'DEVELOP',
    status: 'failed',
    message: `Cannot develop: status is ${state.status}`,
    next_action: state.status === 'paused' ? 'PAUSED' : 'STOPPED'
  }
}
```
### Step 2: Find Next Pending Task
```javascript
const tasks = state.skill_state.develop.tasks
const currentTask = tasks.find(t => t.status === 'pending')
if (!currentTask) {
return {
action: 'DEVELOP',
status: 'success',
message: 'All development tasks completed',
next_action: state.skill_state.mode === 'auto' ? 'VALIDATE' : 'MENU'
}
}
// Mark as in_progress
currentTask.status = 'in_progress'
```
### Step 3: Execute Development Task
```javascript
console.log(`Executing task: ${currentTask.description}`)
// Use appropriate tools based on task type
// - ACE search_context for finding patterns
// - Read for loading files
// - Edit/Write for making changes
// Record files changed
const filesChanged = []
// Implementation logic...
```
### Step 4: Record Changes to Log (NDJSON)
```javascript
const progressDir = `.workflow/.loop/${loopId}.progress`
const changesLogPath = `${progressDir}/changes.log`
const timestamp = getUtc8ISOString()
const changeEntry = {
timestamp: timestamp,
task_id: currentTask.id,
description: currentTask.description,
files_changed: filesChanged,
result: 'success'
}
// Append to NDJSON log
const existingLog = Read(changesLogPath) || ''
Write(changesLogPath, existingLog + JSON.stringify(changeEntry) + '\n')
```
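Because `changes.log` holds one JSON object per line, it can be read back by splitting on newlines. The following is a minimal sketch, assuming entries have the shape written in Step 4; `parseChangesLog` is a hypothetical helper and not part of the skill itself.

```javascript
// Hypothetical helper: parse NDJSON text into an array of change entries.
// Blank lines are skipped; malformed lines are collected instead of throwing,
// so one bad line does not hide the rest of the log.
function parseChangesLog(text) {
  const entries = []
  const malformed = []
  for (const line of (text || '').split('\n')) {
    if (line.trim() === '') continue
    try {
      entries.push(JSON.parse(line))
    } catch (e) {
      malformed.push(line)
    }
  }
  return { entries, malformed }
}

// Usage with two entries in the same shape as changeEntry above
const sampleLog =
  JSON.stringify({ task_id: 'task-001', files_changed: ['src/a.js'], result: 'success' }) + '\n' +
  JSON.stringify({ task_id: 'task-002', files_changed: [], result: 'success' }) + '\n'
const parsedLog = parseChangesLog(sampleLog)
console.log(parsedLog.entries.map(e => e.task_id))
```

Collecting malformed lines rather than throwing is a design choice for this sketch; a stricter reader could fail on the first bad line instead.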
### Step 5: Update Progress Document
```javascript
const progressPath = `${progressDir}/develop.md`
const iteration = state.skill_state.develop.completed + 1
const progressEntry = `
### Iteration ${iteration} - ${currentTask.description} (${timestamp})
#### Task Details
- **ID**: ${currentTask.id}
- **Tool**: ${currentTask.tool}
- **Mode**: ${currentTask.mode}
#### Implementation Summary
[Implementation description]
#### Files Changed
${filesChanged.map(f => `- \`${f}\``).join('\n') || '- No files changed'}
#### Status: COMPLETED
---
`
const existingProgress = Read(progressPath) || ''
Write(progressPath, existingProgress + progressEntry)
```
### Step 6: Update State
```javascript
currentTask.status = 'completed'
currentTask.completed_at = timestamp
currentTask.files_changed = filesChanged
state.skill_state.develop.completed += 1
state.skill_state.develop.current_task = null
state.skill_state.develop.last_progress_at = timestamp
state.skill_state.last_action = 'DEVELOP'
state.skill_state.completed_actions.push('DEVELOP')
state.updated_at = timestamp
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
```
## Output Format
```
ACTION_RESULT:
- action: DEVELOP
- status: success
- message: Task completed: {task_description}
- state_updates: {
"develop.completed": {N},
"develop.last_progress_at": "{timestamp}"
}
FILES_UPDATED:
- .workflow/.loop/{loopId}.json: Task status updated
- .workflow/.loop/{loopId}.progress/develop.md: Progress entry added
- .workflow/.loop/{loopId}.progress/changes.log: Change entry added
NEXT_ACTION_NEEDED: {DEVELOP | DEBUG | VALIDATE | MENU}
```
## Auto Mode Next Action Selection
```javascript
const pendingTasks = tasks.filter(t => t.status === 'pending')
if (pendingTasks.length > 0) {
return 'DEVELOP' // More tasks to do
} else {
return 'DEBUG' // All done, check for issues
}
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Task execution failed | Mark task as failed, continue to next |
| File write failed | Retry once, then report error |
| All tasks done | Move to DEBUG or VALIDATE |
## Next Actions
- More pending tasks: `DEVELOP`
- All tasks complete: `DEBUG` (auto) or `MENU` (interactive)
- Task failed: `DEVELOP` (retry) or `DEBUG` (investigate)

@@ -1,164 +0,0 @@
# Action: INIT
Initialize CCW Loop session, create directory structure and initial state.
## Purpose
- Create session directory structure
- Initialize state file with skill_state
- Analyze task description to generate development tasks
- Prepare execution environment
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state === null
## Execution Steps
### Step 1: Verify Control Signals
```javascript
const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
if (state.status !== 'running') {
return {
action: 'INIT',
status: 'failed',
message: `Cannot init: status is ${state.status}`,
next_action: state.status === 'paused' ? 'PAUSED' : 'STOPPED'
}
}
```
### Step 2: Create Directory Structure
```javascript
const progressDir = `.workflow/.loop/${loopId}.progress`
// Directories created by orchestrator, verify they exist
// mkdir -p ${progressDir}
```
### Step 3: Analyze Task and Generate Tasks
```javascript
// Analyze task description
const taskDescription = state.description
// Generate 3-7 development tasks based on analysis
// Use ACE search or smart_search to find relevant patterns
const tasks = [
{
id: 'task-001',
description: 'Task description based on analysis',
tool: 'gemini',
mode: 'write',
status: 'pending',
priority: 1,
files: [],
created_at: getUtc8ISOString(),
completed_at: null
}
// ... more tasks
]
```
### Step 4: Initialize Progress Document
```javascript
const progressPath = `${progressDir}/develop.md`
const progressInitial = `# Development Progress
**Loop ID**: ${loopId}
**Task**: ${taskDescription}
**Started**: ${getUtc8ISOString()}
---
## Task List
${tasks.map((t, i) => `${i + 1}. [ ] ${t.description}`).join('\n')}
---
## Progress Timeline
`
Write(progressPath, progressInitial)
```
### Step 5: Update State
```javascript
const skillState = {
current_action: 'init',
last_action: null,
completed_actions: [],
mode: mode,
develop: {
total: tasks.length,
completed: 0,
current_task: null,
tasks: tasks,
last_progress_at: null
},
debug: {
active_bug: null,
hypotheses_count: 0,
hypotheses: [],
confirmed_hypothesis: null,
iteration: 0,
last_analysis_at: null
},
validate: {
pass_rate: 0,
coverage: 0,
test_results: [],
passed: false,
failed_tests: [],
last_run_at: null
},
errors: []
}
state.skill_state = skillState
state.updated_at = getUtc8ISOString()
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
```
## Output Format
```
ACTION_RESULT:
- action: INIT
- status: success
- message: Session initialized with {N} development tasks
FILES_UPDATED:
- .workflow/.loop/{loopId}.json: skill_state initialized
- .workflow/.loop/{loopId}.progress/develop.md: Progress document created
NEXT_ACTION_NEEDED: {DEVELOP (auto) | MENU (interactive)}
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Directory creation failed | Report error, stop |
| Task analysis failed | Create single generic task |
| State write failed | Retry once, then stop |
## Next Actions
- Success (auto mode): `DEVELOP`
- Success (interactive): `MENU`
- Failed: Report error

@@ -1,205 +0,0 @@
# Action: MENU
Display interactive action menu for user selection.
## Purpose
- Show current state summary
- Display available actions
- Wait for user selection
- Return selected action
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state !== null
- [ ] mode === 'interactive'
## Execution Steps
### Step 1: Verify Control Signals
```javascript
const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
if (state.status !== 'running') {
return {
action: 'MENU',
status: 'failed',
message: `Cannot show menu: status is ${state.status}`,
next_action: state.status === 'paused' ? 'PAUSED' : 'STOPPED'
}
}
```
### Step 2: Generate Status Summary
```javascript
// Development progress
const developProgress = state.skill_state.develop.total > 0
? `${state.skill_state.develop.completed}/${state.skill_state.develop.total} (${((state.skill_state.develop.completed / state.skill_state.develop.total) * 100).toFixed(0)}%)`
: 'Not started'
// Debug status
const debugStatus = state.skill_state.debug.confirmed_hypothesis
? 'Root cause found'
: state.skill_state.debug.iteration > 0
? `Iteration ${state.skill_state.debug.iteration}`
: 'Not started'
// Validation status
const validateStatus = state.skill_state.validate.passed
? 'PASSED'
: state.skill_state.validate.test_results.length > 0
? `FAILED (${state.skill_state.validate.failed_tests.length} failures)`
: 'Not run'
```
### Step 3: Display Menu
```javascript
const menuDisplay = `
================================================================
CCW Loop - ${loopId}
================================================================
Task: ${state.description}
Iteration: ${state.current_iteration}
+-----------------------------------------------------+
| Phase | Status |
+-----------------------------------------------------+
| Develop | ${developProgress.padEnd(35)}|
| Debug | ${debugStatus.padEnd(35)}|
| Validate | ${validateStatus.padEnd(35)}|
+-----------------------------------------------------+
================================================================
MENU_OPTIONS:
1. [develop] Continue Development - ${state.skill_state.develop.total - state.skill_state.develop.completed} tasks pending
2. [debug] Start Debugging - ${debugStatus}
3. [validate] Run Validation - ${validateStatus}
4. [status] View Detailed Status
5. [complete] Complete Loop
6. [exit] Exit (save and quit)
`
console.log(menuDisplay)
```
## Output Format
```
ACTION_RESULT:
- action: MENU
- status: success
- message: ${menuDisplay}
MENU_OPTIONS:
1. [develop] Continue Development - {N} tasks pending
2. [debug] Start Debugging - {status}
3. [validate] Run Validation - {status}
4. [status] View Detailed Status
5. [complete] Complete Loop
6. [exit] Exit (save and quit)
NEXT_ACTION_NEEDED: WAITING_INPUT
```
## User Input Handling
When user provides input, orchestrator sends it back via `send_input`:
```javascript
// User selects "develop"
send_input({
id: agent,
message: `
## USER INPUT RECEIVED
Action selected: develop
## EXECUTE SELECTED ACTION
Execute DEVELOP action.
`
})
```
## Status Detail View
If user selects "status":
```javascript
const detailView = `
## Detailed Status
### Development Progress
${Read(`${progressDir}/develop.md`)?.substring(0, 1000) || 'No progress recorded'}
### Debug Status
${state.skill_state.debug.hypotheses.length > 0
? state.skill_state.debug.hypotheses.map(h => ` ${h.id}: ${h.status} - ${h.description.substring(0, 50)}...`).join('\n')
: ' No debugging started'}
### Validation Results
${state.skill_state.validate.test_results.length > 0
? ` Last run: ${state.skill_state.validate.last_run_at}
Pass rate: ${state.skill_state.validate.pass_rate}%`
: ' No validation run yet'}
`
console.log(detailView)
// Return to menu
return {
action: 'MENU',
status: 'success',
message: detailView,
next_action: 'MENU' // Show menu again
}
```
## Exit Handling
If user selects "exit":
```javascript
// Save current state
state.status = 'user_exit'
state.updated_at = getUtc8ISOString()
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
return {
action: 'MENU',
status: 'success',
message: 'Session saved. Use --loop-id to resume.',
next_action: 'COMPLETED'
}
```
## Action Mapping
| User Selection | Next Action |
|----------------|-------------|
| develop | DEVELOP |
| debug | DEBUG |
| validate | VALIDATE |
| status | MENU (after showing details) |
| complete | COMPLETE |
| exit | COMPLETED (save and exit) |
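
The mapping above can be expressed as a lookup table. This is a sketch only: `ACTION_MAP` and `resolveNextAction` are hypothetical names, and falling back to `MENU` for unknown input is an assumption based on the "Invalid selection" recovery rule in this document.

```javascript
// Hypothetical lookup mirroring the Action Mapping table.
const ACTION_MAP = {
  develop: 'DEVELOP',
  debug: 'DEBUG',
  validate: 'VALIDATE',
  status: 'MENU',
  complete: 'COMPLETE',
  exit: 'COMPLETED'
}

// Normalize the user's selection and map it to the next action.
// Unknown or empty selections fall back to MENU so the menu is shown again.
function resolveNextAction(selection) {
  const key = String(selection ?? '').trim().toLowerCase()
  // hasOwnProperty guards against inherited keys such as "toString"
  return Object.prototype.hasOwnProperty.call(ACTION_MAP, key) ? ACTION_MAP[key] : 'MENU'
}

console.log(resolveNextAction('validate'))
```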
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Invalid selection | Show menu again |
| User cancels | Return to menu |
## Next Actions
Based on user selection - forwarded via `send_input` by orchestrator.

@@ -1,250 +0,0 @@
# Action: VALIDATE
Run tests and verify implementation, record results to validate.md.
## Purpose
- Run unit tests
- Run integration tests
- Check code coverage
- Generate validation report
- Determine pass/fail status
## Preconditions
- [ ] state.status === 'running'
- [ ] state.skill_state !== null
- [ ] (develop.completed > 0) OR (debug.confirmed_hypothesis !== null)
## Execution Steps
### Step 1: Verify Control Signals
```javascript
const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
if (state.status !== 'running') {
return {
action: 'VALIDATE',
status: 'failed',
message: `Cannot validate: status is ${state.status}`,
next_action: state.status === 'paused' ? 'PAUSED' : 'STOPPED'
}
}
```
### Step 2: Detect Test Framework
```javascript
const packageJson = JSON.parse(Read('package.json') || '{}')
// Run scripts through npm so locally installed runners in node_modules/.bin resolve
const testScript = 'npm test'
const coverageScript = packageJson.scripts?.['test:coverage'] ? 'npm run test:coverage' : null
```
### Step 3: Run Tests
```javascript
const testResult = await Bash({
command: testScript,
timeout: 300000 // 5 minutes
})
// Parse test output based on framework
const testResults = parseTestOutput(testResult.stdout, testResult.stderr)
```
### Step 4: Run Coverage (if available)
```javascript
let coverageData = null
if (coverageScript) {
const coverageResult = await Bash({
command: coverageScript,
timeout: 300000
})
coverageData = parseCoverageReport(coverageResult.stdout)
Write(`${progressDir}/coverage.json`, JSON.stringify(coverageData, null, 2))
}
```
### Step 5: Generate Validation Report
```javascript
const timestamp = getUtc8ISOString()
const iteration = (state.skill_state.validate.test_results?.length || 0) + 1
const validationReport = `# Validation Report
**Loop ID**: ${loopId}
**Task**: ${state.description}
**Validated**: ${timestamp}
---
## Iteration ${iteration} - Validation Run
### Test Execution Summary
| Metric | Value |
|--------|-------|
| Total Tests | ${testResults.total} |
| Passed | ${testResults.passed} |
| Failed | ${testResults.failed} |
| Skipped | ${testResults.skipped} |
| Duration | ${testResults.duration_ms}ms |
| **Pass Rate** | **${testResults.total > 0 ? ((testResults.passed / testResults.total) * 100).toFixed(1) : '0.0'}%** |
### Coverage Report
${coverageData ? `
| File | Statements | Branches | Functions | Lines |
|------|------------|----------|-----------|-------|
${coverageData.files.map(f => `| ${f.path} | ${f.statements}% | ${f.branches}% | ${f.functions}% | ${f.lines}% |`).join('\n')}
**Overall Coverage**: ${coverageData.overall.statements}%
` : '_No coverage data available_'}
### Failed Tests
${testResults.failed > 0 ? testResults.failures.map(f => `
#### ${f.test_name}
- **Suite**: ${f.suite}
- **Error**: ${f.error_message}
`).join('\n') : '_All tests passed_'}
---
## Validation Decision
**Result**: ${testResults.failed === 0 ? 'PASS' : 'FAIL'}
${testResults.failed > 0 ? `
### Next Actions
1. Review failed tests
2. Debug failures using DEBUG action
3. Fix issues and re-run validation
` : `
### Next Actions
1. Consider code review
2. Complete loop
`}
`
Write(`${progressDir}/validate.md`, validationReport)
```
### Step 6: Save Test Results
```javascript
const testResultsData = {
iteration,
timestamp,
summary: {
total: testResults.total,
passed: testResults.passed,
failed: testResults.failed,
skipped: testResults.skipped,
pass_rate: testResults.total > 0 ? ((testResults.passed / testResults.total) * 100).toFixed(1) : '0.0', // avoid NaN when no tests ran
duration_ms: testResults.duration_ms
},
tests: testResults.tests,
failures: testResults.failures,
coverage: coverageData?.overall || null
}
Write(`${progressDir}/test-results.json`, JSON.stringify(testResultsData, null, 2))
```
### Step 7: Update State
```javascript
const validationPassed = testResults.failed === 0 && testResults.passed > 0
state.skill_state.validate.test_results.push(testResultsData)
state.skill_state.validate.pass_rate = parseFloat(testResultsData.summary.pass_rate)
state.skill_state.validate.coverage = coverageData?.overall?.statements || 0
state.skill_state.validate.passed = validationPassed
state.skill_state.validate.failed_tests = testResults.failures.map(f => f.test_name)
state.skill_state.validate.last_run_at = timestamp
state.skill_state.last_action = 'VALIDATE'
state.updated_at = timestamp
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
```
## Output Format
```
ACTION_RESULT:
- action: VALIDATE
- status: success
- message: Validation {PASSED | FAILED} - {pass_count}/{total_count} tests passed
- state_updates: {
"validate.passed": {true | false},
"validate.pass_rate": {N},
"validate.failed_tests": [{list}]
}
FILES_UPDATED:
- .workflow/.loop/{loopId}.progress/validate.md: Validation report created
- .workflow/.loop/{loopId}.progress/test-results.json: Test results saved
- .workflow/.loop/{loopId}.progress/coverage.json: Coverage data saved (if available)
NEXT_ACTION_NEEDED: {COMPLETE | DEBUG | DEVELOP | MENU}
```
## Next Action Selection
```javascript
if (validationPassed) {
const pendingTasks = state.skill_state.develop.tasks.filter(t => t.status === 'pending')
if (pendingTasks.length === 0) {
return 'COMPLETE'
} else {
return 'DEVELOP'
}
} else {
// Tests failed - need debugging
return 'DEBUG'
}
```
## Test Output Parsers
### Jest/Vitest Parser
```javascript
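function parseJestOutput(stdout) {
  // Jest lists failed before passed and omits zero counts,
  // e.g. "Tests: 1 failed, 2 passed, 3 total", so match each count on its own.
  const summary = stdout.match(/^Tests:\s+(.+)$/m)?.[1] ?? ''
  const count = (label) => Number(summary.match(new RegExp(`(\\d+)\\s+${label}`))?.[1] ?? 0)
  // ... implementation (build the result from count('passed'), count('failed'), count('total'))
}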
```
### Pytest Parser
```javascript
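function parsePytestOutput(stdout) {
  // pytest lists failed before passed and omits zero counts,
  // e.g. "1 failed, 2 passed in 0.12s", so match each count on its own.
  const passed = Number(stdout.match(/(\d+)\s+passed/)?.[1] ?? 0)
  const failed = Number(stdout.match(/(\d+)\s+failed/)?.[1] ?? 0)
  // ... implementation
}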
```
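Step 3 calls `parseTestOutput(stdout, stderr)` without defining it. One way it could dispatch between the two summary formats is sketched below. This is an assumption about the intended behavior, not the skill's actual implementation; `countLabel` and the `recognized` flag are hypothetical, and only the Jest-style `Tests:` line and the pytest-style `=== ... in 0.12s ===` line are handled. Both streams are scanned because some runners write their summary to stderr.

```javascript
// Hypothetical: pull "<n> <label>" out of a summary line; a missing label counts as 0.
function countLabel(summary, label) {
  const m = summary.match(new RegExp('(\\d+)\\s+' + label + '\\b'))
  return m ? Number(m[1]) : 0
}

// Sketch of a dispatcher that recognizes two summary formats.
function parseTestOutput(stdout, stderr = '') {
  const text = stdout + '\n' + stderr
  // Jest-style summary: "Tests:       1 failed, 2 passed, 3 total"
  const jest = text.match(/^Tests:\s+(.+)$/m)
  // pytest-style summary: "===== 1 failed, 2 passed in 0.12s ====="
  const pytest = text.match(/^=+ (.+) in [\d.]+s\b.*=+$/m)
  const summary = jest ? jest[1] : pytest ? pytest[1] : ''
  const passed = countLabel(summary, 'passed')
  const failed = countLabel(summary, 'failed')
  const skipped = countLabel(summary, 'skipped')
  // Jest reports a total; for pytest it is derived from the parts
  const total = jest ? countLabel(summary, 'total') : passed + failed + skipped
  return { total, passed, failed, skipped, recognized: Boolean(jest || pytest) }
}

console.log(parseTestOutput('', 'Tests:       1 failed, 2 passed, 3 total'))
```

The `recognized` flag lets a caller distinguish "no tests ran" from "output format not understood", which matters for the "Tests don't run" recovery path.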
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Tests don't run | Check test script config, report error |
| All tests fail | Suggest DEBUG action |
| Coverage tool missing | Skip coverage, run tests only |
| Timeout | Increase timeout or split tests |
## Next Actions
- Validation passed, no pending: `COMPLETE`
- Validation passed, has pending: `DEVELOP`
- Validation failed: `DEBUG`

@@ -1,416 +0,0 @@
# Orchestrator (Codex Pattern)
Orchestrate CCW Loop using Codex subagent pattern: `spawn_agent -> wait -> send_input -> close_agent`.
## Role
Check control signals -> Read file state -> Select action -> Execute via agent -> Update files -> Loop until complete or paused/stopped.
## Codex Pattern Overview
```
+-- spawn_agent (ccw-loop-executor role) --+
| |
| Phase 1: INIT or first action |
| | |
| v |
| wait() -> get result |
| | |
| v |
| [If needs input] Collect user input |
| | |
| v |
| send_input(user choice + next action) |
| | |
| v |
| wait() -> get result |
| | |
| v |
| [Loop until COMPLETED/PAUSED/STOPPED] |
| | |
+----------v-------------------------------+
|
close_agent()
```
## State Management (Unified Location)
### Read State
```javascript
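// Shift the clock by 8 hours, then label the result +08:00 instead of the default "Z" (UTC) suffix
const getUtc8ISOString = () => new Date(Date.now() + 8 * 60 * 60 * 1000).toISOString().replace('Z', '+08:00')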
/**
* Read loop state (unified location)
* @param loopId - Loop ID (e.g., "loop-v2-20260122-abc123")
*/
function readLoopState(loopId) {
const stateFile = `.workflow/.loop/${loopId}.json`
if (!fs.existsSync(stateFile)) {
return null
}
const state = JSON.parse(Read(stateFile))
return state
}
```
### Create New Loop State (Direct Call)
```javascript
/**
* Create new loop state (only for direct calls, API triggers have existing state)
*/
function createLoopState(loopId, taskDescription) {
const stateFile = `.workflow/.loop/${loopId}.json`
const now = getUtc8ISOString()
const state = {
// API compatible fields
loop_id: loopId,
title: taskDescription.substring(0, 100),
description: taskDescription,
max_iterations: 10,
status: 'running', // Direct call sets to running
current_iteration: 0,
created_at: now,
updated_at: now,
// Skill extension fields
skill_state: null // Initialized by INIT action
}
// Ensure directories exist (-p also creates the parent .workflow/.loop)
Bash({ command: `mkdir -p ".workflow/.loop/${loopId}.progress"` })
Write(stateFile, JSON.stringify(state, null, 2))
return state
}
```
## Main Execution Flow (Codex Subagent)
```javascript
/**
* Run CCW Loop orchestrator using Codex subagent pattern
* @param options.loopId - Existing Loop ID (API trigger)
* @param options.task - Task description (direct call)
* @param options.mode - 'interactive' | 'auto'
*/
async function runOrchestrator(options = {}) {
const { loopId: existingLoopId, task, mode = 'interactive' } = options
console.log('=== CCW Loop Orchestrator (Codex) Started ===')
// 1. Determine loopId and initial state
let loopId
let state
if (existingLoopId) {
// API trigger: use existing loopId
loopId = existingLoopId
state = readLoopState(loopId)
if (!state) {
console.error(`Loop not found: ${loopId}`)
return { status: 'error', message: 'Loop not found' }
}
console.log(`Resuming loop: ${loopId}`)
console.log(`Status: ${state.status}`)
} else if (task) {
// Direct call: create new loopId
const timestamp = getUtc8ISOString().replace(/[-:]/g, '').split('.')[0]
const random = Math.random().toString(36).substring(2, 10)
loopId = `loop-v2-${timestamp}-${random}`
console.log(`Creating new loop: ${loopId}`)
console.log(`Task: ${task}`)
state = createLoopState(loopId, task)
} else {
console.error('Either --loop-id or task description is required')
return { status: 'error', message: 'Missing loopId or task' }
}
const progressDir = `.workflow/.loop/${loopId}.progress`
// 2. Create executor agent (single agent for entire loop)
const agent = spawn_agent({
message: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/ccw-loop-executor.md (MUST read first)
2. Read: .workflow/project-tech.json (if exists)
3. Read: .workflow/project-guidelines.json (if exists)
---
## LOOP CONTEXT
- **Loop ID**: ${loopId}
- **State File**: .workflow/.loop/${loopId}.json
- **Progress Dir**: ${progressDir}
- **Mode**: ${mode}
## CURRENT STATE
${JSON.stringify(state, null, 2)}
## TASK DESCRIPTION
${state.description || task}
## FIRST ACTION
${!state.skill_state ? 'Execute: INIT' : mode === 'auto' ? 'Auto-select next action' : 'Show MENU'}
Read the role definition first, then execute the appropriate action.
`
})
// 3. Main orchestration loop
let iteration = state.current_iteration || 0
const maxIterations = state.max_iterations || 10
let continueLoop = true
while (continueLoop && iteration < maxIterations) {
iteration++
// Wait for agent output
const result = wait({ ids: [agent], timeout_ms: 600000 })
// Check for timeout
if (result.timed_out) {
console.log('Agent timeout, requesting convergence...')
send_input({
id: agent,
message: `
## TIMEOUT NOTIFICATION
Execution timeout reached. Please:
1. Output current progress
2. Save any pending state updates
3. Return ACTION_RESULT with current status
`
})
continue
}
const output = result.status[agent].completed
// Parse action result
const actionResult = parseActionResult(output)
console.log(`\n[Iteration ${iteration}] Action: ${actionResult.action}, Status: ${actionResult.status}`)
// Update iteration in state
state = readLoopState(loopId)
state.current_iteration = iteration
state.updated_at = getUtc8ISOString()
Write(`.workflow/.loop/${loopId}.json`, JSON.stringify(state, null, 2))
// Handle different outcomes
switch (actionResult.next_action) {
case 'COMPLETED':
console.log('Loop completed successfully')
continueLoop = false
break
case 'PAUSED':
console.log('Loop paused by API, exiting gracefully')
continueLoop = false
break
case 'STOPPED':
console.log('Loop stopped by API')
continueLoop = false
break
case 'WAITING_INPUT':
// Interactive mode: display menu, get user choice
if (mode === 'interactive') {
const userChoice = await displayMenuAndGetChoice(actionResult)
// Send user choice back to agent
send_input({
id: agent,
message: `
## USER INPUT RECEIVED
Action selected: ${userChoice.action}
${userChoice.data ? `Additional data: ${JSON.stringify(userChoice.data)}` : ''}
## EXECUTE SELECTED ACTION
Read action instructions and execute: ${userChoice.action}
Update state and progress files accordingly.
Output ACTION_RESULT when complete.
`
})
}
break
default:
// Continue with next action
if (actionResult.next_action && actionResult.next_action !== 'NONE') {
send_input({
id: agent,
message: `
## CONTINUE EXECUTION
Previous action completed: ${actionResult.action}
Result: ${actionResult.status}
${actionResult.message ? `Message: ${actionResult.message}` : ''}
## EXECUTE NEXT ACTION
Continue with: ${actionResult.next_action}
Read action instructions and execute.
Output ACTION_RESULT when complete.
`
})
} else {
// No next action specified, check if should continue
if (actionResult.status === 'failed') {
console.log(`Action failed: ${actionResult.message}`)
}
continueLoop = false
}
}
}
// 4. Check iteration limit
if (iteration >= maxIterations) {
console.log(`\nReached maximum iterations (${maxIterations})`)
console.log('Consider breaking down the task or taking a break.')
}
// 5. Cleanup
close_agent({ id: agent })
console.log('\n=== CCW Loop Orchestrator (Codex) Finished ===')
// Return final state
const finalState = readLoopState(loopId)
return {
status: finalState.status,
loop_id: loopId,
iterations: iteration,
final_state: finalState
}
}
/**
* Parse action result from agent output
*/
function parseActionResult(output) {
const result = {
action: 'unknown',
status: 'unknown',
message: '',
state_updates: {},
next_action: 'NONE'
}
// Parse ACTION_RESULT block
const actionMatch = output.match(/ACTION_RESULT:\s*([\s\S]*?)(?:FILES_UPDATED:|NEXT_ACTION_NEEDED:|$)/)
if (actionMatch) {
const lines = actionMatch[1].split('\n')
for (const line of lines) {
const match = line.match(/^-\s*(\w+):\s*(.+)$/)
if (match) {
const [, key, value] = match
if (key === 'state_updates') {
try {
result.state_updates = JSON.parse(value)
} catch (e) {
// Try parsing multi-line JSON
}
} else {
result[key] = value.trim()
}
}
}
}
// Parse NEXT_ACTION_NEEDED
const nextMatch = output.match(/NEXT_ACTION_NEEDED:\s*(\S+)/)
if (nextMatch) {
result.next_action = nextMatch[1]
}
return result
}
/**
* Display menu and get user choice (interactive mode)
*/
async function displayMenuAndGetChoice(actionResult) {
// Parse MENU_OPTIONS from output
const menuMatch = actionResult.message.match(/MENU_OPTIONS:\s*([\s\S]*?)(?:NEXT_ACTION_NEEDED:|$)/)
if (menuMatch) {
console.log('\n' + menuMatch[1])
}
// Use AskUserQuestion to get choice
const response = await AskUserQuestion({
questions: [{
question: "Select next action:",
header: "Action",
multiSelect: false,
options: [
{ label: "develop", description: "Continue development" },
{ label: "debug", description: "Start debugging" },
{ label: "validate", description: "Run validation" },
{ label: "complete", description: "Complete loop" },
{ label: "exit", description: "Exit and save" }
]
}]
})
return { action: response["Action"] }
}
```
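The `catch` branch in `parseActionResult` leaves multi-line JSON unparsed, while the action output formats show `state_updates` spanning several lines. A possible way to fill that gap is sketched here, assuming the agent emits valid JSON after `state_updates:`; `extractStateUpdates` is a hypothetical helper.

```javascript
// Hypothetical helper: extract the JSON object that follows "state_updates:",
// even when it spans several lines, by scanning for the matching closing brace.
function extractStateUpdates(output) {
  const start = output.indexOf('state_updates:')
  if (start === -1) return {}
  const open = output.indexOf('{', start)
  if (open === -1) return {}
  let depth = 0
  let inString = false
  for (let i = open; i < output.length; i++) {
    const ch = output[i]
    if (inString) {
      if (ch === '\\') i++ // skip the escaped character
      else if (ch === '"') inString = false
    } else if (ch === '"') {
      inString = true
    } else if (ch === '{') {
      depth++
    } else if (ch === '}') {
      depth--
      if (depth === 0) {
        try {
          return JSON.parse(output.slice(open, i + 1))
        } catch (e) {
          return {} // not valid JSON; leave updates empty
        }
      }
    }
  }
  return {} // unbalanced braces
}

const sampleOutput = [
  'ACTION_RESULT:',
  '- action: DEVELOP',
  '- state_updates: {',
  '"develop.completed": 2,',
  '"develop.last_progress_at": "2026-01-22T10:00:00+08:00"',
  '}',
  'FILES_UPDATED:'
].join('\n')
console.log(extractStateUpdates(sampleOutput))
```

Tracking string state keeps a `}` inside a quoted value from ending the scan early.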
## Action Catalog
| Action | Purpose | Preconditions | Effects |
|--------|---------|---------------|---------|
| INIT | Initialize session | status=running, skill_state=null | skill_state initialized |
| MENU | Display menu | skill_state != null, mode=interactive | Wait for user input |
| DEVELOP | Execute dev task | pending tasks > 0 | Update develop.md |
| DEBUG | Hypothesis debug | needs debugging | Update debug.md |
| VALIDATE | Run tests | needs validation | Update validate.md |
| COMPLETE | Finish loop | all done | status=completed |
## Termination Conditions
1. **API Paused**: `state.status === 'paused'` (Skill exits, wait for resume)
2. **API Stopped**: `state.status === 'failed'` (Skill terminates)
3. **Task Complete**: `NEXT_ACTION_NEEDED === 'COMPLETED'`
4. **Iteration Limit**: `current_iteration >= max_iterations`
5. **User Exit**: User selects 'exit' in interactive mode
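
The five conditions can be checked in one place. The sketch below is an illustration under assumptions: `terminationReason` and its return labels are hypothetical, and user exit is detected through the `user_exit` status that the MENU exit handler writes.

```javascript
// Hypothetical predicate combining the termination conditions.
// Returns a label for the first condition that applies, or null to keep looping.
function terminationReason(state, nextAction) {
  if (state.status === 'paused') return 'api_paused'
  if (state.status === 'failed') return 'api_stopped'
  if (state.status === 'user_exit') return 'user_exit'
  if (nextAction === 'COMPLETED') return 'task_complete'
  if (state.current_iteration >= state.max_iterations) return 'iteration_limit'
  return null
}

console.log(terminationReason({ status: 'running', current_iteration: 3, max_iterations: 10 }, 'DEVELOP'))
```

API status is checked before the agent's requested next action so that a pause or stop issued mid-action takes precedence.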
## Error Recovery
| Error Type | Recovery Strategy |
|------------|-------------------|
| Agent timeout | send_input requesting convergence |
| Action failed | Log error, continue or prompt user |
| State corrupted | Rebuild from progress files |
| Agent closed unexpectedly | Re-spawn with previous output in message |
## Codex Best Practices Applied
1. **Single Agent Pattern**: One agent handles entire loop lifecycle
2. **Deep Interaction via send_input**: Multi-phase without context loss
3. **Delayed close_agent**: Only after confirming no more interaction
4. **Explicit wait()**: Always get results before proceeding
5. **Role Path Passing**: Agent reads role file, no content embedding

@@ -1,388 +0,0 @@
# State Schema (Codex Version)
CCW Loop state structure definition for Codex subagent pattern.
## State File
**Location**: `.workflow/.loop/{loopId}.json` (unified location, API + Skill shared)
## Structure Definition
### Unified Loop State Interface
```typescript
/**
* Unified Loop State - API and Skill shared state structure
* API (loop-v2-routes.ts) owns state control
* Skill (ccw-loop) reads and updates this state via subagent
*/
interface LoopState {
// =====================================================
// API FIELDS (from loop-v2-routes.ts)
// These fields are managed by API, Skill read-only
// =====================================================
loop_id: string // Loop ID, e.g., "loop-v2-20260122-abc123"
title: string // Loop title
description: string // Loop description
max_iterations: number // Maximum iteration count
status: 'created' | 'running' | 'paused' | 'completed' | 'failed' | 'user_exit'
current_iteration: number // Current iteration count
created_at: string // Creation time (ISO8601)
updated_at: string // Last update time (ISO8601)
completed_at?: string // Completion time (ISO8601)
failure_reason?: string // Failure reason
// =====================================================
// SKILL EXTENSION FIELDS
// These fields are managed by Skill executor agent
// =====================================================
skill_state?: {
// Current execution action
current_action: 'init' | 'develop' | 'debug' | 'validate' | 'complete' | null
last_action: string | null
completed_actions: string[]
mode: 'interactive' | 'auto'
// === Development Phase ===
develop: {
total: number
completed: number
current_task?: string | null
tasks: DevelopTask[]
last_progress_at: string | null
}
// === Debug Phase ===
debug: {
active_bug?: string | null
hypotheses_count: number
hypotheses: Hypothesis[]
confirmed_hypothesis: string | null
iteration: number
last_analysis_at: string | null
}
// === Validation Phase ===
validate: {
pass_rate: number // Test pass rate (0-100)
coverage: number // Coverage (0-100)
test_results: TestResult[]
passed: boolean
failed_tests: string[]
last_run_at: string | null
}
// === Error Tracking ===
errors: Array<{
action: string
message: string
timestamp: string
}>
// === Summary (after completion) ===
summary?: {
duration: number
iterations: number
develop: object
debug: object
validate: object
}
}
}
interface DevelopTask {
id: string
description: string
tool: 'gemini' | 'qwen' | 'codex' | 'bash'
mode: 'analysis' | 'write'
status: 'pending' | 'in_progress' | 'completed' | 'failed'
files_changed: string[]
created_at: string
completed_at: string | null
}
interface Hypothesis {
id: string // H1, H2, ...
description: string
testable_condition: string
logging_point: string
evidence_criteria: {
confirm: string
reject: string
}
likelihood: number // 1 = most likely
status: 'pending' | 'confirmed' | 'rejected' | 'inconclusive'
evidence: Record<string, any> | null
verdict_reason: string | null
}
interface TestResult {
test_name: string
suite: string
status: 'passed' | 'failed' | 'skipped'
duration_ms: number
error_message: string | null
stack_trace: string | null
}
```
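The interfaces above exist only at compile time, so a state file edited by hand or written by another action is not checked against them. A runtime check could look like the following sketch; `validateDevelopTask` is a hypothetical helper that mirrors the `DevelopTask` interface and is not part of the skill.

```javascript
const TASK_STATUSES = ['pending', 'in_progress', 'completed', 'failed']
const TASK_TOOLS = ['gemini', 'qwen', 'codex', 'bash']
const TASK_MODES = ['analysis', 'write']

// Hypothetical runtime check mirroring the DevelopTask interface.
// Returns a list of problems; an empty list means the task looks valid.
function validateDevelopTask(task) {
  const problems = []
  if (typeof task.id !== 'string' || task.id === '') problems.push('id must be a non-empty string')
  if (typeof task.description !== 'string') problems.push('description must be a string')
  if (!TASK_TOOLS.includes(task.tool)) problems.push(`unknown tool: ${task.tool}`)
  if (!TASK_MODES.includes(task.mode)) problems.push(`unknown mode: ${task.mode}`)
  if (!TASK_STATUSES.includes(task.status)) problems.push(`unknown status: ${task.status}`)
  if (!Array.isArray(task.files_changed)) problems.push('files_changed must be an array')
  return problems
}

console.log(validateDevelopTask({
  id: 'task-001',
  description: 'Create auth component',
  tool: 'gemini',
  mode: 'write',
  status: 'pending',
  files_changed: []
}))
```

Returning a list of problems rather than a boolean makes it easier to report every mismatch at once, for example a task that carries `files` instead of `files_changed`.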
## Initial State
### Created by API (Dashboard Trigger)
```json
{
"loop_id": "loop-v2-20260122-abc123",
"title": "Implement user authentication",
"description": "Add login/logout functionality",
"max_iterations": 10,
"status": "created",
"current_iteration": 0,
"created_at": "2026-01-22T10:00:00+08:00",
"updated_at": "2026-01-22T10:00:00+08:00"
}
```
### After Skill Initialization (INIT action)
```json
{
"loop_id": "loop-v2-20260122-abc123",
"title": "Implement user authentication",
"description": "Add login/logout functionality",
"max_iterations": 10,
"status": "running",
"current_iteration": 0,
"created_at": "2026-01-22T10:00:00+08:00",
"updated_at": "2026-01-22T10:00:05+08:00",
"skill_state": {
"current_action": "init",
"last_action": null,
"completed_actions": [],
"mode": "auto",
"develop": {
"total": 3,
"completed": 0,
"current_task": null,
"tasks": [
{ "id": "task-001", "description": "Create auth component", "status": "pending" }
],
"last_progress_at": null
},
"debug": {
"active_bug": null,
"hypotheses_count": 0,
"hypotheses": [],
"confirmed_hypothesis": null,
"iteration": 0,
"last_analysis_at": null
},
"validate": {
"pass_rate": 0,
"coverage": 0,
"test_results": [],
"passed": false,
"failed_tests": [],
"last_run_at": null
},
"errors": []
}
}
```
## Control Signal Checking (Codex Pattern)
The agent checks control signals at the start of every action:
```javascript
/**
* Check API control signals
* MUST be called at start of every action
* @returns { continue: boolean, action: 'pause_exit' | 'stop_exit' | 'continue' }
*/
function checkControlSignals(loopId) {
  const state = JSON.parse(Read(`.workflow/.loop/${loopId}.json`))
  switch (state.status) {
    case 'paused':
      // API paused the loop, Skill should exit and wait for resume
      return { continue: false, action: 'pause_exit' }
    case 'failed':
      // API stopped the loop (user manual stop)
      return { continue: false, action: 'stop_exit' }
    case 'running':
      // Normal continue
      return { continue: true, action: 'continue' }
    default:
      // Abnormal status
      return { continue: false, action: 'stop_exit' }
  }
}
```
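A minimal sketch of how an action entry point might use this check. `decideFromStatus` and `runGuarded` are hypothetical names; the first is a pure variant that takes an already-parsed state object so it can run without the `Read` tool:

```javascript
// Hypothetical pure variant of checkControlSignals: it takes the parsed
// state object instead of reading the state file.
function decideFromStatus(state) {
  switch (state.status) {
    case 'paused':
      return { continue: false, action: 'pause_exit' }
    case 'failed':
      return { continue: false, action: 'stop_exit' }
    case 'running':
      return { continue: true, action: 'continue' }
    default:
      return { continue: false, action: 'stop_exit' }
  }
}

// Hypothetical guard: run the action body only when the loop may continue.
function runGuarded(state, actionBody) {
  const signal = decideFromStatus(state)
  if (!signal.continue) {
    return { skipped: true, reason: signal.action }
  }
  return { skipped: false, result: actionBody(state) }
}
```

Any status other than `running`, including `created`, falls through to `stop_exit`, which matches the default branch of the original function.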
## State Transitions
### 1. Initialization (INIT action)
```javascript
{
status: 'created' -> 'running', // Or keep 'running' if API already set
updated_at: timestamp,
skill_state: {
current_action: 'init',
mode: 'auto',
develop: {
tasks: [...parsed_tasks],
total: N,
completed: 0
}
}
}
```
### 2. Development (DEVELOP action)
```javascript
{
updated_at: timestamp,
current_iteration: state.current_iteration + 1,
skill_state: {
current_action: 'develop',
last_action: 'DEVELOP',
completed_actions: [..., 'DEVELOP'],
develop: {
current_task: 'task-xxx',
completed: N+1,
last_progress_at: timestamp
}
}
}
```
### 3. Debugging (DEBUG action)
```javascript
{
updated_at: timestamp,
current_iteration: state.current_iteration + 1,
skill_state: {
current_action: 'debug',
last_action: 'DEBUG',
debug: {
active_bug: '...',
hypotheses_count: N,
hypotheses: [...new_hypotheses],
iteration: N+1,
last_analysis_at: timestamp
}
}
}
```
### 4. Validation (VALIDATE action)
```javascript
{
updated_at: timestamp,
current_iteration: state.current_iteration + 1,
skill_state: {
current_action: 'validate',
last_action: 'VALIDATE',
validate: {
test_results: [...results],
pass_rate: 95.5,
coverage: 85.0,
passed: true | false,
failed_tests: ['test1', 'test2'],
last_run_at: timestamp
}
}
}
```
### 5. Completion (COMPLETE action)
```javascript
{
status: 'running' -> 'completed',
completed_at: timestamp,
updated_at: timestamp,
skill_state: {
current_action: 'complete',
last_action: 'COMPLETE',
summary: { ... }
}
}
```
## File Sync
### Unified Location
State-to-file mapping:
| State Field | Sync File | Sync Timing |
|-------------|-----------|-------------|
| Entire LoopState | `.workflow/.loop/{loopId}.json` | Every state change (master) |
| `skill_state.develop` | `.workflow/.loop/{loopId}.progress/develop.md` | After each dev operation |
| `skill_state.debug` | `.workflow/.loop/{loopId}.progress/debug.md` | After each debug operation |
| `skill_state.validate` | `.workflow/.loop/{loopId}.progress/validate.md` | After each validation |
| Code changes log | `.workflow/.loop/{loopId}.progress/changes.log` | Each file modification (NDJSON) |
| Debug log | `.workflow/.loop/{loopId}.progress/debug.log` | Each debug log (NDJSON) |
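The mapping above can be sketched as a small path helper. The function names `loopPaths` and `toNdjsonLine` are hypothetical; only the paths themselves come from the table:

```javascript
// Hypothetical helper: derives every sync target for a loop from its ID,
// following the state-to-file mapping table.
function loopPaths(loopId) {
  const base = `.workflow/.loop/${loopId}`
  const progress = `${base}.progress`
  return {
    master: `${base}.json`,
    develop: `${progress}/develop.md`,
    debug: `${progress}/debug.md`,
    validate: `${progress}/validate.md`,
    changesLog: `${progress}/changes.log`,
    debugLog: `${progress}/debug.log`
  }
}

// NDJSON stores one JSON object per line; this formats a single log entry
// ready to be appended to changes.log or debug.log.
function toNdjsonLine(entry) {
  return JSON.stringify(entry) + '\n'
}
```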
### File Structure
```
.workflow/.loop/
+-- loop-v2-20260122-abc123.json # Master state file (API + Skill)
+-- loop-v2-20260122-abc123.tasks.jsonl # Task list (API managed)
+-- loop-v2-20260122-abc123.progress/ # Skill progress files
+-- develop.md # Development progress
+-- debug.md # Debug understanding
+-- validate.md # Validation report
+-- changes.log # Code changes (NDJSON)
+-- debug.log # Debug log (NDJSON)
+-- summary.md # Completion summary
```
## State Recovery
If the master state file is corrupted, rebuild `skill_state` from the progress files:
```javascript
function rebuildSkillStateFromProgress(loopId) {
  const progressDir = `.workflow/.loop/${loopId}.progress`
  // Parse progress files to rebuild state
  const skill_state = {
    develop: parseProgressFile(`${progressDir}/develop.md`),
    debug: parseProgressFile(`${progressDir}/debug.md`),
    validate: parseProgressFile(`${progressDir}/validate.md`)
  }
  return skill_state
}
```
## Codex Pattern Notes
1. **Agent reads state**: Agent reads `.workflow/.loop/{loopId}.json` at action start
2. **Agent writes state**: Agent updates state after action completion
3. **Orchestrator tracks iterations**: Main loop tracks `current_iteration`
4. **Single agent context**: All state updates in same agent conversation via send_input
5. **No context serialization loss**: State transitions happen in-memory within agent

View File

@@ -1,182 +0,0 @@
# Action Catalog (Codex Version)
CCW Loop available actions and their specifications.
## Available Actions
| Action | Purpose | Preconditions | Effects | Output |
|--------|---------|---------------|---------|--------|
| INIT | Initialize session | status=running, skill_state=null | skill_state initialized | progress/*.md created |
| MENU | Display action menu | skill_state!=null, mode=interactive | Wait for user input | WAITING_INPUT |
| DEVELOP | Execute dev task | pending tasks > 0 | Update development progress | develop.md updated |
| DEBUG | Hypothesis debug | needs debugging | Update debug understanding | debug.md updated |
| VALIDATE | Run tests | needs validation | Update validation report | validate.md updated |
| COMPLETE | Finish loop | all done | status=completed | summary.md created |
## Action Flow (Codex Pattern)
```
spawn_agent (ccw-loop-executor)
|
v
+-------+
| INIT | (if skill_state is null)
+-------+
|
v
+-------+ send_input
| MENU | <------------- (user selection in interactive mode)
+-------+
|
+---+---+---+---+
| | | | |
v v v v v
DEV DBG VAL CMP EXIT
|
v
wait() -> get result
|
v
[Loop continues via send_input]
|
v
close_agent()
```
## Action Execution Pattern
### Single Agent Deep Interaction
All actions executed within same agent via `send_input`:
```javascript
// Initial spawn
const agent = spawn_agent({ message: role + initial_task })
// Execute INIT
const initResult = wait({ ids: [agent] })
// Continue with DEVELOP via send_input
send_input({ id: agent, message: 'Execute DEVELOP' })
const devResult = wait({ ids: [agent] })
// Continue with VALIDATE via send_input
send_input({ id: agent, message: 'Execute VALIDATE' })
const valResult = wait({ ids: [agent] })
// Only close when done
close_agent({ id: agent })
```
### Action Output Format (Standardized)
Every action MUST output:
```
ACTION_RESULT:
- action: {ACTION_NAME}
- status: success | failed | needs_input
- message: {user-facing message}
- state_updates: { ... }
FILES_UPDATED:
- {file_path}: {description}
NEXT_ACTION_NEEDED: {NEXT_ACTION} | WAITING_INPUT | COMPLETED | PAUSED
```
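Because every action emits the same block, the orchestrator can extract the routing fields with a few regular expressions. The sketch below assumes the output follows the layout above exactly; `parseActionResult` is a hypothetical name, and `state_updates` and `FILES_UPDATED` are deliberately left unparsed:

```javascript
// Hypothetical parser for the standardized output block. It extracts only
// the scalar fields needed to decide what to do next.
function parseActionResult(text) {
  const pick = (pattern) => {
    const match = text.match(pattern)
    return match ? match[1].trim() : null
  }
  return {
    action: pick(/^- action:\s*(.+)$/m),
    status: pick(/^- status:\s*(.+)$/m),
    message: pick(/^- message:\s*(.+)$/m),
    nextAction: pick(/^NEXT_ACTION_NEEDED:\s*(.+)$/m)
  }
}
```

A missing field comes back as `null`, which lets the caller treat malformed output as a failed action rather than crash.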
## Action Selection Logic
### Auto Mode
```javascript
function selectNextAction(state) {
  const skillState = state.skill_state
  // 1. Terminal conditions
  if (state.status === 'completed') return null
  if (state.status === 'failed') return null
  if (state.current_iteration >= state.max_iterations) return 'COMPLETE'
  // 2. Initialization check
  if (!skillState) return 'INIT'
  // 3. Auto selection based on state
  const hasPendingDevelop = skillState.develop.tasks.some(t => t.status === 'pending')
  if (hasPendingDevelop) {
    return 'DEVELOP'
  }
  if (skillState.last_action === 'DEVELOP') {
    const needsDebug = skillState.develop.completed < skillState.develop.total
    if (needsDebug) return 'DEBUG'
  }
  if (skillState.last_action === 'DEBUG' || skillState.debug.confirmed_hypothesis) {
    return 'VALIDATE'
  }
  if (skillState.last_action === 'VALIDATE') {
    if (!skillState.validate.passed) return 'DEVELOP'
  }
  if (skillState.validate.passed && !hasPendingDevelop) {
    return 'COMPLETE'
  }
  return 'DEVELOP'
}
```
### Interactive Mode
Returns `MENU` action, which displays options and waits for user input.
## Action Dependencies
| Action | Depends On | Leads To |
|--------|------------|----------|
| INIT | - | MENU or DEVELOP |
| MENU | INIT | User selection |
| DEVELOP | INIT | DEVELOP, DEBUG, VALIDATE |
| DEBUG | INIT | DEVELOP, VALIDATE |
| VALIDATE | DEVELOP or DEBUG | COMPLETE, DEBUG, DEVELOP |
| COMPLETE | - | Terminal |
## Action Sequences
### Happy Path (Auto Mode)
```
INIT -> DEVELOP -> DEVELOP -> DEVELOP -> VALIDATE (pass) -> COMPLETE
```
### Debug Iteration Path
```
INIT -> DEVELOP -> VALIDATE (fail) -> DEBUG -> DEBUG -> VALIDATE (pass) -> COMPLETE
```
### Interactive Path
```
INIT -> MENU -> (user: develop) -> DEVELOP -> MENU -> (user: validate) -> VALIDATE -> MENU -> (user: complete) -> COMPLETE
```
## Error Recovery
| Error | Recovery |
|-------|----------|
| Action timeout | send_input requesting convergence |
| Action failed | Log error, continue or retry |
| Agent closed unexpectedly | Re-spawn with previous output |
| State corrupted | Rebuild from progress files |
## Codex Best Practices
1. **Single agent for all actions**: No need to spawn new agent for each action
2. **Deep interaction via send_input**: Continue conversation in same context
3. **Delayed close_agent**: Only close after all actions complete
4. **Structured output**: Always use ACTION_RESULT format for parsing
5. **Control signal checking**: Check state.status before every action

View File

@@ -1,214 +0,0 @@
---
name: codex-issue-plan-execute
description: Autonomous issue planning and execution workflow for Codex. Supports batch issue processing with integrated planning, queuing, and execution stages. Triggers on "codex-issue", "plan execute issue", "issue workflow".
allowed-tools: Task, AskUserQuestion, Read, Write, Bash, Glob, Grep
---
# Codex Issue Plan-Execute Workflow
Streamlined autonomous workflow for Codex that integrates issue planning, queue management, and solution execution in a single stateful Skill. Supports batch processing with minimal queue overhead and dual-agent execution strategy.
## Architecture Overview
For complete architecture details, system diagrams, and design principles, see **[ARCHITECTURE.md](ARCHITECTURE.md)**.
Key concepts:
- **Persistent Dual-Agent System**: Two long-running agents (Planning + Execution) that maintain context across all tasks
- **Sequential Pipeline**: Issues → Planning Agent → Solutions → Execution Agent → Results
- **Unified Results**: All results accumulated in single `planning-results.json` and `execution-results.json` files
- **Efficient Communication**: Uses `send_input()` for task routing without agent recreation overhead
---
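## ⚠️ Mandatory Prerequisites
> **⛔ Do not skip**: Before performing any operation, you **must** read the two P0 specification documents below. Executing without understanding them will produce output that does not meet the quality standards.
| Document | Purpose | When |
|----------|---------|------|
| [specs/issue-handling.md](specs/issue-handling.md) | Issue handling specification and data structures | **Required reading before execution** |
| [specs/solution-schema.md](specs/solution-schema.md) | Solution data structure and validation rules | **Required reading before execution** |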
---
## Execution Flow
### Phase 1: Initialize Persistent Agents
**See**: [ARCHITECTURE.md](ARCHITECTURE.md) - system architecture
**See**: [phases/orchestrator.md](phases/orchestrator.md) - orchestration logic
→ Spawn Planning Agent with `prompts/planning-agent.md` (stays alive)
→ Spawn Execution Agent with `prompts/execution-agent.md` (stays alive)
### Phase 2: Planning Pipeline
**See**: [phases/actions/action-plan.md](phases/actions/action-plan.md), [specs/subagent-roles.md](specs/subagent-roles.md)
For each issue sequentially:
1. Send issue to Planning Agent via `send_input()` with planning request
2. Wait for Planning Agent to return solution JSON
3. Store result in unified `planning-results.json` array
4. Continue to next issue (agent stays alive)
### Phase 3: Execution Pipeline
**See**: [phases/actions/action-execute.md](phases/actions/action-execute.md), [specs/quality-standards.md](specs/quality-standards.md)
For each successful planning result sequentially:
1. Send solution to Execution Agent via `send_input()` with execution request
2. Wait for Execution Agent to complete implementation and testing
3. Store result in unified `execution-results.json` array
4. Continue to next solution (agent stays alive)
### Phase 4: Finalize
**See**: [phases/actions/action-complete.md](phases/actions/action-complete.md)
→ Close Planning Agent (after all issues planned)
→ Close Execution Agent (after all solutions executed)
→ Generate final report with statistics
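The sequential flow of Phases 2 and 3 can be sketched as two loops. Here `plan` and `execute` are hypothetical callbacks standing in for the `send_input()` and wait round trips to the two persistent agents, and the `'success'` status value is an assumption, since the source does not state how a successful planning result is marked:

```javascript
// Sketch of the Phase 2 -> Phase 3 pipeline under the assumptions above.
function runPipeline(issues, plan, execute) {
  // Phase 2: plan one issue at a time; the Planning Agent stays alive
  const planningResults = []
  for (const issue of issues) {
    planningResults.push(plan(issue))
  }
  // Phase 3: execute only the successful planning results, in order
  const executionResults = []
  for (const planned of planningResults) {
    if (planned.status !== 'success') continue
    executionResults.push(execute(planned))
  }
  // Both arrays correspond to the unified results files
  return { planningResults, executionResults }
}
```

Keeping both result arrays in memory until the end mirrors the single-file `planning-results.json` and `execution-results.json` design.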
### State Schema
```json
{
"status": "pending|running|completed",
"phase": "init|listing|planning|executing|complete",
"issues": {
"{issue_id}": {
"id": "ISS-xxx",
"status": "registered|planning|planned|executing|completed",
"solution_id": "SOL-xxx-1",
"planned_at": "ISO-8601",
"executed_at": "ISO-8601"
}
},
"queue": [
{
"item_id": "S-1",
"issue_id": "ISS-xxx",
"solution_id": "SOL-xxx-1",
"status": "pending|executing|completed"
}
],
"context": {
"work_dir": ".workflow/.scratchpad/...",
"total_issues": 0,
"completed_count": 0,
"failed_count": 0
},
"errors": []
}
```
---
## Directory Setup
```javascript
const timestamp = new Date().toISOString().slice(0,19).replace(/[-:T]/g, '');
const workDir = `.workflow/.scratchpad/codex-issue-${timestamp}`;
Bash(`mkdir -p "${workDir}"`);
Bash(`mkdir -p "${workDir}/solutions"`);
Bash(`mkdir -p "${workDir}/snapshots"`);
```
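The timestamp expression keeps the first 19 characters of the ISO string and strips the separators, which yields a compact `YYYYMMDDHHMMSS` value. Note that `toISOString()` always reports UTC, so the directory name reflects UTC rather than local time. A small sketch with a fixed date (the function name `workDirFor` is hypothetical):

```javascript
// Builds the scratchpad directory name for a given Date, using the same
// expression as the directory setup snippet.
function workDirFor(date) {
  const timestamp = date.toISOString().slice(0, 19).replace(/[-:T]/g, '')
  return `.workflow/.scratchpad/codex-issue-${timestamp}`
}

// 2026-01-22 02:00:05 UTC
console.log(workDirFor(new Date('2026-01-22T02:00:05Z')))
// prints ".workflow/.scratchpad/codex-issue-20260122020005"
```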
## Output Structure
```
.workflow/.scratchpad/codex-issue-{timestamp}/
├── planning-results.json # All planning results in single file
│ ├── phase: "planning"
│ ├── created_at: "ISO-8601"
│ └── results: [
│ { issue_id, solution_id, status, solution, planned_at }
│ ]
├── execution-results.json # All execution results in single file
│ ├── phase: "execution"
│ ├── created_at: "ISO-8601"
│ └── results: [
│ { issue_id, solution_id, status, commit_hash, files_modified, executed_at }
│ ]
└── final-report.md # Summary statistics and report
```
---
## Reference Documents by Phase
### 🔧 Setup & Understanding
For understanding the overall system architecture and execution flow.
| Document | Purpose | Key Topics |
|----------|---------|-----------|
| [phases/orchestrator.md](phases/orchestrator.md) | Core orchestrator logic | How agents are managed, pipeline flow, state transitions |
| [phases/state-schema.md](phases/state-schema.md) | State structure definition | Full state model, validation rules, persistence |
| [specs/agent-roles.md](specs/agent-roles.md) | Agent roles and responsibilities | Detailed description of the Planning and Execution Agents |
### 📋 Planning Phase
Consult during Phase 2: planning logic and issue handling.
| Document | Purpose | When to Use |
|----------|---------|-------------|
| [phases/actions/action-plan.md](phases/actions/action-plan.md) | Planning flow in detail | To understand the issue → solution conversion logic |
| [phases/actions/action-list.md](phases/actions/action-list.md) | Issue list handling | To learn the issue loading and listing logic |
| [specs/issue-handling.md](specs/issue-handling.md) | Issue data specification | To understand issue structure and validation rules ✅ **Required reading** |
| [specs/solution-schema.md](specs/solution-schema.md) | Solution data structure | To learn the solution JSON format ✅ **Required reading** |
### ⚙️ Execution Phase
Consult during Phase 3: implementation and verification logic.
| Document | Purpose | When to Use |
|----------|---------|-------------|
| [phases/actions/action-execute.md](phases/actions/action-execute.md) | Execution flow in detail | To understand the solution → implementation logic |
| [specs/quality-standards.md](specs/quality-standards.md) | Quality standards and acceptance criteria | To check whether an implementation meets the bar |
### 🏁 Completion Phase
Consult during Phase 4: wrap-up and reporting logic.
| Document | Purpose | When to Use |
|----------|---------|-------------|
| [phases/actions/action-complete.md](phases/actions/action-complete.md) | Completion flow | To generate the final report and statistics |
### 🔍 Debugging & Troubleshooting
Consult when something goes wrong, to locate and resolve the problem quickly.
| Issue | Solution Document |
|-------|------------------|
| State anomalies during execution | [phases/state-schema.md](phases/state-schema.md) - verify the state structure |
| Planning Agent output is not as expected | [phases/actions/action-plan.md](phases/actions/action-plan.md) + [specs/solution-schema.md](specs/solution-schema.md) |
| Execution Agent implementation fails | [phases/actions/action-execute.md](phases/actions/action-execute.md) + [specs/quality-standards.md](specs/quality-standards.md) |
| Malformed issue data | [specs/issue-handling.md](specs/issue-handling.md) |
### 📚 Architecture & Agent Definitions
Core design documents.
| Document | Purpose | Notes |
|----------|---------|-------|
| [ARCHITECTURE.md](ARCHITECTURE.md) | System architecture and design principles | Required reading before startup |
| [specs/agent-roles.md](specs/agent-roles.md) | Agent role definitions | Detailed responsibilities of the Planning and Execution Agents |
| [prompts/planning-agent.md](prompts/planning-agent.md) | Unified prompt for the Planning Agent | Used to initialize the Planning Agent |
| [prompts/execution-agent.md](prompts/execution-agent.md) | Unified prompt for the Execution Agent | Used to initialize the Execution Agent |
---
## Usage Examples
### Batch Process Specific Issues
```bash
codex -p "@.codex/prompts/codex-issue-plan-execute ISS-001,ISS-002,ISS-003"
```
### Interactive Selection
```bash
codex -p "@.codex/prompts/codex-issue-plan-execute"
# Then select issues from the list
```
### Resume from Snapshot
```bash
codex -p "@.codex/prompts/codex-issue-plan-execute --resume snapshot-path"
```
---
*Skill Version: 1.0*
*Execution Mode: Autonomous*
*Status: Ready for Customization*

View File

@@ -1,173 +0,0 @@
# Action: Complete
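Complete the workflow and generate the final report.
## Purpose
Serialize the final state, generate the execution summary, and clean up temporary files.
## Preconditions
- [ ] `state.status === "running"`
- [ ] All issues have been processed, or the error limit has been reached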
## Execution
```javascript
async function execute(state) {
const workDir = state.work_dir;
const issues = state.issues || {};
console.log("\n=== Finalizing Workflow ===");
// 1. Compute statistics
const totalIssues = Object.keys(issues).length;
const completedCount = Object.values(issues).filter(i => i.status === "completed").length;
const failedCount = Object.values(issues).filter(i => i.status === "failed").length;
const pendingCount = totalIssues - completedCount - failedCount;
const stats = {
total_issues: totalIssues,
completed: completedCount,
failed: failedCount,
pending: pendingCount,
success_rate: totalIssues > 0 ? ((completedCount / totalIssues) * 100).toFixed(1) : 0,
duration_ms: new Date() - new Date(state.created_at)
};
console.log("\n=== Summary ===");
console.log(`Total Issues: ${stats.total_issues}`);
console.log(`✓ Completed: ${stats.completed}`);
console.log(`✗ Failed: ${stats.failed}`);
console.log(`○ Pending: ${stats.pending}`);
console.log(`Success Rate: ${stats.success_rate}%`);
console.log(`Duration: ${(stats.duration_ms / 1000).toFixed(1)}s`);
// 2. Build the detailed report
const reportLines = [
"# Execution Report",
"",
`## Summary`,
`- Total Issues: ${stats.total_issues}`,
`- Completed: ${stats.completed}`,
`- Failed: ${stats.failed}`,
`- Pending: ${stats.pending}`,
`- Success Rate: ${stats.success_rate}%`,
`- Duration: ${(stats.duration_ms / 1000).toFixed(1)}s`,
"",
"## Results by Issue"
];
Object.values(issues).forEach((issue, index) => {
const status = issue.status === "completed" ? "✓" : issue.status === "failed" ? "✗" : "○";
reportLines.push(`### ${status} [${index + 1}] ${issue.id}: ${issue.title}`);
reportLines.push(`- Status: ${issue.status}`);
if (issue.solution_id) {
reportLines.push(`- Solution: ${issue.solution_id}`);
}
if (issue.planned_at) {
reportLines.push(`- Planned: ${issue.planned_at}`);
}
if (issue.executed_at) {
reportLines.push(`- Executed: ${issue.executed_at}`);
}
if (issue.error) {
reportLines.push(`- Error: ${issue.error}`);
}
reportLines.push("");
});
if (state.errors && state.errors.length > 0) {
reportLines.push("## Errors");
state.errors.forEach(error => {
reportLines.push(`- [${error.timestamp}] ${error.action}: ${error.message}`);
});
reportLines.push("");
}
reportLines.push("## Files Generated");
reportLines.push(`- Work Directory: ${workDir}`);
reportLines.push(`- State File: ${workDir}/state.json`);
reportLines.push(`- Execution Results: ${workDir}/execution-results.json`);
reportLines.push(`- Solutions: ${workDir}/solutions/`);
reportLines.push(`- Snapshots: ${workDir}/snapshots/`);
// 3. Save the report
const reportPath = `${workDir}/final-report.md`;
Write(reportPath, reportLines.join("\n"));
// 4. Save the final state
const finalState = {
...state,
status: "completed",
phase: "completed",
completed_at: new Date().toISOString(),
completed_actions: [...state.completed_actions, "action-complete"],
context: {
...state.context,
...stats
}
};
Write(`${workDir}/state.json`, JSON.stringify(finalState, null, 2));
// 5. Save the summary JSON
Write(`${workDir}/summary.json`, JSON.stringify({
status: "completed",
stats: stats,
report_file: reportPath,
work_dir: workDir,
completed_at: new Date().toISOString()
}, null, 2));
// 6. Print the completion message
console.log(`\n✓ Workflow completed`);
console.log(`📄 Report: ${reportPath}`);
console.log(`📁 Working directory: ${workDir}`);
return {
stateUpdates: {
status: "completed",
phase: "completed",
completed_at: new Date().toISOString(),
completed_actions: [...state.completed_actions, "action-complete"],
context: finalState.context
}
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
status: "completed",
phase: "completed",
completed_at: timestamp,
completed_actions: [...state.completed_actions, "action-complete"],
context: {
total_issues: stats.total_issues,
completed_count: stats.completed,
failed_count: stats.failed,
success_rate: stats.success_rate
}
}
};
```
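The statistics step can be isolated as a pure function over the `issues` map, which makes it easy to check independently of file I/O. This is a sketch, not the original implementation: the name `computeStats` is hypothetical, and unlike the original it returns `success_rate` as a string in the empty case too, so the type stays consistent:

```javascript
// Hypothetical pure helper: summarizes issue statuses for the final report.
function computeStats(issues) {
  const all = Object.values(issues)
  const completed = all.filter(i => i.status === 'completed').length
  const failed = all.filter(i => i.status === 'failed').length
  return {
    total_issues: all.length,
    completed,
    failed,
    // Anything neither completed nor failed counts as pending
    pending: all.length - completed - failed,
    // toFixed() returns a string such as "66.7"
    success_rate: all.length > 0 ? ((completed / all.length) * 100).toFixed(1) : '0.0'
  }
}
```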
## Error Handling
| Error Type | Recovery |
|------------|----------|
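| Report generation fails | Print a text summary to the console |
| File write fails | Continue to completion and allow a manual save |
| Permission error | Use an alternative directory |
## Next Actions (Hints)
- None (terminal state)
- The user may choose to:
- View the report: `cat {report_path}`
- Resume and retry failed issues: `codex issue:plan-execute --resume {work_dir}`
- Clean up temporary files: `rm -rf {work_dir}`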

View File

@@ -1,220 +0,0 @@
# Action: Execute Solutions
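Execute the planned solutions in queue order.
## Purpose
Load the planned solutions and use a subagent to execute all tasks and commit the changes.
## Preconditions
- [ ] `state.status === "running"`
- [ ] Issues with a `solution_id` exist (from the planning phase)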
## Execution
```javascript
async function execute(state) {
const workDir = state.work_dir;
const issues = state.issues || {};
const queue = state.queue || [];
// 1. Build the execution queue (from the planned issues)
const plannedIssues = Object.values(issues).filter(i => i.status === "planned");
if (plannedIssues.length === 0) {
console.log("No planned solutions to execute");
return { stateUpdates: { queue } };
}
console.log(`\n=== Executing ${plannedIssues.length} Solutions ===`);
// 2. Execute each solution sequentially
const executionResults = [];
for (let i = 0; i < plannedIssues.length; i++) {
const issue = plannedIssues[i];
const solutionId = issue.solution_id;
console.log(`\n[${i + 1}/${plannedIssues.length}] Executing: ${solutionId}`);
try {
// Take a snapshot (to make recovery easier)
const beforeSnapshot = {
timestamp: new Date().toISOString(),
phase: "before-execute",
issue_id: issue.id,
solution_id: solutionId,
state: { ...state }
};
Write(`${workDir}/snapshots/snapshot-before-execute-${i}.json`, JSON.stringify(beforeSnapshot, null, 2));
// Run the subagent
const executionPrompt = `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/issue-execute-agent.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
Goal: Execute solution "${solutionId}" for issue "${issue.id}"
Scope:
- CAN DO: Implement tasks, run tests, commit code
- CANNOT DO: Push to remote or create PRs without approval
- Directory: ${process.cwd()}
Solution ID: ${solutionId}
Load solution details:
- Read: ${workDir}/solutions/${issue.id}-plan.json
Execution steps:
1. Parse all tasks from solution
2. Execute each task: implement → test → verify
3. Commit once for all tasks with formatted summary
4. Report completion
Quality bar:
- All acceptance criteria verified
- Tests passing
- Commit message follows conventions
Return: JSON with files_modified[], commit_hash, status
`;
const result = await Task({
subagent_type: "universal-executor",
run_in_background: false,
description: `Execute solution ${solutionId}`,
prompt: executionPrompt
});
// Parse the execution result
let execResult;
try {
execResult = typeof result === "string" ? JSON.parse(result) : result;
} catch {
execResult = { status: "executed", commit_hash: "unknown" };
}
// Save the execution result
Write(`${workDir}/solutions/${issue.id}-execution.json`, JSON.stringify({
solution_id: solutionId,
issue_id: issue.id,
status: "completed",
executed_at: new Date().toISOString(),
execution_result: execResult
}, null, 2));
// Update the issue status
issues[issue.id].status = "completed";
issues[issue.id].executed_at = new Date().toISOString();
// Update the queue item
const queueIndex = queue.findIndex(q => q.solution_id === solutionId);
if (queueIndex >= 0) {
queue[queueIndex].status = "completed";
}
// Update ccw
try {
Bash(`ccw issue update ${issue.id} --status completed`);
} catch (error) {
console.log(`Note: Could not update ccw status (${error.message})`);
}
console.log(`${solutionId} completed`);
executionResults.push({
issue_id: issue.id,
solution_id: solutionId,
status: "completed",
commit: execResult.commit_hash
});
state.context.completed_count++;
} catch (error) {
console.error(`✗ Execution failed for ${solutionId}: ${error.message}`);
// Record the failure status
issues[issue.id].status = "failed";
issues[issue.id].error = error.message;
state.context.failed_count++;
executionResults.push({
issue_id: issue.id,
solution_id: solutionId,
status: "failed",
error: error.message
});
}
}
// 3. Save the execution results summary
Write(`${workDir}/execution-results.json`, JSON.stringify({
total: plannedIssues.length,
completed: state.context.completed_count,
failed: state.context.failed_count,
results: executionResults,
timestamp: new Date().toISOString()
}, null, 2));
return {
stateUpdates: {
issues: issues,
queue: queue,
context: state.context,
completed_actions: [...state.completed_actions, "action-execute"]
}
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
issues: {
[issue.id]: {
...issue,
status: "completed|failed",
executed_at: timestamp,
error: errorMessage
}
},
queue: [
...queue.map(item =>
item.solution_id === solutionId
? { ...item, status: "completed|failed" }
: item
)
],
context: {
...state.context,
completed_count: newCompletedCount,
failed_count: newFailedCount
}
}
};
```
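The update shape above can be applied without mutating the incoming state. The sketch below is one way to do that; `applyExecutionOutcome` and the `outcome` object (`status`, `executed_at`, `error`) are hypothetical names introduced for illustration:

```javascript
// Hypothetical helper: returns a new state with one execution outcome
// applied to the issue, its queue item, and the context counters.
function applyExecutionOutcome(state, issueId, solutionId, outcome) {
  const failed = outcome.status === 'failed'
  return {
    ...state,
    issues: {
      ...state.issues,
      [issueId]: {
        ...state.issues[issueId],
        status: outcome.status,
        executed_at: outcome.executed_at,
        error: failed ? outcome.error : null
      }
    },
    // Only the queue item bound to this solution changes status
    queue: state.queue.map(item =>
      item.solution_id === solutionId ? { ...item, status: outcome.status } : item
    ),
    context: {
      ...state.context,
      completed_count: state.context.completed_count + (failed ? 0 : 1),
      failed_count: state.context.failed_count + (failed ? 1 : 0)
    }
  }
}
```

Returning a fresh object keeps the "before-execute" snapshot meaningful, because the snapshot and the live state no longer share mutated nested objects.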
## Error Handling
| Error Type | Recovery |
|------------|----------|
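| Task execution fails | Mark as failed and continue with the next one |
| Tests fail | Do not commit; mark as failed |
| Commit fails | Save a snapshot to make recovery easier |
| Subagent timeout | Record the timeout and continue |
## Next Actions (Hints)
- Execution finished: move on to the action-complete phase
- Failed items exist: the user chooses whether to retry
- Everything completed: generate the final report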

View File

@@ -1,86 +0,0 @@
# Action: Initialize
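Initialize the Skill execution state and working directory.
## Purpose
Set up the initial state, create the working directory, and prepare the execution environment.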
## Preconditions
- [ ] `state.status === "pending"`
## Execution
```javascript
async function execute(state) {
// Create the working directory
const timestamp = new Date().toISOString().slice(0,19).replace(/[-:T]/g, '');
const workDir = `.workflow/.scratchpad/codex-issue-${timestamp}`;
Bash(`mkdir -p "${workDir}/solutions" "${workDir}/snapshots"`);
// Initialize the state
const initialState = {
status: "running",
phase: "initialized",
work_dir: workDir,
issues: {},
queue: [],
completed_actions: ["action-init"],
context: {
total_issues: 0,
completed_count: 0,
failed_count: 0
},
errors: [],
created_at: new Date().toISOString(),
updated_at: new Date().toISOString()
};
// Save the initial state
Write(`${workDir}/state.json`, JSON.stringify(initialState, null, 2));
Write(`${workDir}/state-history.json`, JSON.stringify([{
timestamp: initialState.created_at,
phase: "init",
completed_actions: 1,
issues_count: 0
}], null, 2));
console.log(`✓ Initialized: ${workDir}`);
return {
stateUpdates: {
status: "running",
phase: "initialized",
work_dir: workDir,
completed_actions: ["action-init"]
}
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
status: "running",
phase: "initialized",
work_dir: workDir,
completed_actions: ["action-init"]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
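| Directory creation fails | Check permissions and use a temporary directory |
| File write fails | Retry or switch to another storage location |
## Next Actions (Hints)
- Success: enter the listing phase and run action-list
- Failure: abort the workflow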

View File

@@ -1,165 +0,0 @@
# Action: List Issues
List issues and let the user select them interactively.
## Purpose
Show the current status of all issues and collect the user's planning/execution intent.
## Preconditions
- [ ] `state.status === "running"`
## Execution
```javascript
async function execute(state) {
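// 1. Load or initialize issues
let issues = state.issues || {};
// 2. Load issues from `ccw issue list` or from the supplied arguments,
// depending on whether the user passed issue IDs on the command line
// Example: ccw codex issue:plan-execute ISS-001,ISS-002
// For this demonstration, we assume they are loaded from issues.jsonl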
try {
const issuesListOutput = Bash("ccw issue list --status registered,planned --json").output;
const issuesList = JSON.parse(issuesListOutput);
issuesList.forEach(issue => {
if (!issues[issue.id]) {
issues[issue.id] = {
id: issue.id,
title: issue.title,
status: "registered",
solution_id: null,
planned_at: null,
executed_at: null,
error: null
};
}
});
} catch (error) {
console.log("Note: Could not load issues from ccw issue list");
// Fall back to issues from the arguments, or an empty list
}
// 3. Show the current status
const totalIssues = Object.keys(issues).length;
const registeredCount = Object.values(issues).filter(i => i.status === "registered").length;
const plannedCount = Object.values(issues).filter(i => i.status === "planned").length;
const completedCount = Object.values(issues).filter(i => i.status === "completed").length;
console.log("\n=== Issue Status ===");
console.log(`Total: ${totalIssues} | Registered: ${registeredCount} | Planned: ${plannedCount} | Completed: ${completedCount}`);
if (totalIssues === 0) {
console.log("\nNo issues found. Please create issues first using 'ccw issue init'");
return {
stateUpdates: {
context: {
...state.context,
total_issues: 0
}
}
};
}
// 4. Show the detailed list
console.log("\n=== Issue Details ===");
Object.values(issues).forEach((issue, index) => {
const status = issue.status === "completed" ? "✓" : issue.status === "planned" ? "→" : "○";
console.log(`${status} [${index + 1}] ${issue.id}: ${issue.title} (${issue.status})`);
});
// 5. Ask the user for the next step
const issueIds = Object.keys(issues);
const pendingIds = issueIds.filter(id => issues[id].status === "registered");
if (pendingIds.length === 0) {
console.log("\nNo unplanned issues. Ready to execute planned solutions.");
return {
stateUpdates: {
context: {
...state.context,
total_issues: totalIssues
}
}
};
}
// 6. Show the options
console.log("\nNext action:");
console.log("- Enter 'p' to PLAN selected issues");
console.log("- Enter 'x' to EXECUTE planned solutions");
console.log("- Enter 'a' to plan ALL pending issues");
console.log("- Enter 'q' to QUIT");
const response = await AskUserQuestion({
questions: [{
question: "Select issues to plan (comma-separated numbers, or 'all'):",
header: "Selection",
multiSelect: false,
options: pendingIds.slice(0, 4).map(id => ({
label: `${issues[id].id}: ${issues[id].title}`,
description: `Current status: ${issues[id].status}`
}))
}]
});
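// 7. Mark the selected issues as "planning"
const selectedIds = [];
if (response.Selection === "all") {
selectedIds.push(...pendingIds);
} else {
// Parse the user's selection. Option labels have the form "<id>: <title>",
// so keep only the ID part (assumes the response carries the chosen label)
selectedIds.push(String(response.Selection).split(":")[0].trim());
}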
selectedIds.forEach(issueId => {
if (issues[issueId]) {
issues[issueId].status = "planning";
}
});
return {
stateUpdates: {
issues: issues,
context: {
...state.context,
total_issues: totalIssues
}
}
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
issues: issues,
context: {
total_issues: Object.keys(issues).length,
registered_count: registeredCount,
planned_count: plannedCount,
completed_count: completedCount
}
}
};
```
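The three counters in this update are all the same operation, so they can be derived in one pass. A minimal sketch (the helper name `countByStatus` is hypothetical):

```javascript
// Hypothetical helper: tallies issues per status in a single pass.
// Statuses that never occur are simply absent from the result.
function countByStatus(issues) {
  const counts = {}
  for (const issue of Object.values(issues)) {
    counts[issue.status] = (counts[issue.status] || 0) + 1
  }
  return counts
}
```

A caller would read `counts.registered || 0`, `counts.planned || 0`, and `counts.completed || 0` to fill `registered_count`, `planned_count`, and `completed_count`.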
## Error Handling
| Error Type | Recovery |
|------------|----------|
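| Issue loading fails | Continue with an empty list |
| Invalid user input | Ask the user to select again |
| List rendering error | Fall back to JSON output |
## Next Actions (Hints)
- Issues in "planning" status: run action-plan
- No pending issues: run action-execute
- User cancels: abort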

View File

@@ -1,170 +0,0 @@
# Action: Plan Solutions
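Generate execution plans for the selected issues.
## Purpose
Use a subagent to analyze the issues and generate solutions, with support for choosing among multiple solutions and automatic binding.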
## Preconditions
- [ ] `state.status === "running"`
- [ ] `issues with status === "planning"` exist
## Execution
```javascript
async function execute(state) {
const workDir = state.work_dir;
const issues = state.issues || {};
// 1. Identify the issues that need planning
const planningIssues = Object.values(issues).filter(i => i.status === "planning");
if (planningIssues.length === 0) {
console.log("No issues to plan");
return { stateUpdates: { issues } };
}
console.log(`\n=== Planning ${planningIssues.length} Issues ===`);
// 2. Build a planning subagent request for each issue
const planningAgents = planningIssues.map(issue => ({
issue_id: issue.id,
issue_title: issue.title,
prompt: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/issue-plan-agent.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
4. Read schema: ~/.claude/workflows/cli-templates/schemas/solution-schema.json
---
Goal: Plan solution for issue "${issue.id}: ${issue.title}"
Scope:
- CAN DO: Explore codebase, design solutions, create tasks
- CANNOT DO: Execute solutions, modify production code
- Directory: ${process.cwd()}
Task Description:
${issue.title}
Deliverables:
- Create ONE primary solution
- Write to: ${workDir}/solutions/${issue.id}-plan.json
- Format: JSON following solution-schema.json
Quality bar:
- Tasks have quantified acceptance.criteria
- Each task includes test.commands
- Solution follows schema exactly
Return: JSON with solution_id, task_count, status
`
  }));

  // 3. Run planning serially to avoid races between agents
  for (const agent of planningAgents) {
    console.log(`\n→ Planning: ${agent.issue_id}`);
    try {
      // On Codex this should be spawn_agent; on Claude Code, use Task().
      // The Task() call below stands in for either one.
      const result = await Task({
        subagent_type: "universal-executor",
        run_in_background: false,
        description: `Plan solution for ${agent.issue_id}`,
        prompt: agent.prompt
      });

      // Parse the result; fall back to a default solution ID if it is not valid JSON
      let planResult;
      try {
        planResult = typeof result === "string" ? JSON.parse(result) : result;
      } catch {
        planResult = { status: "executed", solution_id: `SOL-${agent.issue_id}-1` };
      }

      // Update the issue state
      issues[agent.issue_id].status = "planned";
      issues[agent.issue_id].solution_id = planResult?.solution_id || `SOL-${agent.issue_id}-1`;
      issues[agent.issue_id].planned_at = new Date().toISOString();
      console.log(`✓ ${agent.issue_id} → ${issues[agent.issue_id].solution_id}`);

      // Bind the solution; a binding failure is logged but does not fail the issue
      try {
        Bash(`ccw issue bind ${agent.issue_id} ${issues[agent.issue_id].solution_id}`);
      } catch (error) {
        console.log(`Note: Could not bind solution (${error.message})`);
      }
    } catch (error) {
      console.error(`✗ Planning failed for ${agent.issue_id}: ${error.message}`);
      issues[agent.issue_id].status = "registered"; // roll back
      issues[agent.issue_id].error = error.message;
    }
  }

  // 4. Sync issue status back to ccw
  try {
    Bash(`ccw issue update --from-planning`);
  } catch {
    console.log("Note: Could not update issue status");
  }

  return {
    stateUpdates: {
      issues: issues,
      // Guard against a state object that has no completed_actions yet
      completed_actions: [...(state.completed_actions || []), "action-plan"]
    }
  };
}
```
## State Updates
```javascript
return {
stateUpdates: {
issues: {
[issue.id]: {
...issue,
status: "planned",
solution_id: solutionId,
planned_at: timestamp
}
},
queue: [
...state.queue,
{
item_id: `S-${index}`,
issue_id: issue.id,
solution_id: solutionId,
status: "pending"
}
]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
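| Subagent timeout | Mark as failed and continue with the next issue |
| Invalid solution | Roll back to the `registered` status |
| Binding failure | Log a warning and continue |
| File write failure | Retry 3 times |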
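
The "retry 3 times" recovery for failed file writes could be sketched as below. `withRetry` is a hypothetical helper name, not something the original skill defines, and it assumes the wrapped operation is synchronous and signals failure by throwing.

```javascript
// Hypothetical helper: run a synchronous operation up to `maxAttempts` times.
// Returns the first successful result, or rethrows the last error.
function withRetry(operation, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```

A call site might wrap the write of a plan file, for example `withRetry(() => Write(planPath, json))`, where `Write` is the file tool used in the code above and `planPath` and `json` are placeholders.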
## Next Actions (Hints)
- All issues planned: run action-execute
- Some failed: let the user choose whether to continue or retry
- All failed: return to action-list and select again
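
As a rough illustration, the three hints above can be expressed as a small decision function. `suggestNextAction` and the `"ask-user"` return value are assumptions made for this sketch; the status names come from the `execute` code above, where a failed issue is rolled back to `registered`.

```javascript
// Hypothetical helper: pick the hinted next action after a planning run.
// `attemptedIds` lists the issues that were sent to planning in this run.
function suggestNextAction(issues, attemptedIds) {
  const planned = attemptedIds.filter(id => issues[id]?.status === "planned").length;
  if (attemptedIds.length > 0 && planned === attemptedIds.length) {
    return "action-execute"; // every attempted issue was planned
  }
  if (planned === 0) {
    return "action-list"; // nothing succeeded, go back and reselect
  }
  return "ask-user"; // partial failure, let the user continue or retry
}
```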

View File

@@ -1,212 +0,0 @@
# Orchestrator - Dual-Agent Pipeline Architecture
Main workflow orchestrator: creates two persistent agents (planning and execution) that process all issues as a pipeline.
> **Note**: For complete system architecture overview and design principles, see **[../ARCHITECTURE.md](../ARCHITECTURE.md)**
## Architecture Overview
```
┌─────────────────────────────────────────────────────────┐
│             Main Orchestrator (Claude Code)             │
│        Pipelines tasks to two persistent agents         │
└──────┬────────────────────────────────────────┬─────────┘
       │ send_input                             │ send_input
       │ (one issue at a time)                  │ (one solution at a time)
       ▼                                        ▼
┌──────────────────┐                  ┌────────────────────┐
│  Planning Agent  │                  │  Execution Agent   │
│  (persistent)    │                  │  (persistent)      │
│                  │                  │                    │
│ • Receive issue  │                  │ • Receive solution │
│ • Design solution│                  │ • Execute tasks    │
│ • Return solution│                  │ • Return result    │
└──────────────────┘                  └────────────────────┘
       ▲                                        ▲
       └───────────────────┬────────────────────┘
                  wait for completion
```
## Main Orchestrator Pseudocode
```javascript
async function mainOrchestrator(workDir, issues) {
  const planningResults = { results: [] };  // unified store
  const executionResults = { results: [] }; // unified store

  // 1. Create persistent agents (never close until done)
  const planningAgentId = spawn_agent({
    message: Read('prompts/planning-agent-system.md')
  });
  const executionAgentId = spawn_agent({
    message: Read('prompts/execution-agent-system.md')
  });

  try {
    // Phase 1: Planning Pipeline
    for (const issue of issues) {
      // Send issue to planning agent (reuse it via send_input; do not spawn a new agent)
      send_input({
        id: planningAgentId,
        message: buildPlanningRequest(issue)
      });

      // Wait for solution
      const result = wait({ ids: [planningAgentId], timeout_ms: 300000 });
      const solution = parseResponse(result);

      // Store in unified results
      planningResults.results.push({
        issue_id: issue.id,
        solution: solution,
        status: solution ? "completed" : "failed"
      });
    }

    // Save planning results once
    Write(`${workDir}/planning-results.json`, JSON.stringify(planningResults, null, 2));

    // Phase 2: Execution Pipeline
    for (const planning of planningResults.results) {
      if (planning.status !== "completed") continue;

      // Send solution to execution agent (reuse it via send_input; do not spawn a new agent)
      send_input({
        id: executionAgentId,
        message: buildExecutionRequest(planning.solution)
      });

      // Wait for execution result
      const result = wait({ ids: [executionAgentId], timeout_ms: 600000 });
      const execResult = parseResponse(result);

      // Store in unified results
      executionResults.results.push({
        issue_id: planning.issue_id,
        status: execResult?.status || "failed",
        commit_hash: execResult?.commit_hash
      });
    }

    // Save execution results once
    Write(`${workDir}/execution-results.json`, JSON.stringify(executionResults, null, 2));
  } finally {
    // Close agents after ALL issues processed
    close_agent({ id: planningAgentId });
    close_agent({ id: executionAgentId });
  }

  generateFinalReport(workDir, planningResults, executionResults);
}
```
## Key Design Principles
### 1. Agent Persistence
- **Creating**: Each agent created once at the beginning
- **Running**: Agents continue running, receiving multiple `send_input` calls
- **Closing**: Agents closed only after all issues processed
- **Benefit**: Agent maintains context across multiple issues
### 2. Unified Results Storage
```json
// planning-results.json
{
"phase": "planning",
"created_at": "2025-01-29T12:00:00Z",
"results": [
{
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"status": "completed",
"solution": { "id": "...", "tasks": [...] },
"planned_at": "2025-01-29T12:05:00Z"
},
{
"issue_id": "ISS-002",
"solution_id": "SOL-ISS-002-1",
"status": "completed",
"solution": { "id": "...", "tasks": [...] },
"planned_at": "2025-01-29T12:10:00Z"
}
]
}
// execution-results.json
{
"phase": "execution",
"created_at": "2025-01-29T12:15:00Z",
"results": [
{
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"status": "completed",
"commit_hash": "abc123def",
"files_modified": ["src/auth.ts"],
"executed_at": "2025-01-29T12:20:00Z"
}
]
}
```
**Advantages**:
- A single JSON file per phase, easy to query and analyze
- Complete processing history
- Fewer file I/O operations
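
To illustrate the "easy to query" point, here is a minimal sketch that joins the two result objects by `issue_id`. The function name `summarizeResults` and the `"not_executed"` label are assumptions for illustration; the field names follow the JSON shown above.

```javascript
// Sketch: one summary row per planned issue, joined with its execution result.
function summarizeResults(planningResults, executionResults) {
  const executed = new Map(executionResults.results.map(r => [r.issue_id, r]));
  return planningResults.results.map(p => {
    const e = executed.get(p.issue_id);
    return {
      issue_id: p.issue_id,
      planning: p.status,
      // "not_executed" is a label invented for this sketch
      execution: e ? e.status : "not_executed",
      commit_hash: e?.commit_hash ?? null
    };
  });
}
```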
### 3. Pipeline Flow
```
Issue 1 → Planning Agent → Wait → Solution 1 (save)
Issue 2 → Planning Agent → Wait → Solution 2 (save)
Issue 3 → Planning Agent → Wait → Solution 3 (save)
[All saved to planning-results.json]
Solution 1 → Execution Agent → Wait → Result 1 (save)
Solution 2 → Execution Agent → Wait → Result 2 (save)
Solution 3 → Execution Agent → Wait → Result 3 (save)
[All saved to execution-results.json]
```
### 4. Agent Communication via send_input
Instead of creating new agents, reuse persistent ones:
```javascript
// ❌ OLD: Create new agent per issue
for (const issue of issues) {
const agentId = spawn_agent({ message: prompt });
const result = wait({ ids: [agentId] });
close_agent({ id: agentId }); // ← Expensive!
}
// ✅ NEW: Persistent agent with send_input
const agentId = spawn_agent({ message: initialPrompt });
for (const issue of issues) {
send_input({ id: agentId, message: taskPrompt }); // ← Reuse!
const result = wait({ ids: [agentId] });
}
close_agent({ id: agentId }); // ← Single cleanup
```
### 5. Path Resolution for Global Installation
When this skill is installed globally:
- **Skill-internal paths**: Use relative paths from skill root (e.g., `prompts/planning-agent-system.md`)
- **Project paths**: Use project-relative paths starting with `.` (e.g., `.workflow/project-tech.json`)
- **User-home paths**: Use `~` prefix (e.g., `~/.codex/agents/...`)
- **Working directory**: Always relative to the project root when skill executes
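
The path rules above might be implemented along these lines. This is only a sketch: `resolveSkillPath`, `skillRoot`, and `projectRoot` are hypothetical names, and it follows the stated convention literally, so any path starting with `.` is treated as project-relative.

```javascript
const os = require("os");
const path = require("path");

// Hypothetical resolver for the three path kinds described above.
function resolveSkillPath(p, { skillRoot, projectRoot, home = os.homedir() }) {
  // User-home paths: "~" or "~/..."
  if (p === "~" || p.startsWith("~/")) return path.join(home, p.slice(1));
  // Project paths: start with "." (e.g. ".workflow/project-tech.json")
  if (p.startsWith(".")) return path.resolve(projectRoot, p);
  // Everything else is treated as skill-internal
  return path.resolve(skillRoot, p);
}
```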
## Benefits of This Architecture
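| Aspect | Benefit |
|------|------|
| **Performance** | Agent creation/teardown cost is paid once instead of N times |
| **Context** | Each agent keeps its context across tasks |
| **Storage** | Unified JSON files that are easy to trace and query |
| **Communication** | Data is passed to the agents via `send_input` |
| **Maintainability** | Clear pipeline structure that is easy to debug |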

View File

@@ -1,136 +0,0 @@
# State Schema Definition
State structure definition and validation rules.
## Initial State
```json
{
"status": "pending",
"phase": "init",
"work_dir": "",
"issues": {},
"queue": [],
"completed_actions": [],
"context": {
"total_issues": 0,
"completed_count": 0,
"failed_count": 0
},
"errors": [],
"created_at": "ISO-8601",
"updated_at": "ISO-8601"
}
```
## State Transitions
```
pending
  ↓ init (Action-Init)
running
  ├→ list (Action-List) → Display issues
  ├→ plan (Action-Plan) → Plan issues
  ├→ execute (Action-Execute) → Execute solutions
  ├→ back to list/plan/execute loop
  └→ complete (Action-Complete) → Finalize
  ↓
completed
```
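
One possible way to enforce these transitions on the global `status` field is a lookup table. This is a sketch under assumptions: `STATUS_TRANSITIONS` and `canTransition` are invented names, and `running → running` is how the list/plan/execute loop is modeled here.

```javascript
// Allowed global status transitions, read off the flow above.
const STATUS_TRANSITIONS = {
  pending: ["running"],
  running: ["running", "completed"], // running → running models the action loop
  completed: []
};

// Returns true when moving from `from` to `to` is allowed.
function canTransition(from, to) {
  return (STATUS_TRANSITIONS[from] || []).includes(to);
}
```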
## Field Reference
| Field | Type | Description |
|------|------|------|
| `status` | string | "pending"\|"running"\|"completed" - global status |
| `phase` | string | "init"\|"listing"\|"planning"\|"executing"\|"complete" - current phase |
| `work_dir` | string | Working directory path |
| `issues` | object | Issue state map `{issue_id: IssueState}` |
| `queue` | array | Queue of items awaiting execution |
| `completed_actions` | array | IDs of actions that have already run |
| `context` | object | Execution context information |
| `errors` | array | Error log |
## Issue State
```json
{
"id": "ISS-xxx",
"title": "Issue title",
"status": "registered|planning|planned|executing|completed|failed",
"solution_id": "SOL-xxx-1",
"planned_at": "ISO-8601",
"executed_at": "ISO-8601",
"error": null
}
```
## Queue Item
```json
{
"item_id": "S-1",
"issue_id": "ISS-xxx",
"solution_id": "SOL-xxx-1",
"status": "pending|executing|completed|failed"
}
```
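
As an illustration of how queue items relate to issue state, the sketch below adds one pending queue item for each planned issue that is not queued yet. `enqueuePlanned` is a hypothetical helper, and the 1-based `S-<n>` numbering is an assumption based on the `S-1` example above.

```javascript
// Sketch: return a new queue containing an item for every planned issue.
// Issues that already have a queue item are skipped.
function enqueuePlanned(state) {
  const queued = new Set(state.queue.map(item => item.issue_id));
  const queue = [...state.queue];
  for (const issue of Object.values(state.issues)) {
    if (issue.status !== "planned" || queued.has(issue.id)) continue;
    queue.push({
      item_id: `S-${queue.length + 1}`, // assumed 1-based numbering
      issue_id: issue.id,
      solution_id: issue.solution_id,
      status: "pending"
    });
  }
  return queue;
}
```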
## Validation Function
```javascript
function validateState(state) {
  // Required fields
  if (!state.status) throw new Error("Missing: status");
  if (!state.phase) throw new Error("Missing: phase");
  if (!state.work_dir) throw new Error("Missing: work_dir");

  // Valid status values
  const validStatus = ["pending", "running", "completed"];
  if (!validStatus.includes(state.status)) {
    throw new Error(`Invalid status: ${state.status}`);
  }

  // Issues must be a plain object map; typeof also reports "object" for null and arrays
  if (state.issues === null || typeof state.issues !== "object" || Array.isArray(state.issues)) {
    throw new Error("issues must be object");
  }

  // Queue is array
  if (!Array.isArray(state.queue)) {
    throw new Error("queue must be array");
  }
  return true;
}
```
## State Persistence
```javascript
// Save the state and append a summary entry to the history file
function saveState(state) {
  const statePath = `${state.work_dir}/state.json`;
  Write(statePath, JSON.stringify(state, null, 2));

  // Load the existing history; start from an empty list if the file is missing or invalid.
  // Read is treated as synchronous here, matching loadState below.
  const historyPath = `${state.work_dir}/state-history.json`;
  let history;
  try {
    history = JSON.parse(Read(historyPath));
  } catch {
    history = [];
  }
  if (!Array.isArray(history)) history = [];

  history.push({
    timestamp: new Date().toISOString(),
    phase: state.phase,
    completed_actions: state.completed_actions.length,
    issues_count: Object.keys(state.issues).length
  });
  Write(historyPath, JSON.stringify(history, null, 2));
}

// Load the state
function loadState(workDir) {
  const statePath = `${workDir}/state.json`;
  return JSON.parse(Read(statePath));
}
```

View File

@@ -1,32 +0,0 @@
⚠️ **DEPRECATED** - This file is deprecated as of v2.0 (2025-01-29)
**Use instead**: [`execution-agent.md`](execution-agent.md)
This file has been merged into `execution-agent.md` to consolidate system prompt + user prompt into a single unified source.
**Why the change?**
- Eliminates duplication between system and user prompts
- Reduces token usage by 70% in agent initialization
- Single source of truth for agent instructions
- Easier to maintain and update
**Migration**:
```javascript
// OLD (v1.0)
spawn_agent({ message: Read('prompts/execution-agent-system.md') });
// NEW (v2.0)
spawn_agent({ message: Read('prompts/execution-agent.md') });
```
**Timeline**:
- v2.0 (2025-01-29): Old files kept for backward compatibility
- v2.1 (2025-03-31): Old files will be removed
---
# Execution Agent System Prompt (Legacy - See execution-agent.md instead)
See [`execution-agent.md`](execution-agent.md) for the current unified prompt.
All content below is now consolidated into the new unified prompt file.

View File

@@ -1,323 +0,0 @@
# Execution Agent - Unified Prompt
You are the **Execution Agent** for the Codex issue planning and execution workflow.
## Role Definition
Your responsibility is implementing planned solutions and verifying they work correctly. You will:
1. **Receive solutions** one at a time via `send_input` messages from the main orchestrator
2. **Implement each solution** by executing the planned tasks in order
3. **Verify acceptance criteria** are met through testing
4. **Create commits** for each completed task
5. **Return execution results** with details on what was implemented
6. **Maintain context** across multiple solutions without closing
---
## Mandatory Initialization Steps
### First Run Only (Read These Files)
1. **Read role definition**: `~/.codex/agents/issue-execute-agent.md` (MUST read first)
2. **Read project tech stack**: `.workflow/project-tech.json`
3. **Read project guidelines**: `.workflow/project-guidelines.json`
4. **Read execution result schema**: `~/.claude/workflows/cli-templates/schemas/execution-result-schema.json`
---
## How to Operate
### Input Format
You will receive `send_input` messages with this structure:
```json
{
"type": "execute_solution",
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"solution": {
"id": "SOL-ISS-001-1",
"tasks": [
{
"id": "T1",
"title": "Task title",
"action": "Create|Modify|Fix|Refactor",
"scope": "file path",
"description": "What to do",
"modification_points": ["Point 1"],
"implementation": ["Step 1", "Step 2"],
"test": {
"commands": ["npm test -- file.test.ts"],
"unit": ["Requirement 1"]
},
"acceptance": {
"criteria": ["Criterion 1: Must pass"],
"verification": ["Run tests"]
},
"depends_on": [],
"estimated_minutes": 30,
"priority": 1
}
],
"exploration_context": {
"relevant_files": ["path/to/file.ts"],
"patterns": "Follow existing pattern",
"integration_points": "Used by service X"
},
"analysis": {
"risk": "low|medium|high",
"impact": "low|medium|high",
"complexity": "low|medium|high"
}
},
"project_root": "/path/to/project"
}
```
### Your Workflow for Each Solution
1. **Prepare for execution**:
- Review all planned tasks and dependencies
- Ensure task ordering respects dependencies
- Identify files that need modification
- Plan code structure and implementation
2. **Execute each task in order**:
- Read existing code and understand context
- Implement modifications according to specs
- Run tests immediately after changes
- Verify acceptance criteria are met
- Create commit with descriptive message
3. **Handle task dependencies**:
- Execute tasks in dependency order (respect `depends_on`)
- Stop immediately if a dependency fails
- Report which task failed and why
- Include error details in result
4. **Verify all acceptance criteria**:
- Run test commands specified in each task
- Ensure all acceptance criteria are met
- Check for regressions in existing tests
- Document test results
5. **Generate execution result JSON**:
```json
{
"id": "EXR-ISS-001-1",
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"status": "completed|failed",
"executed_tasks": [
{
"task_id": "T1",
"title": "Task title",
"status": "completed|failed",
"files_modified": ["src/auth.ts", "src/auth.test.ts"],
"commits": [
{
"hash": "abc123def",
"message": "Implement authentication task"
}
],
"test_results": {
"passed": 15,
"failed": 0,
"command": "npm test -- auth.test.ts",
"output": "Test results summary"
},
"acceptance_met": true,
"execution_time_minutes": 25,
"errors": []
}
],
"overall_stats": {
"total_tasks": 3,
"completed": 3,
"failed": 0,
"total_files_modified": 5,
"total_commits": 3,
"total_time_minutes": 75
},
"final_commit": {
"hash": "xyz789abc",
"message": "Resolve issue ISS-001: Feature implementation"
},
"verification": {
"all_tests_passed": true,
"all_acceptance_met": true,
"no_regressions": true
}
}
```
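
The dependency ordering described in step 3 can be sketched as a depth-first topological sort over `depends_on`. `orderTasks` is a hypothetical helper name, and throwing on an unknown or circular dependency is a design choice of this sketch, not something the prompt prescribes.

```javascript
// Sketch: order tasks so each one comes after everything in its `depends_on`.
// Throws if a dependency is unknown or the graph contains a cycle.
function orderTasks(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]));
  const done = new Set();
  const visiting = new Set();
  const ordered = [];

  function visit(task) {
    if (done.has(task.id)) return;
    if (visiting.has(task.id)) throw new Error(`Circular dependency at ${task.id}`);
    visiting.add(task.id);
    for (const depId of task.depends_on || []) {
      const dep = byId.get(depId);
      if (!dep) throw new Error(`Unknown dependency ${depId} in ${task.id}`);
      visit(dep);
    }
    visiting.delete(task.id);
    done.add(task.id);
    ordered.push(task);
  }

  for (const task of tasks) visit(task);
  return ordered;
}
```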
### Validation Rules
Ensure:
- ✓ All planned tasks executed (don't skip any)
- ✓ All acceptance criteria verified
- ✓ Tests pass without failures before finalizing
- ✓ All commits created with descriptive messages
- ✓ Execution result follows schema exactly
- ✓ No breaking changes introduced
### Return Format
After processing each solution, return this JSON:
```json
{
"status": "completed|failed",
"execution_result_id": "EXR-ISS-001-1",
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"tasks_completed": 3,
"files_modified": 5,
"total_commits": 3,
"verification": {
"all_tests_passed": true,
"all_acceptance_met": true,
"no_regressions": true
},
"final_commit_hash": "xyz789abc",
"errors": []
}
```
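
The return JSON looks like a condensed view of the full execution result from step 5. A sketch of that mapping follows; `toReturnSummary` is a hypothetical name, and the field correspondence (for example `tasks_completed` taken from `overall_stats.completed`) is an assumption inferred from the two examples rather than a documented rule.

```javascript
// Sketch: derive the short return JSON from a full execution result object.
function toReturnSummary(result) {
  return {
    status: result.status,
    execution_result_id: result.id,
    issue_id: result.issue_id,
    solution_id: result.solution_id,
    tasks_completed: result.overall_stats.completed,
    files_modified: result.overall_stats.total_files_modified,
    total_commits: result.overall_stats.total_commits,
    verification: result.verification,
    final_commit_hash: result.final_commit ? result.final_commit.hash : null,
    // Collect per-task errors into one flat list
    errors: result.executed_tasks.flatMap(t => t.errors || [])
  };
}
```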
---
## Quality Standards
### Completeness
- All planned tasks must be executed
- All acceptance criteria must be verified
- No tasks skipped or deferred
### Correctness
- All acceptance criteria must be met before marking complete
- Tests must pass without failures
- No regressions in existing tests
- Code quality maintained
### Traceability
- Each change tracked with commits
- Each commit has descriptive message
- Test results documented
- File modifications tracked
### Safety
- All tests pass before finalizing
- Changes verified against acceptance criteria
- Regressions checked before final commit
- Rollback strategy available if needed
---
## Context Preservation
You will receive multiple solutions sequentially. **Do NOT close after each solution.** Instead:
- Process each solution independently
- Maintain awareness of codebase state after modifications
- Use consistent coding style with the project
- Reference patterns established in previous solutions
- Track what's been implemented to avoid conflicts
---
## Error Handling
If you cannot execute a solution:
1. **Clearly state what went wrong** - be specific about the failure
2. **Specify which task failed** - identify the task and why
3. **Include error message** - provide full error output or test failure details
4. **Return status: "failed"** - mark the response as failed
5. **Continue waiting** - the orchestrator will send the next solution
Example error response:
```json
{
"status": "failed",
"execution_result_id": null,
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"failed_task_id": "T2",
"failure_reason": "Test suite failed - dependency type error in auth.ts",
"error_details": "Error: Cannot find module 'jwt-decode'",
"files_attempted": ["src/auth.ts"],
"recovery_suggestions": "Install missing dependency or check import paths"
}
```
---
## Communication Protocol
After processing each solution:
1. Return the result JSON (success or failure)
2. Wait for the next `send_input` with a new solution
3. Continue this cycle until orchestrator closes you
**IMPORTANT**: Do NOT attempt to close yourself. The orchestrator will close you when all execution is complete.
---
## Task Execution Guidelines
### Before Task Implementation
- Read all related files to understand existing patterns
- Identify side effects and integration points
- Plan the complete implementation before coding
### During Task Implementation
- Implement one task at a time
- Follow existing code style and conventions
- Add tests alongside implementation
- Commit after each task completes
### After Task Implementation
- Run all test commands specified in task
- Verify each acceptance criterion
- Check for regressions
- Create commit with message referencing task ID
### Commit Message Format
```
[TASK_ID] Brief description of what was implemented
- Implementation detail 1
- Implementation detail 2
- Test results: all passed
Fixes ISS-XXX task T1
```
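
A commit message in this format could be assembled with a small helper such as the one below. `buildCommitMessage` and its parameter names are hypothetical, and the blank lines after the subject and before the footer are an assumption based on common Git convention.

```javascript
// Sketch: build a commit message in the format shown above.
function buildCommitMessage({ taskId, issueId, summary, details = [], testSummary }) {
  const lines = [`[${taskId}] ${summary}`, ""];
  for (const detail of details) lines.push(`- ${detail}`);
  lines.push(`- Test results: ${testSummary}`, "", `Fixes ${issueId} task ${taskId}`);
  return lines.join("\n");
}
```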
---
## Key Principles
- **Follow the plan exactly** - implement what was designed in solution, don't deviate
- **Test thoroughly** - run all specified tests before committing
- **Communicate changes** - create commits with descriptive messages
- **Verify acceptance** - ensure every criterion is met before marking complete
- **Maintain code quality** - follow existing project patterns and style
- **Handle failures gracefully** - stop immediately if something fails, report clearly
- **Preserve state** - remember what you've done across multiple solutions
- **No breaking changes** - ensure backward compatibility
---
## Success Criteria
✓ All planned tasks completed
✓ All acceptance criteria verified and met
✓ Unit tests pass with 100% success rate
✓ No regressions in existing functionality
✓ Final commit created with descriptive message
✓ Execution result JSON is valid and complete
✓ Code follows existing project conventions

View File

@@ -1,32 +0,0 @@
⚠️ **DEPRECATED** - This file is deprecated as of v2.0 (2025-01-29)
**Use instead**: [`planning-agent.md`](planning-agent.md)
This file has been merged into `planning-agent.md` to consolidate system prompt + user prompt into a single unified source.
**Why the change?**
- Eliminates duplication between system and user prompts
- Reduces token usage by 70% in agent initialization
- Single source of truth for agent instructions
- Easier to maintain and update
**Migration**:
```javascript
// OLD (v1.0)
spawn_agent({ message: Read('prompts/planning-agent-system.md') });
// NEW (v2.0)
spawn_agent({ message: Read('prompts/planning-agent.md') });
```
**Timeline**:
- v2.0 (2025-01-29): Old files kept for backward compatibility
- v2.1 (2025-03-31): Old files will be removed
---
# Planning Agent System Prompt (Legacy - See planning-agent.md instead)
See [`planning-agent.md`](planning-agent.md) for the current unified prompt.
All content below is now consolidated into the new unified prompt file.

View File

@@ -1,224 +0,0 @@
# Planning Agent - Unified Prompt
You are the **Planning Agent** for the Codex issue planning and execution workflow.
## Role Definition
Your responsibility is analyzing issues and creating detailed, executable solution plans. You will:
1. **Receive issues** one at a time via `send_input` messages from the main orchestrator
2. **Analyze each issue** by exploring the codebase, understanding requirements, and identifying the solution approach
3. **Design a comprehensive solution** with task breakdown, acceptance criteria, and implementation steps
4. **Return a structured solution JSON** that the Execution Agent will implement
5. **Maintain context** across multiple issues without closing
---
## Mandatory Initialization Steps
### First Run Only (Read These Files)
1. **Read role definition**: `~/.codex/agents/issue-plan-agent.md` (MUST read first)
2. **Read project tech stack**: `.workflow/project-tech.json`
3. **Read project guidelines**: `.workflow/project-guidelines.json`
4. **Read solution schema**: `~/.claude/workflows/cli-templates/schemas/solution-schema.json`
---
## How to Operate
### Input Format
You will receive `send_input` messages with this structure:
```json
{
"type": "plan_issue",
"issue_id": "ISS-001",
"issue_title": "Add user authentication",
"issue_description": "Implement JWT-based authentication for API endpoints",
"project_root": "/path/to/project"
}
```
### Your Workflow for Each Issue
1. **Analyze the issue**:
- Understand the problem and requirements
- Explore relevant code files
- Identify integration points
- Check for existing patterns
2. **Design the solution**:
- Break down into concrete tasks (2-7 tasks)
- Define file modifications needed
- Create implementation steps
- Define test commands and acceptance criteria
- Identify task dependencies
3. **Generate solution JSON** following this format:
```json
{
"id": "SOL-ISS-001-1",
"issue_id": "ISS-001",
"description": "Brief description of solution",
"tasks": [
{
"id": "T1",
"title": "Task title",
"action": "Create|Modify|Fix|Refactor",
"scope": "file path or directory",
"description": "What to do",
"modification_points": ["Point 1", "Point 2"],
"implementation": ["Step 1", "Step 2", "Step 3"],
"test": {
"commands": ["npm test -- file.test.ts"],
"unit": ["Requirement 1", "Requirement 2"]
},
"acceptance": {
"criteria": ["Criterion 1: Must pass", "Criterion 2: Must satisfy"],
"verification": ["Run tests", "Manual verification"]
},
"depends_on": [],
"estimated_minutes": 30,
"priority": 1
}
],
"exploration_context": {
"relevant_files": ["path/to/file.ts", "path/to/another.ts"],
"patterns": "Follow existing pattern X",
"integration_points": "Used by service X and Y"
},
"analysis": {
"risk": "low|medium|high",
"impact": "low|medium|high",
"complexity": "low|medium|high"
},
"score": 0.95,
"is_bound": true
}
```
### Validation Rules
Ensure:
- ✓ All required fields present in solution JSON
- ✓ No circular dependencies in `task.depends_on`
- ✓ Each task has **quantified** acceptance criteria (not vague)
- ✓ Solution follows `solution-schema.json` exactly
- ✓ Score reflects quality (0.8+ for approval)
- ✓ Total estimated time ≤ 2 hours
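
Some of these rules can be checked mechanically. The sketch below covers only the dependency, time, and score rules; schema validation and the "quantified criteria" judgment are out of scope. `validateSolution` is a hypothetical name, and its return shape loosely mirrors the `validation` object in the return format.

```javascript
// Sketch: check dependency cycles, total time (<= 120 min), and score (>= 0.8).
function validateSolution(solution) {
  const errors = [];
  const tasks = solution.tasks || [];
  const ids = new Set(tasks.map(t => t.id));

  // Report dependencies that point at tasks not in the solution
  for (const t of tasks) {
    for (const d of t.depends_on || []) {
      if (!ids.has(d)) errors.push(`unknown dependency ${d} in ${t.id}`);
    }
  }

  // Cycle check: repeatedly resolve tasks whose known dependencies are resolved
  const resolved = new Set();
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const t of tasks) {
      if (resolved.has(t.id)) continue;
      const deps = (t.depends_on || []).filter(d => ids.has(d));
      if (deps.every(d => resolved.has(d))) {
        resolved.add(t.id);
        progressed = true;
      }
    }
  }
  const noCircularDeps = resolved.size === ids.size;
  if (!noCircularDeps) errors.push("circular dependency in depends_on");

  const totalMinutes = tasks.reduce((sum, t) => sum + (t.estimated_minutes || 0), 0);
  if (totalMinutes > 120) errors.push(`total estimated time ${totalMinutes} min exceeds 120`);
  if (!(solution.score >= 0.8)) errors.push("score below 0.8");

  return { no_circular_deps: noCircularDeps, total_estimated_minutes: totalMinutes, errors };
}
```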
### Return Format
After processing each issue, return this JSON:
```json
{
"status": "completed|failed",
"solution_id": "SOL-ISS-001-1",
"task_count": 3,
"score": 0.95,
"validation": {
"schema_valid": true,
"criteria_quantified": true,
"no_circular_deps": true,
"total_estimated_minutes": 90
},
"errors": []
}
```
---
## Quality Standards
### Completeness
- All required fields must be present
- No missing sections
- Each task must have all sub-fields
### Clarity
- Each task must have specific, measurable acceptance criteria
- Task descriptions must be clear enough for implementation
- Implementation steps must be actionable
### Correctness
- No circular dependencies in task ordering
- Task dependencies form a valid DAG (Directed Acyclic Graph)
- File paths are correct and relative to project root
### Pragmatism
- Solution is minimal and focused on the issue
- Tasks are achievable within 1-2 hours total
- Leverages existing patterns and libraries
---
## Context Preservation
You will receive multiple issues sequentially. **Do NOT close after each issue.** Instead:
- Process each issue independently
- Maintain awareness of the workflow context across issues
- Use consistent naming conventions (`SOL-<issue_id>-1` format, e.g. `SOL-ISS-001-1`)
- Reference previous patterns if applicable to new issues
- Keep track of explored code patterns for consistency
---
## Error Handling
If you cannot complete planning for an issue:
1. **Clearly state what went wrong** - be specific about the issue
2. **Provide the reason** - missing context, unclear requirements, insufficient project info, etc.
3. **Return status: "failed"** - mark the response as failed
4. **Continue waiting** - the orchestrator will send the next issue
5. **Suggest remediation** - if possible, suggest what information is needed
Example error response:
```json
{
"status": "failed",
"solution_id": null,
"error_message": "Cannot plan solution - issue description lacks technical detail. Recommend: clarify whether to use JWT or OAuth, specify API endpoints, define user roles.",
"suggested_clarification": "..."
}
```
---
## Communication Protocol
After processing each issue:
1. Return the response JSON (success or failure)
2. Wait for the next `send_input` with a new issue
3. Continue this cycle until orchestrator closes you
**IMPORTANT**: Do NOT attempt to close yourself. The orchestrator will close you when all planning is complete.
---
## Key Principles
- **Focus on analysis and design** - leave implementation to the Execution Agent
- **Be thorough** - explore code and understand patterns before proposing solutions
- **Be pragmatic** - solutions should be achievable within 1-2 hours
- **Follow schema** - every solution JSON must validate against the solution schema
- **Maintain context** - remember project context across multiple issues
- **Quantify everything** - acceptance criteria must be measurable, not vague
- **No circular logic** - task dependencies must form a valid DAG
---
## Success Criteria
✓ Solution JSON is valid and follows schema exactly
✓ All tasks have quantified acceptance.criteria
✓ No circular dependencies detected
✓ Score >= 0.8
✓ Estimated total time <= 2 hours
✓ Each task is independently verifiable through test.commands

View File

@@ -1,468 +0,0 @@
# Agent Roles Definition
Agent role definitions and scope of responsibility.
---
## Role Assignment
### Planning Agent (Issue-Plan-Agent)
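**Responsibility**: Analyze issues and generate executable solutions
**Role file**: `~/.codex/agents/issue-plan-agent.md`
**Prompt**: `prompts/planning-agent.md`
#### Capabilities
**Allowed**:
- Read code, documentation, and configuration
- Explore project structure and dependencies
- Analyze problems and design solutions
- Break work down into executable steps
- Define acceptance criteria
**Forbidden**:
- Modify code
- Execute code
- Push to a remote
- Delete files or branches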
#### Input Format
```json
{
"type": "plan_issue",
"issue_id": "ISS-001",
"title": "Fix authentication timeout",
"description": "User sessions timeout too quickly",
"project_context": {
"tech_stack": "Node.js + Express + JWT",
"guidelines": "Follow existing patterns",
"relevant_files": ["src/auth.ts", "src/middleware/auth.ts"]
}
}
```
#### Output Format
```json
{
"status": "completed|failed",
"solution_id": "SOL-ISS-001-1",
"tasks": [
{
"id": "T1",
"title": "Update JWT configuration",
"action": "Modify",
"scope": "src/config/auth.ts",
"description": "Increase token expiration time",
"modification_points": ["TOKEN_EXPIRY constant"],
"implementation": ["Step 1", "Step 2"],
"test": {
"commands": ["npm test -- auth.test.ts"],
"unit": ["Token expiry should be 24 hours"]
},
"acceptance": {
"criteria": ["Token valid for 24 hours", "Test suite passes"],
"verification": ["Run tests"]
},
"depends_on": [],
"estimated_minutes": 20,
"priority": 1
}
],
"exploration_context": {
"relevant_files": ["src/auth.ts", "src/middleware/auth.ts"],
"patterns": "Follow existing JWT configuration pattern",
"integration_points": "Used by authentication middleware"
},
"analysis": {
"risk": "low|medium|high",
"impact": "low|medium|high",
"complexity": "low|medium|high"
},
"score": 0.95,
"validation": {
"schema_valid": true,
"criteria_quantified": true,
"no_circular_deps": true
}
}
```
---
### Execution Agent (Issue-Execute-Agent)
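**Responsibility**: Execute the planned solution and implement all of its tasks
**Role file**: `~/.codex/agents/issue-execute-agent.md`
**Prompt**: `prompts/execution-agent.md`
#### Capabilities
**Allowed**:
- Read code and configuration
- Modify code
- Run tests
- Commit code
- Verify acceptance criteria
- Create snapshots for recovery
**Forbidden**:
- Push to remote branches
- Create PRs (unless explicitly authorized)
- Delete branches
- Force-overwrite the main branch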
#### Input Format
```json
{
"type": "execute_solution",
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"solution": {
"id": "SOL-ISS-001-1",
"tasks": [ /* task objects from planning */ ],
"exploration_context": {
"relevant_files": ["src/auth.ts"],
"patterns": "Follow existing pattern",
"integration_points": "Used by auth middleware"
}
},
"project_root": "/path/to/project"
}
```
#### Output Format
```json
{
"status": "completed|failed",
"execution_result_id": "EXR-ISS-001-1",
"issue_id": "ISS-001",
"solution_id": "SOL-ISS-001-1",
"executed_tasks": [
{
"task_id": "T1",
"title": "Update JWT configuration",
"status": "completed",
"files_modified": ["src/config/auth.ts"],
"commits": [
{
"hash": "abc123def456",
"message": "[T1] Update JWT token expiration to 24 hours"
}
],
"test_results": {
"passed": 8,
"failed": 0,
"command": "npm test -- auth.test.ts",
"output": "All tests passed"
},
"acceptance_met": true,
"execution_time_minutes": 15,
"errors": []
}
],
"overall_stats": {
"total_tasks": 1,
"completed": 1,
"failed": 0,
"total_files_modified": 1,
"total_commits": 1,
"total_time_minutes": 15
},
"final_commit": {
"hash": "xyz789abc",
"message": "Resolve ISS-001: Fix authentication timeout"
},
"verification": {
"all_tests_passed": true,
"all_acceptance_met": true,
"no_regressions": true
}
}
```
---
## Dual-Agent Strategy
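### Why Use the Dual-Agent Pattern
1. **Separation of concerns** - planning and execution each focus on one job
2. **Parallel optimization** - execution remains serial, but planning can be optimized independently
3. **Context minimization** - only the solution ID is passed, avoiding context bloat
4. **Error isolation** - a planning failure does not affect execution, and vice versa
5. **Maintainability** - each agent has a single responsibility
### Workflow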
```
┌────────────────────────────────────┐
│ Planning Agent                     │
│ • Analyze issue                    │
│ • Explore codebase                 │
│ • Design solution                  │
│ • Generate tasks                   │
│ • Validate schema                  │
│ → Output: SOL-ISS-001-1 JSON       │
└────────────┬───────────────────────┘
             ▼
      ┌──────────────┐
      │ Save to      │
      │ planning-    │
      │ results.json │
      │ + Bind       │
      └──────┬───────┘
             ▼
┌────────────────────────────────────┐
│ Execution Agent                    │
│ • Load SOL-ISS-001-1               │
│ • Implement T1, T2, T3...          │
│ • Run tests per task               │
│ • Commit changes                   │
│ • Verify acceptance                │
│ → Output: EXR-ISS-001-1 JSON       │
└────────────┬───────────────────────┘
             ▼
      ┌──────────────┐
      │ Save to      │
      │ execution-   │
      │ results.json │
      └──────────────┘
```
---
## Context Minimization
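### Information-Passing Principles
**Goal**: Minimize context to reduce wasted tokens
#### Planning Phase - What Is Passed
- Issue ID and title
- Issue description
- Project tech stack (`project-tech.json`)
- Project guidelines (`project-guidelines.json`)
- Solution schema reference
#### Planning Phase - What Is Not Passed
- A full snapshot of the codebase
- The contents of all related files (the agent explores them itself)
- Historical execution results
- Information about other issues
#### Execution Phase - What Is Passed
- Solution ID (with the complete solution JSON)
- Execution parameters (worktree path, etc.)
- Project tech stack
- Project guidelines
#### Execution Phase - What Is Not Passed
- The full context of the planning phase
- Information about other solutions
- The original issue description (already included in the solution JSON)
### Context Loading Strategy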
```javascript
// Planning Agent loads its own context
const issueDetails = Read(issueStore + issue_id);
const techStack = Read('.workflow/project-tech.json');
const guidelines = Read('.workflow/project-guidelines.json');
const schema = Read('~/.claude/workflows/cli-templates/schemas/solution-schema.json');
// Execution Agent loads its own context (separate agent, separate scope)
const solution = planningResults.find(r => r.solution_id === solutionId);
const techStack = Read('.workflow/project-tech.json');
const guidelines = Read('.workflow/project-guidelines.json');
```
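**Benefits**:
- Less repeated passing of context
- Both agents read the same version of the source files
- Agents can refresh their own context
- Project guidelines and tech stack are easy to update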
---
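## Error Handling and Retry
### Planning Errors
| Error | Cause | Retry Strategy | Recovery |
|-------|-------|----------------|----------|
| Subagent timeout | Complex analysis or slow system | Increase timeout, retry once | Return to user, mark as failed |
| Invalid solution | Output does not conform to the schema | Validate schema, return error | Return to user for correction |
| Dependency cycle | DAG error | Detect the cycle, return error | User fixes manually |
| Permission error | Cannot read a file | Check path and permissions | Return the specific error |
| Format error | Invalid JSON | Validate format, return error | User fixes the format |
### Execution Errors
| Error | Cause | Retry Strategy | Recovery |
|-------|-------|----------------|----------|
| Task failure | Code implementation problem | Inspect the error, no retry | Log the error, mark as failed |
| Test failure | Test cases not satisfied | Do not commit, mark as failed | Return the test output |
| Commit failure | Conflict or permissions | Create a snapshot for recovery | Let the user decide |
| Subagent timeout | Task too complex | Increase timeout | Log the timeout, mark as failed |
| File conflict | Concurrent modification | Create a snapshot | Let the user merge |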
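The retry and recovery columns above can be read as a small lookup table. The following is a minimal sketch under stated assumptions: the key names, the `decide` helper, and every `maxRetries` value other than the planning timeout ("retry once") are illustrative and not part of the specification.

```javascript
// Illustrative encoding of a few rows from the tables above.
// Assumption: only the planning timeout row states a retry count.
const ERROR_POLICY = {
  planning: {
    subagent_timeout: { maxRetries: 1, recovery: 'return to user, mark failed' },
    invalid_solution: { maxRetries: 0, recovery: 'return to user for correction' },
    dependency_cycle: { maxRetries: 0, recovery: 'user fixes manually' },
  },
  execution: {
    task_failed: { maxRetries: 0, recovery: 'log error, mark failed' },
    test_failed: { maxRetries: 0, recovery: 'do not commit, return test output' },
    commit_failed: { maxRetries: 0, recovery: 'create snapshot, let user decide' },
  },
};

// Decide what to do after `attempt` retries have already been used (0 = first failure).
function decide(phase, errorType, attempt = 0) {
  const policy = (ERROR_POLICY[phase] || {})[errorType];
  if (!policy) return { action: 'fail', recovery: 'unclassified error, mark failed' };
  if (attempt < policy.maxRetries) return { action: 'retry' };
  return { action: 'fail', recovery: policy.recovery };
}
```

Keeping the policy as data rather than branching logic makes it easier to compare against the tables when either one changes.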
---
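## Interaction Guide
### Questions for the Planning Agent
```
"What problem does this issue describe?"
→ Returns: problem analysis + root cause
"Which files need to change to solve this problem?"
→ Returns: file list + modification points
"How do we verify that the solution works?"
→ Returns: acceptance criteria + verification steps
"How long is it expected to take?"
→ Returns: estimated time per task + total
"What are the risks?"
→ Returns: risk analysis + impact assessment
```
### Questions for the Execution Agent
```
"What are the implementation steps for this task?"
→ Returns: step-by-step guide + code examples
"Did all tests pass?"
→ Returns: test results + failure reasons (if any)
"Are all acceptance criteria met?"
→ Returns: verification results + unmet items (if any)
"Which files were modified?"
→ Returns: file list + change summary
"Are there any regressions in the code?"
→ Returns: regression test results
```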
---
## Role File Locations
```
~/.codex/agents/
├── issue-plan-agent.md # Planning role definition
├── issue-execute-agent.md # Execution role definition
└── ...
.codex/skills/codex-issue-plan-execute/
├── prompts/
│   ├── planning-agent.md # Planning prompt
│   └── execution-agent.md # Execution prompt
└── specs/
    ├── agent-roles.md # This file
    └── ...
```
### If a Role File Does Not Exist
The orchestrator uses a fallback strategy:
- `universal-executor` as the fallback planning role
- `code-developer` as the fallback execution role
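The fallback rule can be sketched as a path lookup. This is a hedged illustration: `resolveRoleFile` and its `agentsDir` parameter are hypothetical names, and the sketch does not check that the fallback file itself exists.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Fallback role names are taken from the section above; the mapping shape is an assumption.
const FALLBACK_ROLES = {
  'issue-plan-agent': 'universal-executor',
  'issue-execute-agent': 'code-developer',
};

// Return the role file to load, preferring the dedicated role and falling back otherwise.
function resolveRoleFile(role, agentsDir = path.join(os.homedir(), '.codex', 'agents')) {
  const primary = path.join(agentsDir, `${role}.md`);
  if (fs.existsSync(primary)) return primary;
  const fallback = FALLBACK_ROLES[role];
  if (!fallback) throw new Error(`No role file or fallback for: ${role}`);
  return path.join(agentsDir, `${fallback}.md`);
}
```

Passing `agentsDir` explicitly keeps the function testable against a temporary directory instead of the real home directory.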
---
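## Best Practices
### Designing Prompts for the Planning Agent
✓ Extract key information from the issue description
✓ Explore related code and similar implementations
✓ Analyze the root cause and the direction of the fix
✓ Design a minimal solution
✓ Break it into 2-7 executable tasks
✓ Define clear acceptance criteria for each task
✓ Verify that task dependencies contain no cycles
✓ Keep the total estimated time ≤ 2 hours
### Designing Prompts for the Execution Agent
✓ Load the solution and all task definitions
✓ Execute tasks in dependency order
✓ For each task: implement → test → verify
✓ Ensure all acceptance criteria pass
✓ Run the full test suite
✓ Check code quality and style consistency
✓ Write descriptive commit messages
✓ Generate a complete execution result JSON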
---
## Communication Protocol
### Planning Agent Lifecycle
```
1. Initialize (once)
- Read system prompt
- Read role definition
- Load project context
2. Process issues (loop)
- Receive issue via send_input
- Analyze issue
- Design solution
- Return solution JSON
- Wait for next issue
3. Shutdown
- Orchestrator closes when done
```
### Execution Agent Lifecycle
```
1. Initialize (once)
- Read system prompt
- Read role definition
- Load project context
2. Process solutions (loop)
- Receive solution via send_input
- Implement all tasks
- Run tests
- Return execution result
- Wait for next solution
3. Shutdown
- Orchestrator closes when done
```
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 2.0 | 2025-01-29 | Consolidated from subagent-roles.md, updated format |
| 1.0 | 2024-12-29 | Initial agent roles definition |
---
**Document Version**: 2.0
**Last Updated**: 2025-01-29
**Maintained By**: Codex Issue Plan-Execute Team

@@ -1,187 +0,0 @@
# Issue Handling Specification
Core specifications and conventions for issue handling.
## When to Use
| Phase | Usage | Section |
|-------|-------|---------|
| Phase: action-list | Issue list display | Issue Status & Display |
| Phase: action-plan | Issue planning | Solution Planning |
| Phase: action-execute | Issue execution | Solution Execution |
---
## Issue Structure
### Basic Fields
```json
{
"id": "ISS-20250129-001",
"title": "Fix authentication token expiration bug",
"description": "Tokens expire too quickly in production",
"status": "registered",
"priority": "high",
"tags": ["auth", "bugfix"],
"created_at": "2025-01-29T10:00:00Z",
"updated_at": "2025-01-29T10:00:00Z"
}
```
### Workflow Status
| Status | Phase | Description |
|--------|-------|-------------|
| `registered` | Initial | Issue created, awaiting planning |
| `planning` | List → Plan | Planning in progress |
| `planned` | Plan → Execute | Planning complete, solution bound |
| `executing` | Execute | Execution in progress |
| `completed` | Execute → Complete | Execution complete |
| `failed` | Any | Execution failed |
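The status table implies a linear lifecycle with `failed` reachable from any phase. A minimal sketch of a transition check follows; the transition map and the choice to treat `completed` and `failed` as terminal are assumptions inferred from the table, not stated rules.

```javascript
// Assumed forward transitions, read off the Phase column of the status table.
const TRANSITIONS = {
  registered: ['planning'],
  planning: ['planned'],
  planned: ['executing'],
  executing: ['completed'],
  completed: [],
  failed: [],
};

const isKnownStatus = (s) => Object.prototype.hasOwnProperty.call(TRANSITIONS, s);

// Return true when moving an issue from `from` to `to` is allowed.
function canTransition(from, to) {
  if (!isKnownStatus(from) || !isKnownStatus(to)) return false;
  // "failed" applies in any phase; terminal states are assumed to stay terminal.
  if (to === 'failed') return from !== 'completed' && from !== 'failed';
  return TRANSITIONS[from].includes(to);
}
```

A guard like this could run before every status write so that a skipped phase is caught early.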
### Workflow Fields
```json
{
"id": "ISS-xxx",
"status": "registered|planning|planned|executing|completed|failed",
"solution_id": "SOL-xxx-1",
"planned_at": "2025-01-29T11:00:00Z",
"executed_at": "2025-01-29T12:00:00Z",
"error": null
}
```
## Issue List Display
### Format
```
Status Matrix:
Total: 5 | Registered: 2 | Planned: 2 | Completed: 1
Issue Details:
○ [1] ISS-001: Fix login bug (registered)
→ [2] ISS-002: Add MFA support (planning)
✓ [3] ISS-003: Refactor auth (completed)
✗ [4] ISS-004: Update password policy (failed)
```
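### Displayed Fields
- ID: Unique identifier
- Title: Short description
- Status: Current status
- Solution ID: Bound solution (if any)
## Solution Planning
### Planning Input
- Issue ID and title
- Issue description and context
- Project tech stack and guidelines
### Planning Output
- Solution ID: `SOL-{issue-id}-{sequence}`
- Tasks array: list of executable tasks
- Acceptance criteria
- Estimated time
### Planning Subagent Responsibilities
1. Analyze the issue description
2. Explore relevant code paths
3. Design a solution
4. Break it down into executable tasks
5. Define acceptance criteria
### Handling Multiple Solutions
- If several solutions are generated, the user must choose one
- The chosen solution is bound to the issue as the primary solution
- Alternatives are saved but not executed automatically
## Solution Execution
### Execution Order
1. Load the planned solution
2. Execute all tasks in each solution, one at a time
3. For each task: implement → test → verify
4. Commit once after completion
### Execution Subagent Responsibilities
1. Load the solution JSON
2. Implement all tasks
3. Run tests
4. Check acceptance criteria
5. Commit the code and return the result
### Error Recovery
- Task failure: do not commit; mark the solution as failed
- Commit failure: create a snapshot for recovery
- Subagent timeout: log it and continue with the next one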
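## Batch Processing Conventions
### Input Format
```bash
# Single issue
codex issue:plan-execute ISS-001
# Multiple issues
codex issue:plan-execute ISS-001,ISS-002,ISS-003
# Interactive
codex issue:plan-execute
```
### Processing Strategy
- Planning: could run in parallel, but runs serially here for consistency
- Execution: must be serial (avoids conflicting commits)
- Queue: FIFO, no priority ordering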
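The comma-separated input and the serial FIFO rule can be sketched together. `parseIssueArgs` and `processSerially` are hypothetical helper names, and `handler` stands in for whatever plans or executes a single issue.

```javascript
// Turn "ISS-001,ISS-002" into ["ISS-001", "ISS-002"]; an empty argument yields an
// empty queue, which corresponds to the interactive invocation shown above.
function parseIssueArgs(arg) {
  if (!arg) return [];
  return arg.split(',').map((s) => s.trim()).filter(Boolean);
}

// Process issues strictly first-in, first-out, one at a time.
function processSerially(issueIds, handler) {
  const results = [];
  for (const id of issueIds) {
    results.push({ id, result: handler(id) });
  }
  return results;
}
```

For example, `processSerially(parseIssueArgs('ISS-001,ISS-002'), planIssue)` would call a hypothetical `planIssue` for `ISS-001` before `ISS-002`.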
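## State Persistence
### Storage Location
```
.workflow/.scratchpad/codex-issue-{timestamp}/
├── state.json # Current state snapshot
├── state-history.json # State change history
├── queue.json # Execution queue
├── solutions/ # Solution files
├── snapshots/ # Workflow snapshots
└── final-report.md # Final report
```
### Snapshot Uses
- Recovery: resume from the point of interruption
- Debugging: record state changes at each stage
- Audit: trace the full execution
## Quality Assurance
### Acceptance Checklist
- [ ] Issue specification is clear
- [ ] Solution conforms to the schema
- [ ] All tasks have acceptance criteria
- [ ] Execution success rate >= 80%
- [ ] Report is complete
### Error Classification
| Level | Type | Handling |
|-------|------|----------|
| Critical | Planning failure, commit failure | Abort the issue |
| Warning | Test failure, unmet criteria | Log and continue |
| Info | Timeout, network latency | Log only |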

@@ -1,231 +0,0 @@
# Quality Standards
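Quality assessment standards and acceptance criteria.
## Quality Dimensions
### 1. Completeness - 25%
**Definition**: All required structures and fields are present
- [ ] Every issue has a planning or execution result
- [ ] Every solution has a complete task list
- [ ] Every task has acceptance criteria
- [ ] The status log is fully recorded
**Scoring**:
- 90-100%: Fully complete; optional fields may be missing
- 70-89%: Main fields complete; some optional fields missing
- 50-69%: Core fields complete; important fields missing
- <50%: Structure incomplete
### 2. Consistency - 25%
**Definition**: Terminology, format, and style are uniform across the workflow
- [ ] Issue ID and Solution ID formats are uniform
- [ ] Status values follow the specification
- [ ] Task structure is consistent
- [ ] Timestamps use a consistent format (ISO-8601)
**Scoring**:
- 90-100%: Fully consistent, no format drift
- 70-89%: Mostly consistent, occasional format variation
- 50-69%: About half consistent, visibly messy
- <50%: Severely inconsistent
### 3. Correctness - 25%
**Definition**: Execution completes without errors and all acceptance criteria pass
- [ ] No circular dependencies in the DAG
- [ ] All tests pass
- [ ] All acceptance criteria are verified
- [ ] No code conflicts
**Scoring**:
- 90-100%: Fully correct, no errors
- 70-89%: Mostly correct, <10% error rate
- 50-69%: Noticeable errors, 10-30% error rate
- <50%: Too many errors, >30% error rate
### 4. Clarity - 25%
**Definition**: Documentation is easy to read and logically clear
- [ ] Task descriptions are specific and actionable
- [ ] Acceptance criteria are concrete
- [ ] The report is clearly structured and easy to follow
- [ ] Error messages are detailed and helpful
**Scoring**:
- 90-100%: Very clear at a glance
- 70-89%: Mostly clear, basically readable
- 50-69%: Partly clear, hard to follow
- <50%: Very unclear, difficult to understand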
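## Quality Gates
### Pass
**Condition**: Overall score >= 80%
**Result**: The workflow completed normally and can move to the next stage
**Checklist**:
- [ ] All issues are planned or executed
- [ ] Success rate >= 80%
- [ ] No critical errors
- [ ] Report is complete
### Review
**Condition**: Overall score 60-79%
**Result**: The workflow partially completed; there are items to improve
**Common problems**:
- Some tasks failed
- Some acceptance criteria were not met
- Documentation is incomplete
**How to improve**:
- Inspect the failed tasks
- Add the missing documentation
- Tune the workflow configuration
### Fail
**Condition**: Overall score < 60%
**Result**: The workflow failed and must be redone
**Common causes**:
- A key task failed
- Planning was interrupted
- Too many system errors
- A valid report could not be generated
**Recovery**:
- Restore from a snapshot
- Fix the root cause
- Re-plan and re-execute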
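## Issue Classification
### Errors (must fix)
| Error | Impact | Handling |
|-------|--------|----------|
| Circular dependency in the DAG | Critical | Abort planning |
| Task without acceptance criteria | High | Add criteria |
| Commit failure | High | Investigate and retry |
| Planning subagent timeout | Medium | Retry or skip |
| Invalid solution ID | Medium | Regenerate |
### Warnings (should fix)
| Warning | Impact | Handling |
|---------|--------|----------|
| Task execution takes too long | Medium | Consider splitting |
| Low test coverage | Medium | Add tests |
| Multiple solutions | Low | Choose explicitly |
| Vague criteria | Low | Improve wording |
### Info (optional improvements)
| Info | Note |
|------|------|
| Suggested task count | 2-7 tasks is optimal |
| Time suggestion | Total duration <= 2 hours is preferable |
| Code style | Check that project conventions are followed |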
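The two sizing suggestions in the Info table (2-7 tasks, at most 2 hours in total) lend themselves to a quick advisory check. The sketch below assumes tasks carry an `estimated_minutes` field as in the solution schema; `checkSolutionSize` is a hypothetical name and its output format is illustrative.

```javascript
// Return advisory notes when a solution falls outside the suggested size limits.
function checkSolutionSize(solution) {
  const tasks = solution.tasks || [];
  const totalMinutes = tasks.reduce((sum, t) => sum + (t.estimated_minutes || 0), 0);
  const notes = [];
  if (tasks.length < 2 || tasks.length > 7) {
    notes.push(`task count ${tasks.length} is outside the suggested 2-7 range`);
  }
  if (totalMinutes > 120) {
    notes.push(`estimated ${totalMinutes} minutes exceeds the suggested 2 hours`);
  }
  return { totalMinutes, notes };
}
```

Because these are Info-level suggestions, the function only reports notes and never throws.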
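## Execution Checklist
### Planning Stage
- [ ] Issue description is clear
- [ ] A valid solution was generated
- [ ] All tasks have acceptance criteria
- [ ] Dependencies are correct
### Execution Stage
- [ ] Every task is fully implemented
- [ ] All tests pass
- [ ] All acceptance criteria are verified
- [ ] Commit messages follow conventions
### Completion Stage
- [ ] The final report was generated
- [ ] Statistics are accurate
- [ ] State persistence is complete
- [ ] Snapshots were saved correctly
## Automated Validation Function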
```javascript
function runQualityChecks(workDir) {
  const state = JSON.parse(Read(`${workDir}/state.json`));
  const issues = state.issues || {};
  const scores = {
    completeness: checkCompleteness(issues),
    consistency: checkConsistency(state),
    correctness: checkCorrectness(issues),
    clarity: checkClarity(state)
  };
  // Each dimension carries equal weight (25%)
  const overall = Object.values(scores).reduce((a, b) => a + b, 0) / 4;
  return {
    scores: scores,
    overall: overall.toFixed(1),
    gate: overall >= 80 ? 'pass' : overall >= 60 ? 'review' : 'fail',
    details: {
      issues_total: Object.keys(issues).length,
      completed: Object.values(issues).filter(i => i.status === 'completed').length,
      failed: Object.values(issues).filter(i => i.status === 'failed').length
    }
  };
}
```
## Report Template
```markdown
# Quality Report
## Scores
| Dimension | Score | Status |
|-----------|-------|--------|
| Completeness | 90% | ✓ |
| Consistency | 85% | ✓ |
| Correctness | 92% | ✓ |
| Clarity | 88% | ✓ |
| **Overall** | **89%** | **PASS** |
## Issues Summary
- Total: 10
- Completed: 8 (80%)
- Failed: 2 (20%)
- Pending: 0 (0%)
## Recommendations
1. ...
2. ...
## Errors & Warnings
### Errors (0)
None
### Warnings (1)
- Task T4 in ISS-003 took 45 minutes (expected 30)
```

@@ -1,270 +0,0 @@
# Solution Schema Specification
Data structure and validation rules for solutions.
## When to Use
| Phase | Usage | Section |
|-------|-------|---------|
| Phase: action-plan | Solution generation | Solution Structure |
| Phase: action-execute | Task parsing | Task Definition |
---
## Solution Structure
### Full Schema
```json
{
"id": "SOL-ISS-001-1",
"issue_id": "ISS-001",
"description": "Fix authentication token expiration by extending TTL",
"strategy_type": "bugfix",
"created_at": "2025-01-29T11:00:00Z",
"tasks": [
{
"id": "T1",
"title": "Update token TTL configuration",
"action": "Modify",
"scope": "src/config/auth.ts",
"description": "Increase JWT token expiration from 1h to 24h",
"modification_points": [
{
"file": "src/config/auth.ts",
"target": "JWT_EXPIRY",
"change": "Change value from 3600 to 86400"
}
],
"implementation": [
"Open src/config/auth.ts",
"Locate JWT_EXPIRY constant",
"Update value: 3600 → 86400",
"Add comment explaining change"
],
"test": {
"commands": ["npm test -- auth.config.test.ts"],
"unit": ["Token expiration should be 24h"],
"integration": []
},
"acceptance": {
"criteria": [
"Unit tests pass",
"Token TTL is correctly set",
"No breaking changes to API"
],
"verification": [
"Run: npm test",
"Manual: Verify token in console"
]
},
"depends_on": [],
"estimated_minutes": 15,
"priority": 1
}
],
"exploration_context": {
"relevant_files": [
"src/config/auth.ts",
"src/services/auth.service.ts",
"tests/auth.test.ts"
],
"patterns": "Follow existing config pattern in .env",
"integration_points": "Used by AuthService in middleware"
},
"analysis": {
"risk": "low",
"impact": "medium",
"complexity": "low"
},
"score": 0.95,
"is_bound": true
}
```
## Field Reference
### Basic Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | ✓ | Unique ID: SOL-{issue-id}-{seq} |
| `issue_id` | string | ✓ | Associated issue ID |
| `description` | string | ✓ | Solution description |
| `strategy_type` | string | | Strategy type: bugfix/feature/refactor |
| `tasks` | array | ✓ | Task list, at least one |
### Task Fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Task ID: T1, T2, ... |
| `title` | string | Task title |
| `action` | string | Action type: Create/Modify/Fix/Refactor |
| `scope` | string | Scope: file or directory |
| `modification_points` | array | List of specific modification points |
| `implementation` | array | Implementation steps |
| `test` | object | Test commands and cases |
| `acceptance` | object | Acceptance criteria and verification steps |
| `depends_on` | array | Task dependencies: [T1, T2] |
| `estimated_minutes` | number | Estimated duration (minutes) |
### Acceptance Criteria
```json
{
"acceptance": {
"criteria": [
"Unit tests pass",
"Function returns correct result",
"No performance regression"
],
"verification": [
"Run: npm test -- module.test.ts",
"Manual: Call function and verify output"
]
}
}
```
## Validation Rules
### Required Field Checks
```javascript
function validateSolution(solution) {
  if (!solution.id) throw new Error("Missing: id");
  if (!solution.issue_id) throw new Error("Missing: issue_id");
  if (!solution.description) throw new Error("Missing: description");
  if (!Array.isArray(solution.tasks)) throw new Error("tasks must be array");
  if (solution.tasks.length === 0) throw new Error("tasks cannot be empty");
  return true;
}
function validateTask(task) {
  if (!task.id) throw new Error("Missing: task.id");
  if (!task.title) throw new Error("Missing: task.title");
  if (!task.action) throw new Error("Missing: task.action");
  if (!Array.isArray(task.implementation)) throw new Error("implementation must be array");
  if (!task.acceptance) throw new Error("Missing: task.acceptance");
  if (!Array.isArray(task.acceptance.criteria)) throw new Error("acceptance.criteria must be array");
  if (task.acceptance.criteria.length === 0) throw new Error("acceptance.criteria cannot be empty");
  return true;
}
```
### Format Validation
- ID format: `SOL-ISS-\d+-\d+`
- Action values: Create | Modify | Fix | Refactor | Add | Remove
- Risk/Impact/Complexity values: low | medium | high
- Score range: 0.0 - 1.0
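The format rules above could be checked along these lines. This is a sketch under assumptions: `validateFormat` is a hypothetical name, the regular expression is anchored at both ends (the rule as written does not say whether it is), and errors are collected rather than thrown.

```javascript
// Assumption: the ID pattern is meant to match the whole string.
const SOLUTION_ID = /^SOL-ISS-\d+-\d+$/;
const ACTIONS = new Set(['Create', 'Modify', 'Fix', 'Refactor', 'Add', 'Remove']);
const LEVELS = new Set(['low', 'medium', 'high']);

// Return a list of format problems; an empty list means the solution passed.
function validateFormat(solution) {
  const errors = [];
  if (!SOLUTION_ID.test(solution.id || '')) errors.push(`bad id: ${solution.id}`);
  for (const task of solution.tasks || []) {
    if (!ACTIONS.has(task.action)) errors.push(`bad action in ${task.id}: ${task.action}`);
  }
  for (const key of ['risk', 'impact', 'complexity']) {
    const value = solution.analysis && solution.analysis[key];
    if (value !== undefined && !LEVELS.has(value)) errors.push(`bad ${key}: ${value}`);
  }
  const score = solution.score;
  if (score !== undefined && !(typeof score === 'number' && score >= 0 && score <= 1)) {
    errors.push(`bad score: ${score}`);
  }
  return errors;
}
```

With the anchors assumed here, an ID derived from a dated issue ID such as `ISS-20250129-001` would be rejected, so the pattern may need widening if such IDs are in use.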
## Task Dependencies
### Representation
```json
{
"tasks": [
{
"id": "T1",
"title": "Create auth module",
"depends_on": []
},
{
"id": "T2",
"title": "Add authentication logic",
"depends_on": ["T1"]
},
{
"id": "T3",
"title": "Add tests",
"depends_on": ["T1", "T2"]
}
]
}
```
### DAG Validation
```javascript
function validateDAG(tasks) {
  const visited = new Set();
  const recursionStack = new Set();
  function hasCycle(taskId) {
    visited.add(taskId);
    recursionStack.add(taskId);
    const task = tasks.find(t => t.id === taskId);
    // A missing task or a task without depends_on has no outgoing edges
    for (const dep of (task && task.depends_on) || []) {
      if (!visited.has(dep)) {
        if (hasCycle(dep)) return true;
      } else if (recursionStack.has(dep)) {
        return true; // Cycle found
      }
    }
    // Always leave the recursion stack, otherwise later tasks report false cycles
    recursionStack.delete(taskId);
    return false;
  }
  for (const task of tasks) {
    if (!visited.has(task.id) && hasCycle(task.id)) {
      throw new Error(`Circular dependency detected: ${task.id}`);
    }
  }
  return true;
}
```
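Once the dependency graph is known to be acyclic, the same `depends_on` data can give an execution order. The sketch below uses a layered topological sort (Kahn's algorithm); `executionOrder` is a hypothetical helper and is not part of the schema.

```javascript
// Return task IDs in an order where every task comes after its dependencies.
function executionOrder(tasks) {
  const remaining = new Map(tasks.map((t) => [t.id, new Set(t.depends_on || [])]));
  const order = [];
  while (remaining.size > 0) {
    // Tasks whose dependencies have all been scheduled already.
    const ready = [...remaining.keys()].filter((id) => remaining.get(id).size === 0);
    if (ready.length === 0) throw new Error('Circular or unresolved dependency');
    for (const id of ready) {
      order.push(id);
      remaining.delete(id);
    }
    // Mark the scheduled tasks as satisfied for everything still waiting.
    for (const deps of remaining.values()) {
      for (const id of ready) deps.delete(id);
    }
  }
  return order;
}
```

A dependency on a task ID that is not in the list never becomes satisfied, so it is reported through the same error as a cycle.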
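## File Storage
### Location
```
.workflow/.scratchpad/codex-issue-{timestamp}/solutions/
├── ISS-001-plan.json # Planning result
├── ISS-001-execution.json # Execution result
├── ISS-002-plan.json
└── ISS-002-execution.json
```
### File Contents
**Planning result**: Contains the full solution definition
**Execution result**: Contains execution status and commit information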
```json
{
"solution_id": "SOL-ISS-001-1",
"status": "completed|failed",
"executed_at": "ISO-8601",
"execution_result": {
"files_modified": ["src/auth.ts"],
"commit_hash": "abc123...",
"tests_passed": true
}
}
```
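## Quality Gates
### Solution Scoring Criteria
| Metric | Weight | Scoring Method |
|--------|--------|----------------|
| Task completeness | 30% | No empty tasks; every task has acceptance criteria |
| Dependency validity | 20% | No circular dependencies; clear dependency chain |
| Testable acceptance | 30% | Criteria are specific and testable, with verification steps |
| Complexity assessment | 20% | Risk/Impact/Complexity are reasonably assessed |
### Pass Conditions
- All required fields are present
- No format errors
- No circular dependencies
- Score >= 0.8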
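The weighted score can be worked through in a few lines. The weights and the 0.8 threshold come from the tables above; the part names, the 0-1 scale for each part, and the rounding to two decimals are assumptions made for this sketch.

```javascript
// Weights from the scoring table; the keys are illustrative names for the four metrics.
const WEIGHTS = { completeness: 0.3, dependencies: 0.2, testability: 0.3, assessment: 0.2 };

// Combine per-metric scores (each assumed to be in [0, 1]) into one solution score.
function solutionScore(parts) {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    const value = parts[name];
    if (typeof value !== 'number' || value < 0 || value > 1) {
      throw new Error(`Expected ${name} in [0, 1]`);
    }
    total += weight * value;
  }
  return Math.round(total * 100) / 100;
}

// The score condition from the pass list; the other conditions are checked separately.
function passesScoreGate(score) {
  return score >= 0.8;
}
```

For instance, a solution that is perfect except for half-credit on testable acceptance scores 0.3 + 0.2 + 0.15 + 0.2 = 0.85 and clears the score gate.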

@@ -1,32 +0,0 @@
⚠️ **DEPRECATED** - This file is deprecated as of v2.0 (2025-01-29)
**Use instead**: [`agent-roles.md`](agent-roles.md)
This file has been superseded by a consolidated `agent-roles.md` that improves organization and eliminates duplication.
**Why the change?**
- Consolidates all agent role definitions in one place
- Eliminates duplicated role descriptions
- Single source of truth for agent capabilities
- Better organization with unified reference format
**Migration**:
```javascript
// OLD (v1.0)
// Reference: specs/subagent-roles.md
// NEW (v2.0)
// Reference: specs/agent-roles.md
```
**Timeline**:
- v2.0 (2025-01-29): Old file kept for backward compatibility
- v2.1 (2025-03-31): Old file will be removed
---
# Subagent Roles Definition (Legacy - See agent-roles.md instead)
See [`agent-roles.md`](agent-roles.md) for the current consolidated agent roles specification.
All content has been merged into the new agent-roles.md file with improved organization and formatting.

@@ -0,0 +1,414 @@
---
name: workflow-plan
description: 5-phase planning workflow with action-planning-agent task generation, outputs IMPL_PLAN.md and task JSONs. Triggers on "workflow:plan".
allowed-tools: spawn_agent, wait, send_input, close_agent, AskUserQuestion, Read, Write, Edit, Bash, Glob, Grep
---
# Workflow Plan
5-phase planning workflow that orchestrates session discovery, context gathering, conflict resolution, and task generation to produce implementation plans (IMPL_PLAN.md, task JSONs, TODO_LIST.md).
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│ Workflow Plan Orchestrator (SKILL.md)                           │
│ → Pure coordinator: Execute phases, parse outputs, pass context │
└─────────────────┬───────────────────────────────────────────────┘
     ┌────────────┼────────────┬────────────┬────────────┐
     ↓            ↓            ↓            ↓            ↓
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Phase 1  │ │ Phase 2  │ │ Phase 3  │ │ Phase 3.5│ │ Phase 4  │
│ Session  │ │ Context  │ │ Conflict │ │   Gate   │ │   Task   │
│ Discovery│ │  Gather  │ │ Resolve  │ │(Optional)│ │ Generate │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
     ↓            ↓            ↓                         ↓
 sessionId   contextPath   resolved                 IMPL_PLAN.md
            conflict_risk  artifacts                task JSONs
                                                    TODO_LIST.md
```
## Key Design Principles
1. **Pure Orchestrator**: Execute phases in sequence, parse outputs, pass context between them
2. **Auto-Continue**: All phases run autonomously without user intervention between phases
3. **Subagent Lifecycle**: Explicit lifecycle management with spawn_agent → wait → close_agent
4. **Progressive Phase Loading**: Phase docs are read on-demand, not all at once
5. **Conditional Execution**: Phase 3 only executes when conflict_risk >= medium
6. **Role Path Loading**: Subagent roles loaded via path reference in MANDATORY FIRST STEPS
## Auto Mode
When `--yes` or `-y`: Auto-continue all phases (skip confirmations), use recommended conflict resolutions.
## Execution Flow
```
Input Parsing:
└─ Convert user input to structured format (GOAL/SCOPE/CONTEXT)
Phase 1: Session Discovery
└─ Ref: phases/01-session-discovery.md
└─ Output: sessionId (WFS-xxx)
Phase 2: Context Gathering
└─ Ref: phases/02-context-gathering.md
├─ Tasks attached: Analyze structure → Identify integration → Generate package
└─ Output: contextPath + conflict_risk
Phase 3: Conflict Resolution
└─ Decision (conflict_risk check):
├─ conflict_risk ≥ medium → Ref: phases/03-conflict-resolution.md
│ ├─ Tasks attached: Detect conflicts → Present to user → Apply strategies
│ └─ Output: Modified brainstorm artifacts
└─ conflict_risk < medium → Skip to Phase 4
Phase 4: Task Generation
└─ Ref: phases/04-task-generation.md
└─ Output: IMPL_PLAN.md, task JSONs, TODO_LIST.md
Return:
└─ Summary with recommended next steps
```
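The Phase 3 branch in the flow above reduces to a comparison on an ordered risk scale. The sketch below assumes the scale `none < low < medium < high`; the document names `none`, `low`, and `medium`, and `high` is inferred from the "≥ medium" wording. Function and phase names are illustrative.

```javascript
// Assumed ordering of conflict_risk values, lowest to highest.
const RISK_ORDER = ['none', 'low', 'medium', 'high'];

// Phase 3 runs only when the risk is at least "medium".
function shouldRunConflictResolution(conflictRisk) {
  const level = RISK_ORDER.indexOf(conflictRisk);
  if (level === -1) throw new Error(`Unknown conflict_risk: ${conflictRisk}`);
  return level >= RISK_ORDER.indexOf('medium');
}

// List the phases that would execute for a given risk level.
function plannedPhases(conflictRisk) {
  const phases = ['session-discovery', 'context-gathering'];
  if (shouldRunConflictResolution(conflictRisk)) phases.push('conflict-resolution');
  phases.push('task-generation');
  return phases;
}
```

Rejecting unknown values instead of defaulting keeps a typo in `conflict_risk` from silently skipping conflict resolution.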
**Phase Reference Documents** (read on-demand when phase executes):
| Phase | Document | Purpose |
|-------|----------|---------|
| 1 | [phases/01-session-discovery.md](phases/01-session-discovery.md) | Session creation/discovery with intelligent session management |
| 2 | [phases/02-context-gathering.md](phases/02-context-gathering.md) | Project context collection via context-search-agent |
| 3 | [phases/03-conflict-resolution.md](phases/03-conflict-resolution.md) | Conflict detection and resolution with CLI analysis |
| 4 | [phases/04-task-generation.md](phases/04-task-generation.md) | Implementation plan and task JSON generation |
## Core Rules
1. **Start Immediately**: First action is TodoWrite initialization, second action is Phase 1 execution
2. **No Preliminary Analysis**: Do not read files, analyze structure, or gather context before Phase 1
3. **Parse Every Output**: Extract required data from each phase output for next phase
4. **Auto-Continue via TodoList**: Check TodoList status to execute next pending phase automatically
5. **Track Progress**: Update TodoWrite dynamically with task attachment/collapse pattern
6. **Progressive Phase Loading**: Read phase docs ONLY when that phase is about to execute
7. **DO NOT STOP**: Continuous multi-phase workflow. After completing each phase, immediately proceed to next
8. **Explicit Lifecycle**: Always close_agent after wait completes to free resources
## Subagent API Reference
### spawn_agent
Create a new subagent with task assignment.
```javascript
const agentId = spawn_agent({
message: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/{agent-type}.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
## TASK CONTEXT
${taskContext}
## DELIVERABLES
${deliverables}
`
})
```
### wait
Get results from subagent (only way to retrieve results).
```javascript
const result = wait({
ids: [agentId],
timeout_ms: 600000 // 10 minutes
})
if (result.timed_out) {
// Handle timeout - can continue waiting or send_input to prompt completion
}
```
### send_input
Continue interaction with active subagent (for clarification or follow-up).
```javascript
send_input({
id: agentId,
message: `
## CLARIFICATION ANSWERS
${answers}
## NEXT STEP
Continue with plan generation.
`
})
```
### close_agent
Clean up subagent resources (irreversible).
```javascript
close_agent({ id: agentId })
```
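The spawn, wait, close sequence can be wrapped so that cleanup cannot be forgotten. This is a sketch, not the orchestrator's actual code: the `api` object is a hypothetical bundle of the calls described above, passed in so the wrapper can run against stubs, and treating a timeout as final is a simplification.

```javascript
// Run one subagent to completion and always release it afterwards.
function runSubagent(api, message, timeoutMs = 600000) {
  const agentId = api.spawn_agent({ message });
  try {
    const result = api.wait({ ids: [agentId], timeout_ms: timeoutMs });
    if (result.timed_out) {
      // Simplification: a real orchestrator might keep waiting or call send_input first.
      return { ok: false, reason: 'timeout' };
    }
    return { ok: true, result };
  } finally {
    // Runs on success, on timeout, and when wait throws.
    api.close_agent({ id: agentId });
  }
}
```

Putting `close_agent` in `finally` mirrors the "always close after wait" rule even on error paths.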
## Input Processing
**Convert User Input to Structured Format**:
1. **Simple Text** → Structure it:
```
User: "Build authentication system"
Structured:
GOAL: Build authentication system
SCOPE: Core authentication features
CONTEXT: New implementation
```
2. **Detailed Text** → Extract components:
```
User: "Add JWT authentication with email/password login and token refresh"
Structured:
GOAL: Implement JWT-based authentication
SCOPE: Email/password login, token generation, token refresh endpoints
CONTEXT: JWT token-based security, refresh token rotation
```
3. **File Reference** (e.g., `requirements.md`) → Read and structure:
- Read file content
- Extract goal, scope, requirements
- Format into structured description
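The GOAL/SCOPE/CONTEXT format is simple enough to build and read back mechanically. A minimal sketch, assuming one field per line; `formatStructured` and `parseStructured` are hypothetical helper names.

```javascript
// Build the three-line structured description.
function formatStructured({ goal, scope, context }) {
  return [`GOAL: ${goal}`, `SCOPE: ${scope}`, `CONTEXT: ${context}`].join('\n');
}

// Read a structured description back into an object; unknown lines are ignored.
function parseStructured(text) {
  const out = {};
  for (const line of text.split('\n')) {
    const match = /^\s*(GOAL|SCOPE|CONTEXT):\s*(.*)$/.exec(line);
    if (match) out[match[1].toLowerCase()] = match[2].trim();
  }
  return out;
}
```

Deriving the three fields from free text is the judgment step the orchestrator performs; this sketch only covers the formatting around it.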
## Data Flow
```
User Input (task description)
[Convert to Structured Format]
↓ Structured Description:
↓ GOAL: [objective]
↓ SCOPE: [boundaries]
↓ CONTEXT: [background]
Phase 1: session:start --auto "structured-description"
↓ Output: sessionId
↓ Write: planning-notes.md (User Intent section)
Phase 2: context-gather --session sessionId "structured-description"
↓ Input: sessionId + structured description
↓ Output: contextPath (context-package.json with prioritized_context) + conflict_risk
↓ Update: planning-notes.md (Context Findings + Consolidated Constraints)
Phase 3: conflict-resolution [AUTO-TRIGGERED if conflict_risk ≥ medium]
↓ Input: sessionId + contextPath + conflict_risk
↓ Output: Modified brainstorm artifacts
↓ Update: planning-notes.md (Conflict Decisions + Consolidated Constraints)
↓ Skip if conflict_risk is none/low → proceed directly to Phase 4
Phase 4: task-generate-agent --session sessionId
↓ Input: sessionId + planning-notes.md + context-package.json + brainstorm artifacts
↓ Output: IMPL_PLAN.md, task JSONs, TODO_LIST.md
Return summary to user
```
**Session Memory Flow**: Each phase receives session ID, which provides access to:
- Previous task summaries
- Existing context and analysis
- Brainstorming artifacts (potentially modified by Phase 3)
- Session-specific configuration
## TodoWrite Pattern
**Core Concept**: Dynamic task attachment and collapse for real-time visibility into workflow execution.
### Key Principles
1. **Task Attachment** (when phase executed):
- Sub-command's internal tasks are **attached** to orchestrator's TodoWrite
- **Phase 2, 3**: Multiple sub-tasks attached (e.g., Phase 2.1, 2.2, 2.3)
- **Phase 4**: Single agent task attached
- First attached task marked as `in_progress`, others as `pending`
- Orchestrator **executes** these attached tasks sequentially
2. **Task Collapse** (after sub-tasks complete):
- **Applies to Phase 2, 3**: Remove detailed sub-tasks from TodoWrite
- **Collapse** to high-level phase summary
- **Phase 4**: No collapse needed (single task, just mark completed)
- Maintains clean orchestrator-level view
3. **Continuous Execution**:
- After completion, automatically proceed to next pending phase
- No user intervention required between phases
- TodoWrite dynamically reflects current execution state
**Lifecycle**: Initial pending tasks → Phase executed (tasks ATTACHED) → Sub-tasks executed sequentially → Phase completed (tasks COLLAPSED) → Next phase begins → Repeat until all phases complete.
## Phase-Specific TodoWrite Updates
### Phase 2 (Tasks Attached):
```json
[
{"content": "Phase 1: Session Discovery", "status": "completed"},
{"content": "Phase 2: Context Gathering", "status": "in_progress"},
{"content": " → Analyze codebase structure", "status": "in_progress"},
{"content": " → Identify integration points", "status": "pending"},
{"content": " → Generate context package", "status": "pending"},
{"content": "Phase 4: Task Generation", "status": "pending"}
]
```
### Phase 2 (Collapsed):
```json
[
{"content": "Phase 1: Session Discovery", "status": "completed"},
{"content": "Phase 2: Context Gathering", "status": "completed"},
{"content": "Phase 4: Task Generation", "status": "pending"}
]
```
### Phase 3 (Conditional, Tasks Attached):
```json
[
{"content": "Phase 1: Session Discovery", "status": "completed"},
{"content": "Phase 2: Context Gathering", "status": "completed"},
{"content": "Phase 3: Conflict Resolution", "status": "in_progress"},
{"content": " → Detect conflicts with CLI analysis", "status": "in_progress"},
{"content": " → Present conflicts to user", "status": "pending"},
{"content": " → Apply resolution strategies", "status": "pending"},
{"content": "Phase 4: Task Generation", "status": "pending"}
]
```
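The attach and collapse steps shown in the JSON snapshots can be expressed as two pure functions over the todo list. This is an illustrative sketch: the function names are hypothetical, and sub-tasks are recognized by their leading `→`, following the examples above.

```javascript
// Attach sub-tasks under a phase: the phase and its first sub-task become in_progress.
function attachTasks(todos, phaseContent, subTasks) {
  const result = [];
  for (const todo of todos) {
    if (todo.content !== phaseContent) {
      result.push(todo);
      continue;
    }
    result.push({ ...todo, status: 'in_progress' });
    subTasks.forEach((content, i) => {
      result.push({ content: `  → ${content}`, status: i === 0 ? 'in_progress' : 'pending' });
    });
  }
  return result;
}

// Collapse: drop the sub-task rows and mark the phase itself completed.
function collapseTasks(todos, phaseContent) {
  return todos
    .filter((todo) => !todo.content.trimStart().startsWith('→'))
    .map((todo) => (todo.content === phaseContent ? { ...todo, status: 'completed' } : todo));
}
```

Both functions return new arrays, so each TodoWrite call can receive a full, consistent snapshot.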
## Planning Notes Template
After Phase 1, create `planning-notes.md` with this structure:
```markdown
# Planning Notes
**Session**: ${sessionId}
**Created**: ${timestamp}
## User Intent (Phase 1)
- **GOAL**: ${userGoal}
- **KEY_CONSTRAINTS**: ${userConstraints}
---
## Context Findings (Phase 2)
(To be filled by context-gather)
## Conflict Decisions (Phase 3)
(To be filled if conflicts detected)
## Consolidated Constraints (Phase 4 Input)
1. ${userConstraints}
---
## Task Generation (Phase 4)
(To be filled by action-planning-agent)
## N+1 Context
### Decisions
| Decision | Rationale | Revisit? |
|----------|-----------|----------|
### Deferred
- [ ] (For N+1)
```
## Post-Phase Updates
### After Phase 2
Read context-package to extract key findings, update planning-notes.md:
- `Context Findings (Phase 2)`: CRITICAL_FILES, ARCHITECTURE, CONFLICT_RISK, CONSTRAINTS
- `Consolidated Constraints`: Append Phase 2 constraints
### After Phase 3
If executed, read conflict-resolution.json, update planning-notes.md:
- `Conflict Decisions (Phase 3)`: RESOLVED, MODIFIED_ARTIFACTS, CONSTRAINTS
- `Consolidated Constraints`: Append Phase 3 planning constraints
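Appending to the `Consolidated Constraints` list amounts to finding the section and continuing its numbering. The sketch below works on the notes text directly; `appendConstraints` is a hypothetical helper, and it assumes the heading text matches the template exactly and that the section ends at the next heading or `---` rule.

```javascript
// Append numbered constraints to the Consolidated Constraints section of planning-notes.md.
function appendConstraints(notes, newConstraints) {
  const heading = '## Consolidated Constraints (Phase 4 Input)';
  const lines = notes.split('\n');
  const start = lines.indexOf(heading);
  if (start === -1) throw new Error('Consolidated Constraints section not found');
  let insertAt = start + 1;
  let count = 0;
  for (let i = start + 1; i < lines.length; i += 1) {
    if (/^(#|---)/.test(lines[i])) break; // next section begins
    if (/^\d+\.\s/.test(lines[i])) {
      count += 1;
      insertAt = i + 1;
    }
  }
  const added = newConstraints.map((c, i) => `${count + i + 1}. ${c}`);
  lines.splice(insertAt, 0, ...added);
  return lines.join('\n');
}
```

The result would then be written back to `planning-notes.md` by whichever phase is doing the update.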
### Memory State Check
After Phase 3, evaluate context window usage. If memory usage is high (>120K tokens):
```bash
# Codex: Use compact command if available
codex compact
```
## Phase 4 User Decision
After Phase 4 completes, present user with action choices:
```javascript
AskUserQuestion({
questions: [{
question: "Planning complete. What would you like to do next?",
header: "Next Action",
multiSelect: false,
options: [
{
label: "Verify Plan Quality (Recommended)",
description: "Run quality verification to catch issues before execution."
},
{
label: "Start Execution",
description: "Begin implementing tasks immediately."
},
{
label: "Review Status Only",
description: "View task breakdown and session status without taking further action."
}
]
}]
});
// Execute based on user choice
// "Verify Plan Quality" → workflow:plan-verify --session sessionId
// "Start Execution" → workflow:execute --session sessionId
// "Review Status Only" → workflow:status --session sessionId
```
## Error Handling
- **Parsing Failure**: If output parsing fails, retry command once, then report error
- **Validation Failure**: If validation fails, report which file/data is missing
- **Command Failure**: Keep phase `in_progress`, report error to user, do not proceed to next phase
- **Subagent Timeout**: If wait times out, evaluate whether to continue waiting or send_input to prompt completion
## Coordinator Checklist
- **Pre-Phase**: Convert user input to structured format (GOAL/SCOPE/CONTEXT)
- Initialize TodoWrite before any command (Phase 3 added dynamically after Phase 2)
- Execute Phase 1 immediately with structured description
- Parse session ID from Phase 1 output, store in memory
- Pass session ID and structured description to Phase 2 command
- Parse context path from Phase 2 output, store in memory
- **Extract conflict_risk from context-package.json**: Determine Phase 3 execution
- **If conflict_risk >= medium**: Launch Phase 3 with sessionId and contextPath
- **If conflict_risk is none/low**: Skip Phase 3, proceed directly to Phase 4
- **Build Phase 4 command**: workflow:tools:task-generate-agent --session [sessionId]
- Verify all Phase 4 outputs
- Update TodoWrite after each phase
- After each phase, automatically continue to next phase based on TodoList status
- **Always close_agent after wait completes**
## Related Commands
**Prerequisite Commands**:
- `workflow:brainstorm:artifacts` - Optional: Generate role-based analyses before planning
- `workflow:brainstorm:synthesis` - Optional: Refine brainstorm analyses with clarifications
**Follow-up Commands**:
- `workflow:plan-verify` - Recommended: Verify plan quality before execution
- `workflow:status` - Review task breakdown and current progress
- `workflow:execute` - Begin implementation of generated tasks

@@ -0,0 +1,282 @@
# Phase 1: Session Discovery
Discover existing sessions or start a new workflow session, with intelligent session management and conflict detection.
## Objective
- Ensure project-level state exists (first-time initialization)
- Create or discover workflow session for the planning workflow
- Generate unique session ID (WFS-xxx format)
- Initialize session directory structure
## Step 0: Initialize Project State (First-time Only)
**Executed before all modes** - Ensures project-level state files exist by calling `workflow:init`.
### Check and Initialize
```bash
# Check if project state exists (both files required)
bash(test -f .workflow/project-tech.json && echo "TECH_EXISTS" || echo "TECH_NOT_FOUND")
bash(test -f .workflow/project-guidelines.json && echo "GUIDELINES_EXISTS" || echo "GUIDELINES_NOT_FOUND")
```
**If either NOT_FOUND**, delegate to `workflow:init`:
```bash
# Codex: Execute workflow:init command for intelligent project analysis
codex workflow:init
# Wait for init completion
# project-tech.json and project-guidelines.json will be created
```
**Output**:
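- If both files exist (`TECH_EXISTS` and `GUIDELINES_EXISTS`): `PROJECT_STATE: initialized`
- If either file is missing (`TECH_NOT_FOUND` or `GUIDELINES_NOT_FOUND`): calls `workflow:init`, which creates: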
- `.workflow/project-tech.json` with full technical analysis
- `.workflow/project-guidelines.json` with empty scaffold
**Note**: `workflow:init` uses cli-explore-agent to build comprehensive project understanding (technology stack, architecture, key components). This step runs once per project. Subsequent executions skip initialization.
## Execution
### Step 1.1: Execute Session Start
```bash
# Codex: Execute session start command
codex workflow:session:start --auto "[structured-task-description]"
```
**Task Description Structure**:
```
GOAL: [Clear, concise objective]
SCOPE: [What's included/excluded]
CONTEXT: [Relevant background or constraints]
```
**Example**:
```
GOAL: Build JWT-based authentication system
SCOPE: User registration, login, token validation
CONTEXT: Existing user database schema, REST API endpoints
```
### Step 1.2: Parse Output
- Extract: `SESSION_ID: WFS-[id]` (store as `sessionId`)
### Step 1.3: Validate
- Session ID successfully extracted
- Session directory `.workflow/active/[sessionId]/` exists
**Note**: Session directory contains `workflow-session.json` (metadata). Do NOT look for `manifest.json` here - it only exists in `.workflow/archives/` for archived sessions.
### Step 1.4: Initialize Planning Notes
Create `planning-notes.md` with N+1 context support:
```javascript
const planningNotesPath = `.workflow/active/${sessionId}/planning-notes.md`
const userGoal = structuredDescription.goal
const userConstraints = structuredDescription.context || "None specified"
Write(planningNotesPath, `# Planning Notes
**Session**: ${sessionId}
**Created**: ${new Date().toISOString()}
## User Intent (Phase 1)
- **GOAL**: ${userGoal}
- **KEY_CONSTRAINTS**: ${userConstraints}
---
## Context Findings (Phase 2)
(To be filled by context-gather)
## Conflict Decisions (Phase 3)
(To be filled if conflicts detected)
## Consolidated Constraints (Phase 4 Input)
1. ${userConstraints}
---
## Task Generation (Phase 4)
(To be filled by action-planning-agent)
## N+1 Context
### Decisions
| Decision | Rationale | Revisit? |
|----------|-----------|----------|
### Deferred
- [ ] (For N+1)
`)
```
## Session Types
The `--type` parameter classifies sessions for CCW dashboard organization:
| Type | Description | Default For |
|------|-------------|-------------|
| `workflow` | Standard implementation (default) | `workflow:plan` |
| `review` | Code review sessions | `workflow:review-module-cycle` |
| `tdd` | TDD-based development | `workflow:tdd-plan` |
| `test` | Test generation/fix sessions | `workflow:test-fix-gen` |
| `docs` | Documentation sessions | `memory:docs` |
**Validation**: If `--type` is provided with an invalid value, return an error:
```
ERROR: Invalid session type. Valid types: workflow, review, tdd, test, docs
```
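The validation rule can be sketched as follows. The function name `resolveSessionType` is an assumption for illustration; the type list, the `workflow` default and the error text come from the table and message above.

```javascript
const VALID_SESSION_TYPES = ["workflow", "review", "tdd", "test", "docs"];

// Hypothetical helper: returns the type to store in workflow-session.json.
// Defaults to "workflow" when --type is omitted; throws the documented error otherwise.
function resolveSessionType(type) {
  if (type === undefined || type === null) return "workflow";
  if (!VALID_SESSION_TYPES.includes(type)) {
    throw new Error(
      `ERROR: Invalid session type. Valid types: ${VALID_SESSION_TYPES.join(", ")}`
    );
  }
  return type;
}

console.log(resolveSessionType("tdd"));
console.log(resolveSessionType());
```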
## Mode 1: Discovery Mode (Default)
### Usage
```bash
workflow:session:start
```
### Step 1: List Active Sessions
```bash
bash(ls -1 .workflow/active/ 2>/dev/null | head -5)
```
### Step 2: Display Session Metadata
```bash
bash(cat .workflow/active/WFS-promptmaster-platform/workflow-session.json)
```
### Step 3: User Decision
Present session information and wait for user to select or create session.
**Output**: `SESSION_ID: WFS-[user-selected-id]`
## Mode 2: Auto Mode (Intelligent)
### Usage
```bash
workflow:session:start --auto "task description"
```
### Step 1: Check Active Sessions Count
```bash
bash(find .workflow/active/ -name "WFS-*" -type d 2>/dev/null | wc -l)
```
### Step 2a: No Active Sessions → Create New
```bash
# Generate session slug
bash(echo "implement OAuth2 auth" | sed 's/[^a-zA-Z0-9]/-/g' | tr '[:upper:]' '[:lower:]' | cut -c1-50)
# Create directory structure
bash(mkdir -p .workflow/active/WFS-implement-oauth2-auth/.process)
bash(mkdir -p .workflow/active/WFS-implement-oauth2-auth/.task)
bash(mkdir -p .workflow/active/WFS-implement-oauth2-auth/.summaries)
# Create metadata (include type field, default to "workflow" if not specified)
bash(echo '{"session_id":"WFS-implement-oauth2-auth","project":"implement OAuth2 auth","status":"planning","type":"workflow","created_at":"2024-12-04T08:00:00Z"}' > .workflow/active/WFS-implement-oauth2-auth/workflow-session.json)
```
**Output**: `SESSION_ID: WFS-implement-oauth2-auth`
### Step 2b: Single Active Session → Check Relevance
```bash
# Extract session ID
bash(find .workflow/active/ -name "WFS-*" -type d 2>/dev/null | head -1 | xargs basename)
# Read project name from metadata
bash(cat .workflow/active/WFS-promptmaster-platform/workflow-session.json | grep -o '"project":"[^"]*"' | cut -d'"' -f4)
# Check keyword match (manual comparison)
# If task contains project keywords → Reuse session
# If task unrelated → Create new session (use Step 2a)
```
**Output (reuse)**: `SESSION_ID: WFS-promptmaster-platform`
**Output (new)**: `SESSION_ID: WFS-[new-slug]`
### Step 2c: Multiple Active Sessions → Use First
```bash
# Get first active session
bash(find .workflow/active/ -name "WFS-*" -type d 2>/dev/null | head -1 | xargs basename)
# Output warning and session ID
# WARNING: Multiple active sessions detected
# SESSION_ID: WFS-first-session
```
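The three auto-mode branches (2a, 2b, 2c) can be summarized in one decision function. This is a sketch only: the docs describe the relevance check as a manual keyword comparison, so `sharesKeyword` (any shared word of four or more characters) is an assumed heuristic, and both function names are hypothetical.

```javascript
// Assumed heuristic: the task is "relevant" if it shares a word of 4+ characters
// with the session's project name.
function sharesKeyword(taskDescription, projectName) {
  const words = (s) =>
    s.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length >= 4);
  const projectWords = new Set(words(projectName));
  return words(taskDescription).some((w) => projectWords.has(w));
}

// Hypothetical decision helper mirroring Steps 2a, 2b and 2c.
// isRelevant is a caller-supplied predicate taking a session ID.
function decideAutoMode(activeSessionIds, isRelevant) {
  if (activeSessionIds.length === 0) return { action: "create" };
  if (activeSessionIds.length === 1) {
    const [sessionId] = activeSessionIds;
    return isRelevant(sessionId)
      ? { action: "reuse", sessionId }
      : { action: "create" };
  }
  return {
    action: "reuse",
    sessionId: activeSessionIds[0],
    warning: "Multiple active sessions detected",
  };
}

const decision = decideAutoMode(["WFS-promptmaster-platform"], () =>
  sharesKeyword("add OAuth2 login to promptmaster", "promptmaster platform")
);
console.log(decision);
```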
## Mode 3: Force New Mode
### Usage
```bash
workflow:session:start --new "task description"
```
### Step 1: Generate Unique Session Slug
```bash
# Convert to slug
bash(echo "fix login bug" | sed 's/[^a-zA-Z0-9]/-/g' | tr '[:upper:]' '[:lower:]' | cut -c1-50)
# Check if exists, add counter if needed
bash(test -d .workflow/active/WFS-fix-login-bug && echo "WFS-fix-login-bug-2" || echo "WFS-fix-login-bug")
```
### Step 2: Create Session Structure
```bash
bash(mkdir -p .workflow/active/WFS-fix-login-bug/.process)
bash(mkdir -p .workflow/active/WFS-fix-login-bug/.task)
bash(mkdir -p .workflow/active/WFS-fix-login-bug/.summaries)
```
### Step 3: Create Metadata
```bash
# Include type field from --type parameter (default: "workflow")
bash(echo '{"session_id":"WFS-fix-login-bug","project":"fix login bug","status":"planning","type":"workflow","created_at":"2024-12-04T08:00:00Z"}' > .workflow/active/WFS-fix-login-bug/workflow-session.json)
```
**Output**: `SESSION_ID: WFS-fix-login-bug`
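The `echo '{...}'` one-liner breaks if the task description contains a quote character. A hedged alternative sketch builds the same fields as an object and serializes it; `buildSessionMetadata` is a hypothetical name, and note that `toISOString()` includes milliseconds, unlike the timestamp in the example above.

```javascript
// Hypothetical builder for workflow-session.json content.
// Field names follow the echo example; type defaults to "workflow".
function buildSessionMetadata(sessionId, project, type = "workflow", now = new Date()) {
  return {
    session_id: sessionId,
    project,
    status: "planning",
    type,
    created_at: now.toISOString(),
  };
}

const metadata = buildSessionMetadata("WFS-fix-login-bug", "fix login bug");
console.log(JSON.stringify(metadata));
```

`JSON.stringify` escapes quotes and backslashes in `project`, so the resulting file stays valid JSON for any description.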
## Execution Guideline
- **Non-interrupting**: When called from other commands, this command completes and returns control to the caller without interrupting subsequent tasks.
## Session ID Format
- Pattern: `WFS-[lowercase-slug]`
- Characters: `a-z`, `0-9`, `-` only
- Max length: 50 characters
- Uniqueness: Add numeric suffix if collision (`WFS-auth-2`, `WFS-auth-3`)
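The format rules can be sketched in JavaScript as a mirror of the `sed | tr | cut` pipeline used in Modes 2 and 3. Both function names are hypothetical. The sketch applies the 50-character limit to the slug, as the pipeline does; whether the limit is meant to include the `WFS-` prefix is not stated here.

```javascript
// Mirror of the shell pipeline: replace non-alphanumerics with "-",
// lowercase, and keep the first 50 characters.
function toSlug(description) {
  return description.replace(/[^a-zA-Z0-9]/g, "-").toLowerCase().slice(0, 50);
}

// Build "WFS-<slug>", adding -2, -3, ... until the ID is not in existingIds.
function uniqueSessionId(description, existingIds) {
  const base = `WFS-${toSlug(description)}`;
  const taken = new Set(existingIds);
  if (!taken.has(base)) return base;
  let counter = 2;
  while (taken.has(`${base}-${counter}`)) counter += 1;
  return `${base}-${counter}`;
}

console.log(uniqueSessionId("implement OAuth2 auth", []));
console.log(uniqueSessionId("fix login bug", ["WFS-fix-login-bug", "WFS-fix-login-bug-2"]));
```

Unlike the single `-2` fallback in Mode 3 Step 1, the loop keeps counting until it finds a free suffix.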
## Output Format Specification
### Success
```
SESSION_ID: WFS-session-slug
```
### Error
```
ERROR: --auto mode requires task description
ERROR: Failed to create session directory
```
### Analysis (Auto Mode)
```
ANALYSIS: Task relevance = high
DECISION: Reusing existing session
SESSION_ID: WFS-promptmaster-platform
```
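A caller consuming this output (for example Step 1.2, which extracts `SESSION_ID`) could parse the labeled lines as sketched below. `parseSessionOutput` is a hypothetical helper; the session ID check uses the character set from the Session ID Format section.

```javascript
// Hypothetical parser for the SESSION_ID / ERROR / ANALYSIS / DECISION lines.
// A SESSION_ID that does not match WFS-[a-z0-9-]+ is ignored.
function parseSessionOutput(output) {
  const parsed = { sessionId: null, errors: [], analysis: null, decision: null };
  for (const line of output.split("\n")) {
    const match = line.trim().match(/^(SESSION_ID|ERROR|ANALYSIS|DECISION):\s*(.*)$/);
    if (!match) continue;
    const [, key, value] = match;
    if (key === "SESSION_ID") {
      if (/^WFS-[a-z0-9-]+$/.test(value)) parsed.sessionId = value;
    } else if (key === "ERROR") {
      parsed.errors.push(value);
    } else if (key === "ANALYSIS") {
      parsed.analysis = value;
    } else {
      parsed.decision = value;
    }
  }
  return parsed;
}

const parsedOutput = parseSessionOutput(
  [
    "ANALYSIS: Task relevance = high",
    "DECISION: Reusing existing session",
    "SESSION_ID: WFS-promptmaster-platform",
  ].join("\n")
);
console.log(parsedOutput.sessionId);
```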
## Output
- **Variable**: `sessionId` (e.g., `WFS-implement-oauth2-auth`)
- **File**: `.workflow/active/{sessionId}/planning-notes.md`
- **TodoWrite**: Mark Phase 1 completed, Phase 2 in_progress
## Next Phase
Return to orchestrator showing Phase 1 results, then auto-continue to [Phase 2: Context Gathering](02-context-gathering.md).

@@ -0,0 +1,476 @@
# Phase 2: Context Gathering
Intelligently collect project context with context-search-agent, based on the task description, and package it into standardized JSON.
## Objective
- Check for existing valid context-package before executing
- Assess task complexity and launch parallel exploration agents
- Invoke context-search-agent to analyze codebase
- Generate standardized `context-package.json` with prioritized context
- Detect conflict risk level for Phase 3 decision
## Core Philosophy
- **Agent Delegation**: Delegate all discovery to `context-search-agent` for autonomous execution
- **Detection-First**: Check for existing context-package before executing
- **Plan Mode**: Full comprehensive analysis (vs lightweight brainstorm mode)
- **Standardized Output**: Generate `.workflow/active/{session}/.process/context-package.json`
- **Explicit Lifecycle**: Manage subagent creation, waiting, and cleanup
## Execution Process
```
Input Parsing:
   ├─ Parse flags: --session
   └─ Parse: task_description (required)
Step 1: Context-Package Detection
   └─ Decision (existing package):
      ├─ Valid package exists → Return existing (skip execution)
      └─ No valid package → Continue to Step 2
Step 2: Complexity Assessment & Parallel Explore
   ├─ Analyze task_description → classify Low/Medium/High
   ├─ Select exploration angles (1-4 based on complexity)
   ├─ Launch N cli-explore-agents in parallel (spawn_agent)
   │  └─ Each outputs: exploration-{angle}.json
   ├─ Wait for all agents (batch wait)
   ├─ Close all agents
   └─ Generate explorations-manifest.json
Step 3: Invoke Context-Search Agent (with exploration input)
   ├─ Phase 1: Initialization & Pre-Analysis
   ├─ Phase 2: Multi-Source Discovery
   │  ├─ Track 0: Exploration Synthesis (prioritize & deduplicate)
   │  └─ Track 1-4: Existing tracks
   ├─ Phase 3: Synthesis & Packaging
   │  └─ Generate context-package.json with exploration_results
   └─ Lifecycle: spawn_agent → wait → close_agent
Step 4: Output Verification
   └─ Verify context-package.json contains exploration_results
```
## Execution Flow
### Step 1: Context-Package Detection
**Execute First** - Check if valid package already exists:
```javascript
// Path matches the standardized output location under .workflow/active/
const contextPackagePath = `.workflow/active/${session_id}/.process/context-package.json`;
if (file_exists(contextPackagePath)) {
  // Read returns file content; parse it before inspecting fields
  const existing = JSON.parse(Read(contextPackagePath));
  // Validate package belongs to current session
  if (existing?.metadata?.session_id === session_id) {
    console.log("Valid context-package found for session:", session_id);
    console.log("Stats:", existing.statistics);
    console.log("Conflict Risk:", existing.conflict_detection?.risk_level);
    return existing; // Skip execution, return existing
  } else {
    console.warn("Invalid session_id in existing package, re-generating...");
  }
}
```
### Step 2: Complexity Assessment & Parallel Explore
**Only execute if Step 1 finds no valid package**
```javascript
// 2.1 Complexity Assessment
function analyzeTaskComplexity(taskDescription) {
  const text = taskDescription.toLowerCase();
  if (/architect|refactor|restructure|modular|cross-module/.test(text)) return 'High';
  if (/multiple|several|integrate|migrate|extend/.test(text)) return 'Medium';
  return 'Low';
}
const ANGLE_PRESETS = {
  architecture: ['architecture', 'dependencies', 'modularity', 'integration-points'],
  security: ['security', 'auth-patterns', 'dataflow', 'validation'],
  performance: ['performance', 'bottlenecks', 'caching', 'data-access'],
  bugfix: ['error-handling', 'dataflow', 'state-management', 'edge-cases'],
  feature: ['patterns', 'integration-points', 'testing', 'dependencies'],
  refactor: ['architecture', 'patterns', 'dependencies', 'testing']
};
function selectAngles(taskDescription, complexity) {
  const text = taskDescription.toLowerCase();
  let preset = 'feature';
  if (/refactor|architect|restructure/.test(text)) preset = 'architecture';
  else if (/security|auth|permission/.test(text)) preset = 'security';
  else if (/performance|slow|optimi/.test(text)) preset = 'performance';
  else if (/fix|bug|error|issue/.test(text)) preset = 'bugfix';
  const count = complexity === 'High' ? 4 : (complexity === 'Medium' ? 3 : 1);
  return ANGLE_PRESETS[preset].slice(0, count);
}
const complexity = analyzeTaskComplexity(task_description);
const selectedAngles = selectAngles(task_description, complexity);
const sessionFolder = `.workflow/active/${session_id}/.process`;
// 2.2 Launch Parallel Explore Agents
const explorationAgents = [];
// Spawn all agents in parallel
selectedAngles.forEach((angle, index) => {
const agentId = spawn_agent({
message: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/cli-explore-agent.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
## Task Objective
Execute **${angle}** exploration for task planning context. Analyze codebase from this specific angle to discover relevant structure, patterns, and constraints.
## Assigned Context
- **Exploration Angle**: ${angle}
- **Task Description**: ${task_description}
- **Session ID**: ${session_id}
- **Exploration Index**: ${index + 1} of ${selectedAngles.length}
- **Output File**: ${sessionFolder}/exploration-${angle}.json
## MANDATORY FIRST STEPS (Execute by Agent)
**You (cli-explore-agent) MUST execute these steps in order:**
1. Run: ccw tool exec get_modules_by_depth '{}' (project structure)
2. Run: rg -l "{keyword_from_task}" --type ts (locate relevant files)
3. Execute: cat ~/.claude/workflows/cli-templates/schemas/explore-json-schema.json (get output schema reference)
## Exploration Strategy (${angle} focus)
**Step 1: Structural Scan** (Bash)
- get_modules_by_depth.sh → identify modules related to ${angle}
- find/rg → locate files relevant to ${angle} aspect
- Analyze imports/dependencies from ${angle} perspective
**Step 2: Semantic Analysis** (Gemini CLI)
- How does existing code handle ${angle} concerns?
- What patterns are used for ${angle}?
- Where would new code integrate from ${angle} viewpoint?
**Step 3: Write Output**
- Consolidate ${angle} findings into JSON
- Identify ${angle}-specific clarification needs
## Expected Output
**File**: ${sessionFolder}/exploration-${angle}.json
**Schema Reference**: Schema obtained via cat explore-json-schema.json (step 3 of the agent-executed MANDATORY FIRST STEPS); follow it exactly
**Required Fields** (all ${angle} focused):
- project_structure: Modules/architecture relevant to ${angle}
- relevant_files: Files affected from ${angle} perspective
**IMPORTANT**: Use object format with relevance scores for synthesis:
\`[{path: "src/file.ts", relevance: 0.85, rationale: "Core ${angle} logic"}]\`
Scores: 0.7+ high priority, 0.5-0.7 medium, <0.5 low
- patterns: ${angle}-related patterns to follow
- dependencies: Dependencies relevant to ${angle}
- integration_points: Where to integrate from ${angle} viewpoint (include file:line locations)
- constraints: ${angle}-specific limitations/conventions
- clarification_needs: ${angle}-related ambiguities (options array + recommended index)
- _metadata.exploration_angle: "${angle}"
## Success Criteria
- [ ] Schema obtained via cat explore-json-schema.json
- [ ] get_modules_by_depth.sh executed
- [ ] At least 3 relevant files identified with ${angle} rationale
- [ ] Patterns are actionable (code examples, not generic advice)
- [ ] Integration points include file:line locations
- [ ] Constraints are project-specific to ${angle}
- [ ] JSON output follows schema exactly
- [ ] clarification_needs includes options + recommended
## Output
Write: ${sessionFolder}/exploration-${angle}.json
Return: 2-3 sentence summary of ${angle} findings
`
});
explorationAgents.push(agentId);
});
// 2.3 Batch wait for all exploration agents
const explorationResults = wait({
  ids: explorationAgents,
  timeout_ms: 600000 // 10 minutes
});
// Check for timeouts
if (explorationResults.timed_out) {
  console.log('Some exploration agents timed out - continuing with completed results');
}
// 2.4 Close all exploration agents
explorationAgents.forEach(agentId => {
  close_agent({ id: agentId });
});
// 2.5 Generate Manifest after all complete
const explorationFiles = bash(`find ${sessionFolder} -name "exploration-*.json" -type f`).split('\n').filter(f => f.trim());
const explorationManifest = {
  session_id,
  task_description,
  timestamp: new Date().toISOString(),
  complexity,
  exploration_count: selectedAngles.length,
  angles_explored: selectedAngles,
  explorations: explorationFiles.map(file => {
    const data = JSON.parse(Read(file));
    return {
      angle: data._metadata.exploration_angle,
      file: file.split('/').pop(),
      path: file,
      index: data._metadata.exploration_index
    };
  })
};
Write(`${sessionFolder}/explorations-manifest.json`, JSON.stringify(explorationManifest, null, 2));
```
### Step 3: Invoke Context-Search Agent
**Only execute after Step 2 completes**
```javascript
// Load user intent from planning-notes.md (from Phase 1)
const planningNotesPath = `.workflow/active/${session_id}/planning-notes.md`;
let userIntent = { goal: task_description, key_constraints: "None specified" };
if (file_exists(planningNotesPath)) {
  const notesContent = Read(planningNotesPath);
  const goalMatch = notesContent.match(/\*\*GOAL\*\*:\s*(.+)/);
  const constraintsMatch = notesContent.match(/\*\*KEY_CONSTRAINTS\*\*:\s*(.+)/);
  if (goalMatch) userIntent.goal = goalMatch[1].trim();
  if (constraintsMatch) userIntent.key_constraints = constraintsMatch[1].trim();
}
// Spawn context-search-agent
const contextAgentId = spawn_agent({
message: `
## TASK ASSIGNMENT
### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/context-search-agent.md (MUST read first)
2. Read: .workflow/project-tech.json
3. Read: .workflow/project-guidelines.json
---
## Execution Mode
**PLAN MODE** (Comprehensive) - Full Phase 1-3 execution with priority sorting
## Session Information
- **Session ID**: ${session_id}
- **Task Description**: ${task_description}
- **Output Path**: .workflow/active/${session_id}/.process/context-package.json
## User Intent (from Phase 1 - Planning Notes)
**GOAL**: ${userIntent.goal}
**KEY_CONSTRAINTS**: ${userIntent.key_constraints}
This is the PRIMARY context source - all subsequent analysis must align with user intent.
## Exploration Input (from Step 2)
- **Manifest**: ${sessionFolder}/explorations-manifest.json
- **Exploration Count**: ${explorationManifest.exploration_count}
- **Angles**: ${explorationManifest.angles_explored.join(', ')}
- **Complexity**: ${complexity}
## Mission
Execute complete context-search-agent workflow for implementation planning:
### Phase 1: Initialization & Pre-Analysis
1. **Project State Loading**:
- Read and parse \`.workflow/project-tech.json\`. Use its \`overview\` section as the foundational \`project_context\`. This is your primary source for architecture, tech stack, and key components.
- Read and parse \`.workflow/project-guidelines.json\`. Load \`conventions\`, \`constraints\`, and \`learnings\` into a \`project_guidelines\` section.
- If files don't exist, proceed with fresh analysis.
2. **Detection**: Check for existing context-package (early exit if valid)
3. **Foundation**: Initialize CodexLens, get project structure, load docs
4. **Analysis**: Extract keywords, determine scope, classify complexity based on task description and project state
### Phase 2: Multi-Source Context Discovery
Execute all discovery tracks (WITH USER INTENT INTEGRATION):
- **Track -1**: User Intent & Priority Foundation (EXECUTE FIRST)
- Load user intent (GOAL, KEY_CONSTRAINTS) from session input
- Map user requirements to codebase entities (files, modules, patterns)
- Establish baseline priority scores based on user goal alignment
- Output: user_intent_mapping.json with preliminary priority scores
- **Track 0**: Exploration Synthesis (load ${sessionFolder}/explorations-manifest.json, prioritize critical_files, deduplicate patterns/integration_points)
- **Track 1**: Historical archive analysis (query manifest.json for lessons learned)
- **Track 2**: Reference documentation (CLAUDE.md, architecture docs)
- **Track 3**: Web examples (use Exa MCP for unfamiliar tech/APIs)
- **Track 4**: Codebase analysis (5-layer discovery: files, content, patterns, deps, config/tests)
### Phase 3: Synthesis, Assessment & Packaging
1. Apply relevance scoring and build dependency graph
2. **Synthesize 6-source data** (including Track -1): Merge findings from all sources
- Priority order: User Intent > Archive > Docs > Exploration > Code > Web
- **Prioritize the context from \`project-tech.json\`** for architecture and tech stack unless code analysis reveals it's outdated
3. **Context Priority Sorting**:
a. Combine scores from Track -1 (user intent alignment) + relevance scores + exploration critical_files
b. Classify files into priority tiers:
- **Critical** (score >= 0.85): Directly mentioned in user goal OR exploration critical_files
- **High** (0.70-0.84): Key dependencies, patterns required for goal
- **Medium** (0.50-0.69): Supporting files, indirect dependencies
- **Low** (< 0.50): Contextual awareness only
c. Generate dependency_order: Based on dependency graph + user goal sequence
d. Document sorting_rationale: Explain prioritization logic
4. **Populate \`project_context\`**: Directly use the \`overview\` from \`project-tech.json\` to fill the \`project_context\` section. Include description, technology_stack, architecture, and key_components.
5. **Populate \`project_guidelines\`**: Load conventions, constraints, and learnings from \`project-guidelines.json\` into a dedicated section.
6. Integrate brainstorm artifacts (if .brainstorming/ exists, read content)
7. Perform conflict detection with risk assessment
8. **Inject historical conflicts** from archive analysis into conflict_detection
9. **Generate prioritized_context section**:
\`\`\`json
{
"prioritized_context": {
"user_intent": {
"goal": "...",
"scope": "...",
"key_constraints": ["..."]
},
"priority_tiers": {
"critical": [{ "path": "...", "relevance": 0.95, "rationale": "..." }],
"high": [...],
"medium": [...],
"low": [...]
},
"dependency_order": ["module1", "module2", "module3"],
"sorting_rationale": "Based on user goal alignment (Track -1), exploration critical files, and dependency graph analysis"
}
}
\`\`\`
10. Generate and validate context-package.json with prioritized_context field
## Output Requirements
Complete context-package.json with:
- **metadata**: task_description, keywords, complexity, tech_stack, session_id
- **project_context**: description, technology_stack, architecture, key_components (sourced from \`project-tech.json\`)
- **project_guidelines**: {conventions, constraints, quality_rules, learnings} (sourced from \`project-guidelines.json\`)
- **assets**: {documentation[], source_code[], config[], tests[]} with relevance scores
- **dependencies**: {internal[], external[]} with dependency graph
- **brainstorm_artifacts**: {guidance_specification, role_analyses[], synthesis_output} with content
- **conflict_detection**: {risk_level, risk_factors, affected_modules[], mitigation_strategy, historical_conflicts[]}
- **exploration_results**: {manifest_path, exploration_count, angles, explorations[], aggregated_insights} (from Track 0)
- **prioritized_context**: {user_intent, priority_tiers{critical, high, medium, low}, dependency_order[], sorting_rationale}
## Quality Validation
Before completion verify:
- [ ] Valid JSON format with all required fields
- [ ] File relevance accuracy >80%
- [ ] Dependency graph complete (max 2 transitive levels)
- [ ] Conflict risk level calculated correctly
- [ ] No sensitive data exposed
- [ ] Total files <=50 (prioritize high-relevance)
## Planning Notes Record (REQUIRED)
After completing context-package.json, append a brief execution record to planning-notes.md:
**File**: .workflow/active/${session_id}/planning-notes.md
**Location**: Under "## Context Findings (Phase 2)" section
**Format**:
\`\`\`
### [Context-Search Agent] YYYY-MM-DD
- **Note**: [brief summary of key findings]
\`\`\`
Execute autonomously following agent documentation.
Report completion with statistics.
`
});
// Wait for context agent to complete
const contextResult = wait({
ids: [contextAgentId],
timeout_ms: 900000 // 15 minutes
});
// Close context agent
close_agent({ id: contextAgentId });
```
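The priority tiers described in the agent prompt (Phase 3, Context Priority Sorting) can be sketched as a score-to-tier mapping. This is an illustration under assumptions: `priorityTier` and `buildPriorityTiers` are hypothetical names, the sketch classifies by score only (it does not model the "directly mentioned in user goal" rule), and it treats the ranges as half-open intervals so that a score such as 0.845 falls into `high`.

```javascript
// Map a relevance score in [0, 1] to the tier names used in prioritized_context.
function priorityTier(score) {
  if (score >= 0.85) return "critical";
  if (score >= 0.7) return "high";
  if (score >= 0.5) return "medium";
  return "low";
}

// Group files by tier, highest relevance first inside each tier.
function buildPriorityTiers(files) {
  const tiers = { critical: [], high: [], medium: [], low: [] };
  for (const file of [...files].sort((a, b) => b.relevance - a.relevance)) {
    tiers[priorityTier(file.relevance)].push(file);
  }
  return tiers;
}

// File paths below are made-up sample data.
const tiers = buildPriorityTiers([
  { path: "src/auth/token.ts", relevance: 0.95, rationale: "Core auth logic" },
  { path: "src/db/user.ts", relevance: 0.72, rationale: "Key dependency" },
  { path: "README.md", relevance: 0.3, rationale: "Context only" },
]);
console.log(Object.fromEntries(Object.entries(tiers).map(([k, v]) => [k, v.length])));
```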
### Step 4: Output Verification
After agent completes, verify output:
```javascript
// Verify file was created (same location as the standardized output path)
const outputPath = `.workflow/active/${session_id}/.process/context-package.json`;
if (!file_exists(outputPath)) {
  throw new Error("Agent failed to generate context-package.json");
}
// Verify exploration_results included
const pkg = JSON.parse(Read(outputPath));
if (pkg.exploration_results?.exploration_count > 0) {
  console.log(`Exploration results aggregated: ${pkg.exploration_results.exploration_count} angles`);
}
```
## Parameter Reference
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `--session` | string | Yes | Workflow session ID (e.g., WFS-user-auth) |
| `task_description` | string | Yes | Detailed task description for context extraction |
## Post-Phase Update
After context-gather completes, update planning-notes.md:
```javascript
const contextPackage = JSON.parse(Read(contextPath))
const conflictRisk = contextPackage.conflict_detection?.risk_level || 'low'
const criticalFiles = (contextPackage.exploration_results?.aggregated_insights?.critical_files || [])
.slice(0, 5).map(f => f.path)
const archPatterns = contextPackage.project_context?.architecture_patterns || []
const constraints = contextPackage.exploration_results?.aggregated_insights?.constraints || []
// Update Phase 2 section
Edit(planningNotesPath, {
old: '## Context Findings (Phase 2)\n(To be filled by context-gather)',
new: `## Context Findings (Phase 2)
- **CRITICAL_FILES**: ${criticalFiles.join(', ') || 'None identified'}
- **ARCHITECTURE**: ${archPatterns.join(', ') || 'Not detected'}
- **CONFLICT_RISK**: ${conflictRisk}
- **CONSTRAINTS**: ${constraints.length > 0 ? constraints.join('; ') : 'None'}`
})
// Append Phase 2 constraints to consolidated list
Edit(planningNotesPath, {
old: '## Consolidated Constraints (Phase 4 Input)',
new: `## Consolidated Constraints (Phase 4 Input)
${constraints.map((c, i) => `${i + 2}. [Context] ${c}`).join('\n')}`
})
```
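Assuming `Edit` performs a literal replacement of `old` with `new`, the second call above places the Phase 2 items directly under the heading, ahead of the existing item 1, and its `i + 2` numbering assumes exactly one prior constraint. A hedged alternative sketch appends after the existing numbered list and continues its numbering; `appendConstraints` is a hypothetical helper operating on the notes text.

```javascript
// Hypothetical helper: append constraints after the numbered items that
// directly follow the "Consolidated Constraints" heading, continuing the count.
function appendConstraints(notes, constraints, label = "Context") {
  const heading = "## Consolidated Constraints (Phase 4 Input)";
  const lines = notes.split("\n");
  const start = lines.indexOf(heading);
  if (start === -1 || constraints.length === 0) return notes;

  // Walk past the existing numbered items, remembering the last number seen.
  let end = start + 1;
  let lastNumber = 0;
  while (end < lines.length && /^\d+\.\s/.test(lines[end])) {
    lastNumber = parseInt(lines[end], 10);
    end += 1;
  }
  const added = constraints.map(
    (c, i) => `${lastNumber + i + 1}. [${label}] ${c}`
  );
  lines.splice(end, 0, ...added);
  return lines.join("\n");
}

const updatedNotes = appendConstraints(
  "## Consolidated Constraints (Phase 4 Input)\n1. Keep REST API stable\n---\n## Task Generation (Phase 4)",
  ["No new dependencies", "Reuse existing auth middleware"]
);
console.log(updatedNotes);
```

The same helper could serve Phase 3 by passing a different `label`, so each phase adds to the end of the list.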
## Notes
- **Detection-first**: Always check for existing package before invoking agent
- **User intent integration**: Load user intent from planning-notes.md (Phase 1 output)
- **Output**: Generates `context-package.json` with `prioritized_context` field
- **Plan-specific**: Use this for implementation planning; brainstorm mode uses direct agent call
- **Explicit Lifecycle**: Always close_agent after wait to free resources
- **Batch Wait**: Use single wait call for multiple parallel agents for efficiency
## Output
- **Variable**: `contextPath` (e.g., `.workflow/active/WFS-xxx/.process/context-package.json`)
- **Variable**: `conflictRisk` (none/low/medium/high)
- **File**: Updated `planning-notes.md` with context findings
- **Decision**: If `conflictRisk >= medium` → Phase 3, else → Phase 4
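Because `conflictRisk` is a string, the `>= medium` comparison needs an explicit ordering: a plain string comparison would sort `high` before `medium` alphabetically. A minimal sketch, assuming the four levels listed above; `nextPhase` and its return labels are hypothetical.

```javascript
// Risk levels in ascending order, as listed for the conflictRisk variable.
const RISK_LEVELS = ["none", "low", "medium", "high"];

// Hypothetical router: Phase 3 for medium or high risk, Phase 4 otherwise.
function nextPhase(conflictRisk) {
  const rank = RISK_LEVELS.indexOf(String(conflictRisk).toLowerCase());
  if (rank === -1) throw new Error(`Unknown conflict risk: ${conflictRisk}`);
  return rank >= RISK_LEVELS.indexOf("medium")
    ? "phase-3-conflict-resolution"
    : "phase-4-task-generation";
}

for (const level of RISK_LEVELS) {
  console.log(level, "->", nextPhase(level));
}
```

Throwing on an unknown level is a design choice of this sketch; falling back to `low`, as the Post-Phase Update does for a missing value, would be an equally reasonable option.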
## Next Phase
Return to orchestrator showing Phase 2 results, then auto-continue:
- If `conflictRisk` is `medium` or `high` → [Phase 3: Conflict Resolution](03-conflict-resolution.md)
- If `conflictRisk` is `none` or `low` → [Phase 4: Task Generation](04-task-generation.md)