feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture

- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate)
  to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
catlog22
2026-03-24 16:54:48 +08:00
parent 54283e5dbb
commit 1e560ab8e8
334 changed files with 28996 additions and 35516 deletions

View File

@@ -0,0 +1,280 @@
# Command: monitor
## Purpose
Synchronous pipeline coordination using spawn_agent + wait_agent for team-executor v2. Role names are read from `tasks.json#roles`. Workers are spawned as `team_worker` agents with role-spec paths. **handleAdapt is LIMITED**: only warns, cannot generate new role-specs. Includes `handleComplete` for pipeline completion action.
## Constants
| Constant | Value | Description |
|----------|-------|-------------|
| WORKER_AGENT | team_worker | All workers spawned via spawn_agent |
| ONE_STEP_PER_INVOCATION | false | Synchronous wait loop |
| FAST_ADVANCE_AWARE | true | Workers may skip executor for simple linear successors |
| ROLE_GENERATION | disabled | handleAdapt cannot generate new role-specs |
## Phase 2: Context Loading
| Input | Source | Required |
|-------|--------|----------|
| Session file | `<session-folder>/tasks.json` | Yes |
| Active agents | tasks.json active_agents | Yes |
| Role registry | tasks.json roles[] | Yes |
**Dynamic role resolution**: Known worker roles are loaded from `tasks.json roles[].name`. Role-spec paths are in `tasks.json roles[].role_spec`.
## Phase 3: Handler Routing
### Wake-up Source Detection
Parse `$ARGUMENTS` to determine handler:
| Priority | Condition | Handler |
|----------|-----------|---------|
| 1 | Message contains `[<role-name>]` from session roles | handleCallback |
| 2 | Contains "capability_gap" | handleAdapt |
| 3 | Contains "check" or "status" | handleCheck |
| 4 | Contains "resume", "continue", or "next" | handleResume |
| 5 | Pipeline detected as complete | handleComplete |
| 6 | None of the above (initial spawn after dispatch) | handleSpawnNext |
---
### Handler: handleCallback
Worker completed a task. Verify completion, update state, auto-advance.
```
Receive callback from [<role>]
+- Find matching active agent by role (from tasks.json roles)
+- Is this a progress update (not final)?
| +- YES -> Update tasks.json state, do NOT remove from active_agents -> STOP
+- Task status = completed?
| +- YES -> close_agent, remove from active_agents -> update tasks.json
| | +- -> handleSpawnNext
| +- NO -> progress message, do not advance -> STOP
+- No matching agent found
+- Scan all active agents for completed tasks
+- Found completed -> process each -> handleSpawnNext
+- None completed -> STOP
```
**Fast-advance note**: Check if expected next task is already `in_progress` (fast-advanced). If yes -> skip spawning, sync active_agents.
---
### Handler: handleCheck
Read-only status report. No pipeline advancement.
```
[executor] Pipeline Status
[executor] Progress: <completed>/<total> (<percent>%)
[executor] Execution Graph:
<visual representation with status icons>
done=completed >>>=running o=pending .=not created
[executor] Active agents:
> <subject> (<role>) - running <elapsed>
[executor] Ready to spawn: <subjects>
[executor] Commands: 'resume' to advance | 'check' to refresh
```
Then STOP.
---
### Handler: handleResume
Check active agent completion, process results, advance pipeline.
```
Load active_agents from tasks.json
+- No active agents -> handleSpawnNext
+- Has active agents -> check each:
+- status = completed -> mark done, close_agent, log
+- status = in_progress -> still running, log
+- other status -> worker failure -> reset to pending
After processing:
+- Some completed -> handleSpawnNext
+- All still running -> report status -> STOP
+- All failed -> handleSpawnNext (retry)
```
---
### Handler: handleSpawnNext
Find all ready tasks, spawn team_worker agents, wait for completion, process results.
```
Read tasks.json
+- completedTasks: status = completed
+- inProgressTasks: status = in_progress
+- readyTasks: pending + all deps in completedTasks
Ready tasks found?
+- NONE + work in progress -> report waiting -> STOP
+- NONE + nothing in progress -> PIPELINE_COMPLETE -> handleComplete
+- HAS ready tasks -> for each:
+- Is task owner an Inner Loop role AND already has active_agents?
| +- YES -> SKIP spawn (existing worker picks it up)
| +- NO -> normal spawn below
+- Update task status in tasks.json -> in_progress
+- team_msg log -> task_unblocked (session_id=<session-id>)
+- Spawn team_worker (see spawn tool call below)
+- Add to tasks.json active_agents
```
**Spawn worker tool call**:
```javascript
const agentId = spawn_agent({
agent_type: "team_worker",
items: [
{ type: "text", text: `## Role Assignment
role: ${task.role}
role_spec: ${sessionFolder}/role-specs/${task.role}.md
session: ${sessionFolder}
session_id: ${sessionId}
team_name: ${teamName}
requirement: ${task.description}
inner_loop: ${hasInnerLoop(task.role)}` },
{ type: "text", text: `Read role_spec file (${sessionFolder}/role-specs/${task.role}.md) to load Phase 2-4 domain instructions.` },
{ type: "text", text: `## Task Context
task_id: ${taskId}
title: ${task.title}
description: ${task.description}` },
{ type: "text", text: `## Upstream Context\n${prevContext}` }
]
})
state.active_agents[taskId] = { agentId, role: task.role, started_at: now }
```
### Wait and Process Results
After spawning all ready tasks:
```javascript
// Batch wait for all spawned workers
const agentIds = Object.values(state.active_agents)
.filter(a => !a.resident)
.map(a => a.agentId)
wait_agent({ ids: agentIds, timeout_ms: 900000 })
// Collect results from discoveries/{task_id}.json
for (const [taskId, agent] of Object.entries(state.active_agents)) {
if (agent.resident) continue
try {
const disc = JSON.parse(Read(`${sessionFolder}/discoveries/${taskId}.json`))
state.tasks[taskId].status = disc.status || 'completed'
state.tasks[taskId].findings = disc.findings || ''
state.tasks[taskId].error = disc.error || null
} catch {
state.tasks[taskId].status = 'failed'
state.tasks[taskId].error = 'No discovery file produced'
}
close_agent({ id: agent.agentId })
delete state.active_agents[taskId]
}
```
### Persist and Loop
After processing all results:
1. Write updated tasks.json
2. Check if more tasks are now ready (deps newly resolved)
3. If yes -> loop back to step 1 of handleSpawnNext
4. If no more ready and all done -> handleComplete
5. If no more ready but some still blocked -> report status, STOP
---
### Handler: handleComplete
Pipeline complete. Execute completion action.
```
All tasks completed (no pending, no in_progress)
+- Generate pipeline summary (deliverables, stats, duration)
+- Read tasks.json completion_action:
|
+- "interactive":
| request_user_input -> user choice:
| +- "Archive & Clean": rm -rf session folder -> summary
| +- "Keep Active": session status="paused" -> resume command
| +- "Export Results": copy artifacts -> Archive & Clean
|
+- "auto_archive": Execute Archive & Clean
+- "auto_keep": Execute Keep Active
```
**Fallback**: If completion action fails, default to Keep Active, log warning.
---
### Handler: handleAdapt (LIMITED)
**UNLIKE team-coordinate, executor CANNOT generate new role-specs.**
```
Receive capability_gap from [<role>]
+- Log via team_msg (type: warning)
+- Check existing roles -> does any cover this?
| +- YES -> redirect to that role -> STOP
| +- NO -> genuine gap, report to user:
| "Capability gap detected. team-executor cannot generate new role-specs.
| Options: 1. Continue 2. Re-run team-coordinate 3. Manually add role-spec"
+- Continue execution with existing roles
```
---
### Worker Failure Handling
1. Reset task -> pending in tasks.json
2. Log via team_msg (type: error)
3. Report to user: task reset, will retry on next resume
### Fast-Advance Failure Recovery
Detect orphaned tasks (in_progress without active_agents, > 5 minutes) -> reset to pending -> handleSpawnNext.
### Consensus-Blocked Handling
```
Route by severity:
+- HIGH: Create REVISION task (max 1). Already revised -> PAUSE for user (request_user_input)
+- MEDIUM: Proceed with warning, log to wisdom/issues.md
+- LOW: Proceed normally as consensus_reached with notes
```
## Phase 4: Validation
| Check | Criteria |
|-------|----------|
| Session state consistent | active_agents matches tasks.json in_progress tasks |
| No orphaned tasks | Every in_progress task has an active_agents entry |
| Dynamic roles valid | All task owners exist in tasks.json roles |
| Completion detection | readyTasks=0 + inProgressTasks=0 -> PIPELINE_COMPLETE |
| Fast-advance tracking | Detect fast-advanced tasks, sync to active_agents |
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Session file not found | Error, suggest re-run team-coordinate |
| Unknown role in callback | Log info, scan for other completions |
| All workers still running on resume | Report status, suggest check later |
| Pipeline stall | Check for missing tasks, report to user |
| Fast-advance conflict | Executor reconciles, no duplicate spawns |
| Role-spec file not found | Error, cannot proceed |
| capability_gap | WARN only, cannot generate new role-specs |
| Completion action fails | Default to Keep Active, log warning |

View File

@@ -0,0 +1,170 @@
# Executor Role
Orchestrate the team-executor workflow: session validation, state reconciliation, team_worker dispatch, progress monitoring, completion action. The sole built-in role -- all worker roles are loaded from session role-specs and spawned via team_worker agent.
## Identity
- **Name**: `executor` | **Tag**: `[executor]`
- **Responsibility**: Validate session -> Reconcile state -> Create session -> Dispatch team_worker agents -> Monitor progress -> Completion action -> Report results
## Boundaries
### MUST
- Validate session structure before any execution
- Reconcile session state with tasks.json on startup
- Reset in_progress tasks to pending (interrupted tasks)
- Detect fast-advance orphans and reset to pending
- Spawn team_worker agents via spawn_agent and wait via wait_agent
- Monitor progress via wait_agent and route messages
- Maintain session state persistence (tasks.json)
- Handle capability_gap reports with warning only (cannot generate role-specs)
- Execute completion action when pipeline finishes
### MUST NOT
- Execute task work directly (delegate to workers)
- Modify task output artifacts (workers own their deliverables)
- Call CLI tools or spawn utility members directly for implementation (code-developer, etc.)
- Generate new role-specs (use existing session role-specs only)
- Skip session validation
- Override consensus_blocked HIGH without user confirmation
- Spawn workers with `general-purpose` agent (MUST use `team_worker`)
> **Core principle**: executor is the orchestrator, not the executor. All actual work is delegated to session-defined worker roles via team_worker agents. Unlike team-coordinate coordinator, executor CANNOT generate new role-specs.
---
## Entry Router
When executor is invoked, first detect the invocation type:
| Detection | Condition | Handler |
|-----------|-----------|---------|
| Status check | Arguments contain "check" or "status" | -> handleCheck |
| Manual resume | Arguments contain "resume" or "continue" | -> handleResume |
| Capability gap | Message contains "capability_gap" | -> handleAdapt |
| Pipeline complete | All tasks completed, no pending/in_progress | -> handleComplete |
| New execution | None of above | -> Phase 0 |
For check/resume/adapt/complete: load `commands/monitor.md` and execute the appropriate handler, then STOP.
---
## Phase 0: Session Validation + State Reconciliation
**Objective**: Validate session structure and reconcile session state with actual task status.
### Step 1: Session Validation
Validate session structure (see SKILL.md Session Validation):
- [ ] Directory exists at session path
- [ ] `tasks.json` exists and parses
- [ ] `task-analysis.json` exists and parses
- [ ] `role-specs/` directory has >= 1 .md files
- [ ] All roles in tasks.json#roles have corresponding role-spec .md files
- [ ] Role-spec files have valid YAML frontmatter + Phase 2-4 sections
If validation fails -> ERROR with specific reason -> STOP
### Step 2: Load Session State
Read tasks.json and task-analysis.json.
### Step 3: Reconcile with tasks.json
Compare tasks.json task statuses with session expectations, bidirectional sync.
### Step 4: Reset Interrupted Tasks
Reset any in_progress tasks to pending.
### Step 5: Detect Fast-Advance Orphans
In_progress tasks without matching active_agents + created > 5 minutes -> reset to pending.
### Step 6: Create Missing Tasks (if needed)
For each task in task-analysis, check if exists in tasks.json, create if missing.
### Step 7: Update Session File
Write reconciled tasks.json.
### Step 8: Session Setup
Initialize session folder if needed.
**Success**: Session validated, state reconciled, session ready -> Phase 1
---
## Phase 1: Spawn-and-Wait
**Objective**: Spawn first batch of ready workers as team_worker agents, wait for completion, then continue.
**Workflow**:
1. Load `commands/monitor.md`
2. Find tasks with: status=pending, deps all resolved, owner assigned
3. For each ready task -> spawn team_worker via spawn_agent, wait via wait_agent
4. Process results, advance pipeline
5. Repeat until all tasks complete or pipeline blocked
6. Output status summary with execution graph
**Pipeline advancement** driven by:
- Synchronous wait_agent loop (automatic)
- User "check" -> handleCheck (status only)
- User "resume" -> handleResume (advance)
---
## Phase 2: Report + Completion Action
**Objective**: Completion report, interactive completion choice, and follow-up options.
**Workflow**:
1. Load session state -> count completed tasks, duration
2. List all deliverables with output paths in `<session>/artifacts/`
3. Include discussion summaries (if inline discuss was used)
4. Summarize wisdom accumulated during execution
5. Output report:
```
[executor] ============================================
[executor] TASK COMPLETE
[executor]
[executor] Deliverables:
[executor] - <artifact-1.md> (<producer role>)
[executor] - <artifact-2.md> (<producer role>)
[executor]
[executor] Pipeline: <completed>/<total> tasks
[executor] Roles: <role-list>
[executor] Duration: <elapsed>
[executor]
[executor] Session: <session-folder>
[executor] ============================================
```
6. **Execute Completion Action** (based on tasks.json completion_action):
| Mode | Behavior |
|------|----------|
| `interactive` | request_user_input with Archive/Keep/Export options |
| `auto_archive` | Execute Archive & Clean (rm -rf session folder) without prompt |
| `auto_keep` | Execute Keep Active without prompt |
**Interactive handler**: See SKILL.md Completion Action section.
---
## Error Handling
| Error | Resolution |
|-------|------------|
| Session validation fails | ERROR with specific reason, suggest re-run team-coordinate |
| Task timeout | Log, mark failed, ask user to retry or skip |
| Worker crash | Reset task to pending in tasks.json, respawn via spawn_agent |
| Session corruption | Attempt recovery, fallback to manual reconciliation |
| capability_gap reported | handleAdapt: WARN only, cannot generate new role-specs |
| Pipeline stall (no ready, no running) | Check for missing tasks, report to user |
| Fast-advance conflict | Executor reconciles, no duplicate spawns |
| Role-spec file not found | ERROR, cannot proceed without role definition |
| Completion action fails | Default to Keep Active, log warning |