feat: migrate all codex team skills from spawn_agents_on_csv to spawn_agent + wait_agent architecture

- Delete 21 old team skill directories using CSV-wave pipeline pattern (~100+ files)
- Delete old team-lifecycle (v3) and team-planex-v2
- Create generic team-worker.toml and team-supervisor.toml (replacing tlv4-specific TOMLs)
- Convert 19 team skills from Claude Code format (Agent/SendMessage/TaskCreate)
  to Codex format (spawn_agent/wait_agent/tasks.json/request_user_input)
- Update team-lifecycle-v4 to use generic agent types (team_worker/team_supervisor)
- Convert all coordinator role files: dispatch.md, monitor.md, role.md
- Convert all worker role files: remove run_in_background, fix Bash syntax
- Convert all specs/pipelines.md references
- Final state: 20 team skills, 217 .md files, zero Claude Code API residuals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
catlog22
2026-03-24 16:54:48 +08:00
parent 54283e5dbb
commit 1e560ab8e8
334 changed files with 28996 additions and 35516 deletions

View File

@@ -0,0 +1,111 @@
# Knowledge Transfer Protocols
## 1. Transfer Channels
| Channel | Scope | Mechanism | When to Use |
|---------|-------|-----------|-------------|
| **Artifacts** | Producer -> Consumer | Write to `<session>/artifacts/<name>.md`, consumer reads in Phase 2 | Structured deliverables (reports, plans, specs) |
| **State Updates** | Cross-role | `team_msg(operation="log", type="state_update", data={...})` / `team_msg(operation="get_state", session_id=<session-id>)` | Key findings, decisions, metadata (small, structured data) |
| **Wisdom** | Cross-task | Append to `<session>/wisdom/{learnings,decisions,conventions,issues}.md` | Patterns, conventions, risks discovered during execution |
| **Context Accumulator** | Intra-role (inner loop) | In-memory array, passed to each subsequent task in same-prefix loop | Prior task summaries within same role's inner loop |
| **Exploration Cache** | Cross-role | `<session>/explorations/cache-index.json` + per-angle JSON | Codebase discovery results, prevents duplicate exploration |
## 2. Context Loading Protocol (Phase 2)
Every role MUST load context in this order before starting work.
| Step | Action | Required |
|------|--------|----------|
| 1 | Extract session path from task description | Yes |
| 2 | `team_msg(operation="get_state", session_id=<session-id>)` | Yes |
| 3 | Read artifact files from upstream state's `ref` paths | Yes |
| 4 | Read `<session>/wisdom/*.md` if exists | Yes |
| 5 | Check `<session>/explorations/cache-index.json` before new exploration | If exploring |
| 6 | For inner_loop roles: load context_accumulator from prior tasks | If inner_loop |
**Loading rules**:
- Never skip step 2 -- state contains key decisions and findings
- If `ref` path in state does not exist, log warning and continue
- Wisdom files are append-only -- read all entries, newest last
## 3. Context Publishing Protocol (Phase 4)
| Step | Action | Required |
|------|--------|----------|
| 1 | Write deliverable to `<session>/artifacts/<task-id>-<name>.md` | Yes |
| 2 | Send `team_msg(type="state_update")` with payload (see schema below) | Yes |
| 3 | Append wisdom entries for learnings, decisions, issues found | If applicable |
## 4. State Update Schema
Sent via `team_msg(type="state_update")` on task completion.
```json
{
"status": "task_complete",
"task_id": "<TASK-NNN>",
"ref": "<session>/artifacts/<filename>",
"key_findings": [
"Finding 1",
"Finding 2"
],
"decisions": [
"Decision with rationale"
],
"files_modified": [
"path/to/file.ts"
],
"verification": "self-validated | peer-reviewed | tested"
}
```
**Field rules**:
- `ref`: Always an artifact path, never inline content
- `key_findings`: Max 5 items, each under 100 chars
- `decisions`: Include rationale, not just the choice
- `files_modified`: Only for implementation tasks
- `verification`: One of `self-validated`, `peer-reviewed`, `tested`
**Write state** (namespaced by role):
```
team_msg(operation="log", session_id=<session-id>, from=<role>, type="state_update", data={
"<role_name>": { "key_findings": [...], "scope": "..." }
})
```
**Read state**:
```
team_msg(operation="get_state", session_id=<session-id>)
// Returns merged state from all state_update messages
```
## 5. Exploration Cache Protocol
Prevents redundant research across tasks and discussion rounds.
| Step | Action |
|------|--------|
| 1 | Read `<session>/explorations/cache-index.json` |
| 2 | If angle already explored, read cached result from `explore-<angle>.json` |
| 3 | If not cached, perform exploration |
| 4 | Write result to `<session>/explorations/explore-<angle>.json` |
| 5 | Update `cache-index.json` with new entry |
**cache-index.json format**:
```json
{
"entries": [
{
"angle": "competitor-analysis",
"file": "explore-competitor-analysis.json",
"created_by": "RESEARCH-001",
"timestamp": "2026-01-15T10:30:00Z"
}
]
}
```
**Rules**:
- Cache key is the exploration `angle` (normalized to kebab-case)
- Cache entries never expire within a session
- Any role can read cached explorations; only the creator updates them

View File

@@ -0,0 +1,97 @@
# Pipeline Definitions — Team Coordinate
## Dynamic Pipeline Model
team-coordinate does NOT have a static pipeline. All pipelines are generated at runtime from task-analysis.json based on the user's task description.
## Pipeline Generation Process
```
Phase 1: analyze-task.md
-> Signal detection -> capability mapping -> dependency graph
-> Output: task-analysis.json
Phase 2: dispatch.md
-> Read task-analysis.json dependency graph
-> Create tasks.json entries per dependency node
-> Set deps chains from graph edges
-> Output: tasks.json with correct DAG
Phase 3-N: monitor.md
-> handleSpawnNext: spawn ready tasks as team-worker agents
-> handleCallback: mark completed, advance pipeline
-> Repeat until all tasks done
```
## Dynamic Task Naming
| Capability | Prefix | Example |
|------------|--------|---------|
| researcher | RESEARCH | RESEARCH-001 |
| developer | IMPL | IMPL-001 |
| analyst | ANALYSIS | ANALYSIS-001 |
| designer | DESIGN | DESIGN-001 |
| tester | TEST | TEST-001 |
| writer | DRAFT | DRAFT-001 |
| planner | PLAN | PLAN-001 |
| (default) | TASK | TASK-001 |
## Dependency Graph Structure
task-analysis.json encodes the pipeline:
```json
{
"dependency_graph": {
"RESEARCH-001": { "role": "researcher", "blockedBy": [], "priority": "P0" },
"IMPL-001": { "role": "developer", "blockedBy": ["RESEARCH-001"], "priority": "P1" },
"TEST-001": { "role": "tester", "blockedBy": ["IMPL-001"], "priority": "P2" }
}
}
```
## Role-Worker Map
Dynamic — loaded from session role-specs at runtime:
```
<session>/role-specs/<role-name>.md -> team-worker agent
```
Role-spec files contain YAML frontmatter:
```yaml
---
role: <role-name>
prefix: <PREFIX>
inner_loop: <true|false>
message_types:
success: <type>
error: error
---
```
## Checkpoint
| Trigger | Behavior |
|---------|----------|
| capability_gap reported | handleAdapt: generate new role-spec, spawn new worker |
| consensus_blocked HIGH | Create REVISION task or pause for user |
| All tasks complete | handleComplete: interactive completion action |
## Specs Reference
- [role-spec-template.md](role-spec-template.md) — Template for generating dynamic role-specs
- [quality-gates.md](quality-gates.md) — Quality thresholds and scoring dimensions
- [knowledge-transfer.md](knowledge-transfer.md) — Context transfer protocols between roles
## Quality Gate Integration
Dynamic pipelines reference quality thresholds from [specs/quality-gates.md](quality-gates.md).
| Gate Point | Trigger | Criteria Source |
|------------|---------|----------------|
| After artifact production | Producer role Phase 4 | Behavioral Traits in role-spec |
| After validation tasks | Tester/analyst completion | quality-gates.md thresholds |
| Pipeline completion | All tasks done | Aggregate scoring |
Issue classification: Error (blocks) > Warning (proceed with justification) > Info (log for future).

View File

@@ -0,0 +1,112 @@
# Quality Gates
## 1. Quality Thresholds
| Result | Score | Action |
|--------|-------|--------|
| Pass | >= 80% | Report completed |
| Review | 60-79% | Report completed with warnings |
| Fail | < 60% | Retry Phase 3 (max 2 retries) |
## 2. Scoring Dimensions
| Dimension | Weight | Criteria |
|-----------|--------|----------|
| Completeness | 25% | All required outputs present with substantive content |
| Consistency | 25% | Terminology, formatting, cross-references are uniform |
| Accuracy | 25% | Outputs are factually correct and verifiable against sources |
| Depth | 25% | Sufficient detail for downstream consumers to act on deliverables |
**Score** = weighted average of all dimensions (0-100 per dimension).
## 3. Dynamic Role Quality Checks
Quality checks vary by `output_type` (from task-analysis.json role metadata).
### output_type: artifact
| Check | Pass Criteria |
|-------|---------------|
| Artifact exists | File written to `<session>/artifacts/` |
| Content non-empty | Substantive content, not just headers |
| Format correct | Expected format (MD, JSON) matches deliverable |
| Cross-references | All references to upstream artifacts resolve |
### output_type: codebase
| Check | Pass Criteria |
|-------|---------------|
| Files modified | Claimed files actually changed (Read to confirm) |
| Syntax valid | No syntax errors in modified files |
| No regressions | Existing functionality preserved |
| Summary artifact | Implementation summary written to artifacts/ |
### output_type: mixed
All checks from both `artifact` and `codebase` apply.
## 4. Verification Protocol
Derived from Behavioral Traits in [role-spec-template.md](role-spec-template.md).
| Step | Action | Required |
|------|--------|----------|
| 1 | Verify all claimed files exist via Read | Yes |
| 2 | Confirm artifact written to `<session>/artifacts/` | Yes |
| 3 | Check verification summary fields present | Yes |
| 4 | Score against quality dimensions | Yes |
| 5 | Apply threshold -> Pass/Review/Fail | Yes |
**On Fail**: Retry Phase 3 (max 2 retries). After 2 retries, report `partial_completion`.
**On Review**: Proceed with warnings logged to `<session>/wisdom/issues.md`.
## 5. Code Review Dimensions
For REVIEW-* or validation tasks during implementation pipelines.
### Quality
| Check | Severity |
|-------|----------|
| Empty catch blocks | Error |
| `as any` type casts | Warning |
| `@ts-ignore` / `@ts-expect-error` | Warning |
| `console.log` in production code | Warning |
| Unused imports/variables | Info |
### Security
| Check | Severity |
|-------|----------|
| Hardcoded secrets/credentials | Error |
| SQL injection vectors | Error |
| `eval()` or `Function()` usage | Error |
| `innerHTML` assignment | Warning |
| Missing input validation | Warning |
### Architecture
| Check | Severity |
|-------|----------|
| Circular dependencies | Error |
| Deep cross-boundary imports (3+ levels) | Warning |
| Files > 500 lines | Warning |
| Functions > 50 lines | Info |
### Requirements Coverage
| Check | Severity |
|-------|----------|
| Core functionality implemented | Error if missing |
| Acceptance criteria covered | Error if missing |
| Edge cases handled | Warning |
| Error states handled | Warning |
## 6. Issue Classification
| Class | Label | Action |
|-------|-------|--------|
| Error | Must fix | Blocks progression, must resolve before proceeding |
| Warning | Should fix | Should resolve, can proceed with justification |
| Info | Nice to have | Optional improvement, log for future |

View File

@@ -0,0 +1,192 @@
# Dynamic Role-Spec Template
Template used by coordinator to generate lightweight worker role-spec files at runtime. Each generated role-spec is written to `<session>/role-specs/<role-name>.md`.
**Key difference from v1**: Role-specs contain ONLY Phase 2-4 domain logic + YAML frontmatter. All shared behavior (Phase 1 Task Discovery, Phase 5 Report/Fast-Advance, Message Bus, Consensus, Inner Loop) is built into the `team-worker` agent.
## Template
```markdown
---
role: <role_name>
prefix: <PREFIX>
inner_loop: <true|false>
CLI tools: [<CLI tool-names>]
message_types:
success: <prefix>_complete
error: error
---
# <Role Name> — Phase 2-4
## Phase 2: <phase2_name>
<phase2_content>
## Phase 3: <phase3_name>
<phase3_content>
## Phase 4: <phase4_name>
<phase4_content>
## Error Handling
| Scenario | Resolution |
|----------|------------|
<error_entries>
```
## Frontmatter Fields
| Field | Required | Description |
|-------|----------|-------------|
| `role` | Yes | Role name matching session registry |
| `prefix` | Yes | Task prefix to filter (e.g., RESEARCH, DRAFT, IMPL) |
| `inner_loop` | Yes | Whether team-worker loops through same-prefix tasks |
| `CLI tools` | No | Array of CLI tool types this role may call |
| `output_tag` | Yes | Output tag for all messages, e.g., `[researcher]` |
| `message_types` | Yes | Message type mapping for team_msg |
| `message_types.success` | Yes | Type string for successful completion |
| `message_types.error` | Yes | Type string for errors (usually "error") |
## Design Rules
| Rule | Description |
|------|-------------|
| Phase 2-4 only | No Phase 1 (Task Discovery) or Phase 5 (Report) — team-worker handles these |
| No message bus code | No team_msg calls — team-worker handles logging |
| No consensus handling | No consensus_reached/blocked logic — team-worker handles routing |
| No inner loop logic | No Phase 5-L/5-F — team-worker handles looping |
| ~80 lines target | Lightweight, domain-focused |
| No pseudocode | Decision tables + text + tool calls only |
| `<placeholder>` notation | Use angle brackets for variable substitution |
| Reference CLI tools by name | team-worker resolves invocation from its delegation templates |
## Generated Role-Spec Structure
Every generated role-spec MUST include these blocks:
### Identity Block (mandatory — first section of generated spec)
```
Tag: [<role_name>] | Prefix: <PREFIX>-*
Responsibility: <one-line from task analysis>
```
### Boundaries Block (mandatory — after Identity)
```
### MUST
- <3-5 rules derived from task analysis>
### MUST NOT
- Execute work outside assigned prefix
- Modify artifacts from other roles
- Skip Phase 4 verification
```
## Behavioral Traits
All dynamically generated role-specs MUST embed these traits into Phase 4. Coordinator copies this section verbatim into every generated role-spec as a Phase 4 appendix.
**Design principle**: Constrain behavioral characteristics (accuracy, feedback, quality gates), NOT specific actions (which tool, which CLI tool, which path). Tasks are diverse — the coordinator composes task-specific Phase 2-3 instructions, while these traits ensure execution quality regardless of task type.
### Accuracy — outputs must be verifiable
- Files claimed as **created** → Read to confirm file exists and has content
- Files claimed as **modified** → Read to confirm content actually changed
- Analysis claimed as **complete** → artifact file exists in `<session>/artifacts/`
### Feedback Contract — completion report must include evidence
Phase 4 must produce a verification summary with these fields:
| Field | When Required | Content |
|-------|---------------|---------|
| `files_produced` | New files created | Path list |
| `files_modified` | Existing files changed | Path + before/after line count |
| `artifacts_written` | Always | Paths in `<session>/artifacts/` |
| `verification_method` | Always | How verified: Read confirm / syntax check / diff |
### Quality Gate — verify before reporting complete
- Phase 4 MUST verify Phase 3's **actual output** (not planned output)
- Verification fails → retry Phase 3 (max 2 retries)
- Still fails → report `partial_completion` with details, NOT `completed`
- Update shared state via `team_msg(operation="log", type="state_update", data={...})` after verification passes
Quality thresholds from [specs/quality-gates.md](quality-gates.md):
- Pass >= 80%: report completed
- Review 60-79%: report completed with warnings
- Fail < 60%: retry Phase 3 (max 2)
### Error Protocol
- Primary approach fails → try alternative (different CLI tool / different tool)
- 2 retries exhausted → escalate to coordinator with failure details
- NEVER: skip verification and report completed
---
## Reference Patterns
Coordinator MAY reference these patterns when composing Phase 2-4 content for a role-spec. These are **structural guidance, not mandatory templates**. The task description determines specific behavior — patterns only suggest common phase structures.
### Research / Exploration
- Phase 2: Define exploration scope + load prior knowledge from shared state and wisdom
- Phase 3: Explore via CLI tools, direct tool calls, or codebase search — approach chosen by agent
- Phase 4: Verify findings documented (Behavioral Traits) + update shared state
### Document / Content
- Phase 2: Load upstream artifacts + read target files (if modifying existing docs)
- Phase 3: Create new documents OR modify existing documents — determined by task, not template
- Phase 4: Verify documents exist with expected content (Behavioral Traits) + update shared state
### Code Implementation
- Phase 2: Load design/spec artifacts from upstream
- Phase 3: Implement code changes — CLI tool choice and approach determined by task complexity
- Phase 4: Syntax check + file verification (Behavioral Traits) + update shared state
### Analysis / Audit
- Phase 2: Load analysis targets (artifacts or source files)
- Phase 3: Multi-dimension analysis — perspectives and depth determined by task
- Phase 4: Verify report exists + severity classification (Behavioral Traits) + update shared state
### Validation / Testing
- Phase 2: Detect test framework + identify changed files from upstream
- Phase 3: Run test-fix cycle — iteration count and strategy determined by task
- Phase 4: Verify pass rate + coverage (Behavioral Traits) + update shared state
---
## Knowledge Transfer Protocol
Full protocol: [specs/knowledge-transfer.md](knowledge-transfer.md)
Generated role-specs Phase 2 MUST declare which upstream sources to load.
Generated role-specs Phase 4 MUST include state update and artifact publishing.
---
## Generated Role-Spec Validation
Coordinator verifies before writing each role-spec:
| Check | Criteria |
|-------|----------|
| Frontmatter complete | All required fields present (role, prefix, inner_loop, output_tag, message_types, CLI tools) |
| Identity block | Tag, prefix, responsibility defined |
| Boundaries | MUST and MUST NOT rules present |
| Phase 2 | Context loading sources specified |
| Phase 3 | Execution goal clear, not prescriptive about tools |
| Phase 4 | Behavioral Traits copied verbatim |
| Error Handling | Table with 3+ scenarios |
| Line count | Target ~80 lines (max 120) |
| No built-in overlap | No Phase 1/5, no message bus code, no consensus handling |