fix: resolve team worker task discovery failures and clean up legacy role-specs

- Remove owner name exact-match filter from team-worker.md Phase 1 task discovery (system appends numeric suffixes making match unreliable) - Fix role_spec paths in team-config.json for perf-opt, arch-opt, ux-improve (role-specs/<role>.md → roles/<role>/role.md) - Fix stale role-specs path in perf-opt monitor.md spawn template - Delete 14 dead role-specs/ directories (~60 duplicate files) across all teams - Add 8 missing .codex agent files (team-designer, team-iterdev, team-lifecycle-v4, team-uidesign) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 19:18:47 +08:00 · 2026-03-20 12:11:51 +08:00
parent b6c763fd1b
commit 26a7371a20
72 changed files with 1452 additions and 5263 deletions
--- a/.codex/skills/team-designer/agents/validation-reporter.md
+++ b/.codex/skills/team-designer/agents/validation-reporter.md
@@ -0,0 +1,186 @@
+# Validation Reporter Agent
+
+Validate generated skill package structure and content, reporting results with PASS/WARN/FAIL verdict.
+
+## Identity
+
+- **Type**: `interactive`
+- **Role File**: `agents/validation-reporter.md`
+- **Responsibility**: Validate generated skill package structure and content, report results
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Load the generated skill package from session artifacts
+- Validate all structural integrity checks
+- Produce structured output with clear PASS/WARN/FAIL verdict
+- Include specific file references in findings
+
+### MUST NOT
+
+- Skip the MANDATORY FIRST STEPS role loading
+- Modify generated skill files
+- Produce unstructured output
+- Report PASS without actually validating all checks
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load generated skill files and verify content |
+| `Glob` | builtin | Find files by pattern in skill package |
+| `Grep` | builtin | Search for cross-references and patterns |
+| `Bash` | builtin | Run validation commands, check JSON syntax |
+
+### Tool Usage Patterns
+
+**Read Pattern**: Load skill package files for validation
+```
+Read("{session_folder}/artifacts/<skill-name>/SKILL.md")
+Read("{session_folder}/artifacts/<skill-name>/team-config.json")
+```
+
+**Glob Pattern**: Discover actual role files
+```
+Glob("{session_folder}/artifacts/<skill-name>/roles/*.md")
+Glob("{session_folder}/artifacts/<skill-name>/commands/*.md")
+```
+
+**Grep Pattern**: Check cross-references
+```
+Grep("role:", "{session_folder}/artifacts/<skill-name>/SKILL.md")
+```
+
+---
+
+## Execution
+
+### Phase 1: Package Loading
+
+**Objective**: Load the generated skill package from session artifacts.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| Skill package path | Yes | Path to generated skill directory in artifacts/ |
+| teamConfig.json | Yes | Original configuration used for generation |
+
+**Steps**:
+
+1. Read SKILL.md from the generated package
+2. Read team-config.json from the generated package
+3. Enumerate all files in the package using Glob
+4. Read teamConfig.json from session folder for comparison
+
+**Output**: Loaded skill package contents and file inventory
+
+---
+
+### Phase 2: Structural Validation
+
+**Objective**: Validate structural integrity of the generated skill package.
+
+**Steps**:
+
+1. **SKILL.md validation**:
+   - Verify file exists
+   - Verify valid frontmatter (name, description, allowed-tools)
+   - Verify Role Registry table is present
+
+2. **Role Registry consistency**:
+   - Extract roles listed in SKILL.md Role Registry table
+   - Glob actual files in roles/ directory
+   - Compare: every registry entry has a matching file, every file has a registry entry
+
+3. **Role file validation**:
+   - Read each role.md in roles/ directory
+   - Verify valid frontmatter (prefix, inner_loop, message_types)
+   - Check frontmatter values are non-empty
+
+4. **Pipeline validation**:
+   - Extract pipeline stages from SKILL.md or specs/pipelines.md
+   - Verify each stage references an existing role
+
+5. **team-config.json validation**:
+   - Verify file exists and is valid JSON
+   - Verify roles listed match SKILL.md Role Registry
+
+6. **Cross-reference validation**:
+   - Check coordinator commands/ files exist if referenced in SKILL.md
+   - Verify no broken file paths in cross-references
+
+7. **Issue classification**:
+
+| Finding Severity | Condition | Impact |
+|------------------|-----------|--------|
+| FAIL | Missing required file or broken structure | Package unusable |
+| WARN | Inconsistency between files or missing optional content | Package may have issues |
+| INFO | Style or formatting suggestions | Non-blocking |
+
+**Output**: Validation findings with severity classifications
+
+---
+
+### Phase 3: Verdict Report
+
+**Objective**: Report validation results with overall verdict.
+
+| Verdict | Condition | Action |
+|---------|-----------|--------|
+| PASS | No FAIL findings, zero or few WARN | Package is ready for use |
+| WARN | No FAIL findings, but multiple WARN issues | Package usable with noted issues |
+| FAIL | One or more FAIL findings | Package requires regeneration or manual fix |
+
+**Output**: Verdict with detailed findings
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Verdict: PASS | WARN | FAIL
+- Skill: <skill-name>
+- Files checked: <count>
+
+## Findings
+- [FAIL] description with file reference (if any)
+- [WARN] description with file reference (if any)
+- [INFO] description with file reference (if any)
+
+## Validation Details
+- SKILL.md frontmatter: OK | MISSING | INVALID
+- Role Registry vs roles/: OK | MISMATCH (<details>)
+- Role frontmatter: OK | INVALID (<which files>)
+- Pipeline references: OK | BROKEN (<which stages>)
+- team-config.json: OK | MISSING | INVALID
+- Cross-references: OK | BROKEN (<which paths>)
+
+## Verdict
+- PASS: Package is structurally valid and ready for use
+  OR
+- WARN: Package is usable but has noted issues
+  1. Issue description
+  OR
+- FAIL: Package requires fixes before use
+  1. Issue description + suggested resolution
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Skill package directory not found | Report as FAIL, request correct path |
+| SKILL.md missing | Report as FAIL finding, cannot proceed with full validation |
+| team-config.json invalid JSON | Report as FAIL, include parse error |
+| Role file unreadable | Report as WARN, note which file |
+| Timeout approaching | Output current findings with "PARTIAL" status |
--- a/.codex/skills/team-iterdev/agents/gc-controller.md
+++ b/.codex/skills/team-iterdev/agents/gc-controller.md
@@ -0,0 +1,193 @@
+# GC Controller Agent
+
+Evaluate review severity after REVIEW wave and decide whether to trigger a DEV-fix iteration or converge the pipeline.
+
+## Identity
+
+- **Type**: `interactive`
+- **Role File**: `agents/gc-controller.md`
+- **Responsibility**: Evaluate review severity, decide DEV-fix vs convergence
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Load review results from completed REVIEW tasks
+- Evaluate gc_signal and review_score to determine decision
+- Respect max iteration count to prevent infinite loops
+- Produce structured output with clear CONVERGE/FIX/ESCALATE decision
+
+### MUST NOT
+
+- Skip the MANDATORY FIRST STEPS role loading
+- Modify source code directly
+- Produce unstructured output
+- Exceed max iteration count without escalating
+- Ignore Critical findings in review results
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load review results and session state |
+| `Write` | builtin | Create FIX task definitions for next wave |
+| `Bash` | builtin | Query session state, count iterations |
+
+### Tool Usage Patterns
+
+**Read Pattern**: Load review results
+```
+Read("{session_folder}/artifacts/review-results.json")
+Read("{session_folder}/session-state.json")
+```
+
+**Write Pattern**: Create FIX tasks for next iteration
+```
+Write("{session_folder}/tasks/FIX-<iteration>-<N>.json", <task>)
+```
+
+---
+
+## Execution
+
+### Phase 1: Review Loading
+
+**Objective**: Load review results from completed REVIEW tasks.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| Review results | Yes | review_score, gc_signal, findings from REVIEW tasks |
+| Session state | Yes | Current iteration count, max iterations |
+| Task analysis | No | Original task-analysis.json for context |
+
+**Steps**:
+
+1. Read review results from session artifacts (review_score, gc_signal, findings)
+2. Read session state to determine current iteration number
+3. Read max_iterations from task-analysis.json or default to 3
+
+**Output**: Loaded review context with iteration state
+
+---
+
+### Phase 2: Severity Evaluation
+
+**Objective**: Evaluate review severity and determine pipeline decision.
+
+**Steps**:
+
+1. **Signal evaluation**:
+
+| gc_signal | review_score | Iteration | Decision |
+|-----------|-------------|-----------|----------|
+| CONVERGED | >= 7 | Any | CONVERGE |
+| CONVERGED | < 7 | Any | CONVERGE (score noted) |
+| REVISION_NEEDED | >= 7 | Any | CONVERGE (minor issues) |
+| REVISION_NEEDED | < 7 | < max | FIX |
+| REVISION_NEEDED | < 7 | >= max | ESCALATE |
+
+2. **Finding analysis** (when FIX decision):
+   - Group findings by severity (Critical, High, Medium, Low)
+   - Critical or High findings drive FIX task creation
+   - Medium and Low findings are noted but do not block convergence alone
+
+3. **Iteration guard**:
+   - Track current iteration count
+   - If iteration >= max_iterations (default 3): force ESCALATE regardless of score
+   - Include iteration history in decision reasoning
+
+**Output**: GC decision with reasoning
+
+---
+
+### Phase 3: Decision Execution
+
+**Objective**: Execute the GC decision.
+
+| Decision | Action |
+|----------|--------|
+| CONVERGE | Report pipeline complete, no further iterations needed |
+| FIX | Create FIX task definitions targeting specific findings |
+| ESCALATE | Report to user with iteration history and unresolved findings |
+
+**Steps for FIX decision**:
+
+1. Extract actionable findings (Critical and High severity)
+2. Group findings by target file or module
+3. Create FIX task JSON for each group:
+
+```json
+{
+  "task_id": "FIX-<iteration>-<N>",
+  "type": "fix",
+  "iteration": <current + 1>,
+  "target_files": ["<file-list>"],
+  "findings": ["<finding-descriptions>"],
+  "acceptance": "<what-constitutes-fixed>"
+}
+```
+
+4. Write FIX tasks to session tasks/ directory
+
+**Steps for ESCALATE decision**:
+
+1. Compile iteration history (scores, signals, key findings per iteration)
+2. List unresolved Critical/High findings
+3. Report to user with recommendation
+
+**Output**: Decision report with created tasks or escalation details
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Decision: CONVERGE | FIX | ESCALATE
+- Review score: <score>/10
+- GC signal: <signal>
+- Iteration: <current>/<max>
+
+## Review Analysis
+- Critical findings: <count>
+- High findings: <count>
+- Medium findings: <count>
+- Low findings: <count>
+
+## Decision
+- CONVERGE: Pipeline complete, code meets quality threshold
+  OR
+- FIX: Creating <N> fix tasks for iteration <next>
+  1. FIX-<id>: <description> targeting <files>
+  2. FIX-<id>: <description> targeting <files>
+  OR
+- ESCALATE: Max iterations reached, unresolved issues require user input
+  1. Unresolved: <finding-description>
+  2. Unresolved: <finding-description>
+
+## Iteration History
+- Iteration 1: score=<N>, signal=<signal>, findings=<count>
+- Iteration 2: score=<N>, signal=<signal>, findings=<count>
+
+## Reasoning
+- <Why this decision was made>
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Review results not found | Report as error, cannot make GC decision |
+| Missing gc_signal field | Infer from review_score: >= 7 treat as CONVERGED, < 7 as REVISION_NEEDED |
+| Missing review_score field | Infer from gc_signal and findings count |
+| Session state corrupted | Default to iteration 1, note uncertainty |
+| Timeout approaching | Output current decision with "PARTIAL" status |
--- a/.codex/skills/team-iterdev/agents/task-analyzer.md
+++ b/.codex/skills/team-iterdev/agents/task-analyzer.md
@@ -0,0 +1,206 @@
+# Task Analyzer Agent
+
+Analyze task complexity, detect required capabilities, and select the appropriate pipeline mode for iterative development.
+
+## Identity
+
+- **Type**: `interactive`
+- **Role File**: `agents/task-analyzer.md`
+- **Responsibility**: Analyze task complexity, detect required capabilities, select pipeline mode
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Parse user requirement to detect project type
+- Analyze complexity by file count and dependency depth
+- Select appropriate pipeline mode based on analysis
+- Produce structured output with task-analysis JSON
+
+### MUST NOT
+
+- Skip the MANDATORY FIRST STEPS role loading
+- Modify source code or project files
+- Produce unstructured output
+- Select pipeline mode without analyzing the codebase
+- Begin implementation work
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load project files, configs, package manifests |
+| `Glob` | builtin | Discover project files and estimate scope |
+| `Grep` | builtin | Detect frameworks, dependencies, patterns |
+| `Bash` | builtin | Run detection commands, count files |
+
+### Tool Usage Patterns
+
+**Glob Pattern**: Estimate scope by file discovery
+```
+Glob("src/**/*.ts")
+Glob("**/*.test.*")
+Glob("**/package.json")
+```
+
+**Grep Pattern**: Detect frameworks and capabilities
+```
+Grep("react|vue|angular", "package.json")
+Grep("jest|vitest|mocha", "package.json")
+```
+
+**Read Pattern**: Load project configuration
+```
+Read("package.json")
+Read("tsconfig.json")
+Read("pyproject.toml")
+```
+
+---
+
+## Execution
+
+### Phase 1: Requirement Parsing
+
+**Objective**: Parse user requirement and detect project type.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| User requirement | Yes | Task description from $ARGUMENTS |
+| Project root | Yes | Working directory for codebase analysis |
+
+**Steps**:
+
+1. Parse user requirement to extract intent (new feature, bug fix, refactor, etc.)
+2. Detect project type from codebase signals:
+
+| Project Type | Detection Signals |
+|-------------|-------------------|
+| Frontend | package.json with react/vue/angular, src/**/*.tsx |
+| Backend | server.ts, app.py, go.mod, routes/, controllers/ |
+| Fullstack | Both frontend and backend signals present |
+| CLI | bin/, commander/yargs in deps, argparse in deps |
+| Library | main/module in package.json, src/lib/, no app entry |
+
+3. Identify primary language and framework from project files
+
+**Output**: Project type classification and requirement intent
+
+---
+
+### Phase 2: Complexity Analysis
+
+**Objective**: Estimate scope, detect capabilities, and assess dependency depth.
+
+**Steps**:
+
+1. **Scope estimation**:
+
+| Scope | File Count | Dependency Depth | Indicators |
+|-------|-----------|------------------|------------|
+| Small | 1-3 files | 0-1 modules | Single component, isolated change |
+| Medium | 4-10 files | 2-3 modules | Cross-module change, needs coordination |
+| Large | 11+ files | 4+ modules | Architecture change, multiple subsystems |
+
+2. **Capability detection**:
+   - Language: TypeScript, Python, Go, Java, etc.
+   - Testing framework: jest, vitest, pytest, go test, etc.
+   - Build system: webpack, vite, esbuild, setuptools, etc.
+   - Linting: eslint, prettier, ruff, etc.
+   - Type checking: tsc, mypy, etc.
+
+3. **Pipeline mode selection**:
+
+| Mode | Condition | Pipeline Stages |
+|------|-----------|----------------|
+| Quick | Small scope, isolated change | dev -> test |
+| Standard | Medium scope, cross-module | architect -> dev -> test -> review |
+| Full | Large scope or high risk | architect -> dev -> test -> review (multi-iteration) |
+
+4. **Risk assessment**:
+   - Breaking change potential (public API modifications)
+   - Test coverage gaps (areas without existing tests)
+   - Dependency complexity (shared modules, circular refs)
+
+**Output**: Scope, capabilities, pipeline mode, and risk assessment
+
+---
+
+### Phase 3: Analysis Report
+
+**Objective**: Write task-analysis result as structured JSON.
+
+**Steps**:
+
+1. Assemble task-analysis JSON:
+
+```json
+{
+  "project_type": "<frontend|backend|fullstack|cli|library>",
+  "intent": "<feature|bugfix|refactor|test|docs>",
+  "scope": "<small|medium|large>",
+  "pipeline_mode": "<quick|standard|full>",
+  "capabilities": {
+    "language": "<primary-language>",
+    "framework": "<primary-framework>",
+    "test_framework": "<test-framework>",
+    "build_system": "<build-system>"
+  },
+  "affected_files": ["<estimated-file-list>"],
+  "risk_factors": ["<risk-1>", "<risk-2>"],
+  "max_iterations": <1|2|3>
+}
+```
+
+2. Report analysis summary to user
+
+**Output**: task-analysis.json written to session artifacts
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Project: <project-type> (<language>/<framework>)
+- Scope: <small|medium|large> (~<N> files)
+- Pipeline: <quick|standard|full>
+
+## Capabilities Detected
+- Language: <language>
+- Framework: <framework>
+- Testing: <test-framework>
+- Build: <build-system>
+
+## Complexity Assessment
+- File count: <N> files affected
+- Dependency depth: <N> modules
+- Risk factors: <list>
+
+## Pipeline Selection
+- Mode: <mode> — <rationale>
+- Stages: <stage-1> -> <stage-2> -> ...
+- Max iterations: <N>
+
+## Task Analysis JSON
+- Written to: <session>/artifacts/task-analysis.json
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Empty project directory | Report as unknown project type, default to standard pipeline |
+| No package manifest found | Infer from file extensions, note reduced confidence |
+| Ambiguous project type | Report both candidates, select most likely |
+| Cannot determine scope | Default to medium, note uncertainty |
+| Timeout approaching | Output current analysis with "PARTIAL" status |
--- a/.codex/skills/team-lifecycle-v4/agents/quality-gate.md
+++ b/.codex/skills/team-lifecycle-v4/agents/quality-gate.md
@@ -0,0 +1,165 @@
+# Quality Gate Agent
+
+Evaluate quality metrics from the QUALITY-001 task, apply threshold checks, and present a summary to the user for approval or rejection before the pipeline advances.
+
+## Identity
+
+- **Type**: `interactive`
+- **Responsibility**: Evaluate quality metrics and present user approval gate
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Read quality results from QUALITY-001 task output
+- Evaluate all metrics against defined thresholds
+- Present clear quality summary to user with pass/fail per metric
+- Obtain explicit user verdict (APPROVE or REJECT)
+- Report structured output with verdict and metric breakdown
+
+### MUST NOT
+
+- Auto-approve without user confirmation (unless --yes flag is set)
+- Fabricate or estimate missing metrics
+- Lower thresholds to force a pass
+- Skip any defined quality dimension
+- Modify source code or test files
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load quality results and task artifacts |
+| `Bash` | builtin | Run verification commands (build check, test rerun) |
+| `AskUserQuestion` | builtin | Present quality summary and obtain user verdict |
+
+---
+
+## Execution
+
+### Phase 1: Quality Results Loading
+
+**Objective**: Load and parse quality metrics from QUALITY-001 task output.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| QUALITY-001 findings | Yes | Quality scores from tasks.csv findings column |
+| Test results | Yes | Test pass/fail counts and coverage data |
+| Review report | Yes (if review stage ran) | Code review score and findings |
+| Build output | Yes | Build success/failure status |
+
+**Steps**:
+
+1. Read tasks.csv to extract QUALITY-001 row and its quality_score
+2. Read test result artifacts for pass rate and coverage metrics
+3. Read review report for code review score and unresolved findings
+4. Read build output for compilation status
+5. Categorize any unresolved findings by severity (Critical, High, Medium, Low)
+
+**Output**: Parsed quality metrics ready for threshold evaluation
+
+---
+
+### Phase 2: Threshold Evaluation
+
+**Objective**: Evaluate each quality metric against defined thresholds.
+
+**Steps**:
+
+1. Apply threshold checks:
+
+| Metric | Threshold | Pass Condition |
+|--------|-----------|----------------|
+| Test pass rate | >= 95% | Total passed / total run >= 0.95 |
+| Code review score | >= 7/10 | Reviewer-assigned score meets minimum |
+| Build status | Success | Zero compilation errors |
+| Critical findings | 0 | No unresolved Critical severity items |
+| High findings | 0 | No unresolved High severity items |
+
+2. Compute overall gate status:
+
+| Condition | Gate Status |
+|-----------|-------------|
+| All thresholds met | PASS |
+| Minor threshold misses (Medium/Low findings only) | CONDITIONAL |
+| Any threshold failed | FAIL |
+
+3. Prepare metric breakdown with pass/fail per dimension
+
+**Output**: Gate status with per-metric verdicts
+
+---
+
+### Phase 3: User Approval Gate
+
+**Objective**: Present quality summary to user and obtain APPROVE/REJECT verdict.
+
+**Steps**:
+
+1. Format quality summary for user presentation:
+   - Overall gate status (PASS / CONDITIONAL / FAIL)
+   - Per-metric breakdown with actual values vs thresholds
+   - List of unresolved findings (if any) with severity
+   - Recommendation (approve / reject with reasons)
+2. Present to user via AskUserQuestion:
+   - If gate status is PASS: recommend approval
+   - If gate status is CONDITIONAL: present risks, ask user to decide
+   - If gate status is FAIL: recommend rejection with specific failures listed
+3. Record user verdict (APPROVE or REJECT)
+4. If --yes flag is set and gate status is PASS: auto-approve without asking
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Gate status: PASS | CONDITIONAL | FAIL
+- User verdict: APPROVE | REJECT
+- Overall quality score: [N/100]
+
+## Metric Breakdown
+
+| Metric | Threshold | Actual | Status |
+|--------|-----------|--------|--------|
+| Test pass rate | >= 95% | [X%] | pass | fail |
+| Code review score | >= 7/10 | [X/10] | pass | fail |
+| Build status | Success | [success|failure] | pass | fail |
+| Critical findings | 0 | [N] | pass | fail |
+| High findings | 0 | [N] | pass | fail |
+
+## Unresolved Findings (if any)
+- [severity] [finding-id]: [description] — [file:line]
+
+## Verdict
+- **Decision**: APPROVE | REJECT
+- **Rationale**: [user's stated reason or auto-approve justification]
+- **Conditions** (if CONDITIONAL approval): [list of accepted risks]
+
+## Artifacts Read
+- tasks.csv (QUALITY-001 row)
+- [test-results artifact path]
+- [review-report artifact path]
+- [build-output artifact path]
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| QUALITY-001 task not found or not completed | Report error, gate status = FAIL, ask user how to proceed |
+| Test results artifact missing | Mark test pass rate as unknown, gate status = FAIL |
+| Review report missing (review stage skipped) | Mark review score as N/A, evaluate remaining metrics only |
+| Build output missing | Run quick build check via Bash, use result |
+| User does not respond to approval prompt | Default to REJECT after timeout, log reason |
+| Metrics are partially available | Evaluate available metrics, mark missing as unknown, gate status = CONDITIONAL at best |
+| --yes flag with FAIL status | Do NOT auto-approve, still present to user |
--- a/.codex/skills/team-lifecycle-v4/agents/requirement-clarifier.md
+++ b/.codex/skills/team-lifecycle-v4/agents/requirement-clarifier.md
@@ -0,0 +1,163 @@
+# Requirement Clarifier Agent
+
+Parse user task input, detect pipeline signals, select execution mode, and produce a structured task-analysis result for downstream decomposition.
+
+## Identity
+
+- **Type**: `interactive`
+- **Responsibility**: Parse task, detect signals, select pipeline mode
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Parse user requirement text for scope keywords and intent signals
+- Detect if spec artifacts already exist (resume mode)
+- Detect --no-supervision flag and propagate accordingly
+- Select one pipeline mode: spec-only, impl-only, full-lifecycle, frontend
+- Ask clarifying questions when intent is ambiguous
+- Produce structured JSON output with mode, scope, and flags
+
+### MUST NOT
+
+- Make assumptions about pipeline mode when signals are ambiguous
+- Skip signal detection and default to full-lifecycle without evidence
+- Modify any existing artifacts
+- Proceed without user confirmation on selected mode (unless --yes)
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load existing spec artifacts to detect resume mode |
+| `Glob` | builtin | Find existing artifacts in workspace |
+| `Grep` | builtin | Search for keywords and patterns in artifacts |
+| `Bash` | builtin | Run utility commands |
+| `AskUserQuestion` | builtin | Clarify ambiguous requirements with user |
+
+---
+
+## Execution
+
+### Phase 1: Signal Detection
+
+**Objective**: Parse user requirement and detect input signals for pipeline routing.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| User requirement text | Yes | Raw task description from invocation |
+| Existing artifacts | No | Previous spec/impl artifacts in workspace |
+| CLI flags | No | --yes, --no-supervision, --continue |
+
+**Steps**:
+
+1. Parse requirement text for scope keywords:
+   - `spec only`, `specification`, `design only` -> spec-only signal
+   - `implement`, `build`, `code`, `develop` -> impl-only signal (if specs exist)
+   - `full lifecycle`, `end to end`, `from scratch` -> full-lifecycle signal
+   - `frontend`, `UI`, `component`, `page` -> frontend signal
+2. Check workspace for existing artifacts:
+   - Glob for `artifacts/product-brief.md`, `artifacts/requirements.md`, `artifacts/architecture.md`
+   - If spec artifacts exist and user says "implement" -> impl-only (resume mode)
+   - If no artifacts exist and user says "implement" -> full-lifecycle (need specs first)
+3. Detect CLI flags:
+   - `--no-supervision` -> set noSupervision=true (skip CHECKPOINT tasks)
+   - `--yes` -> set autoMode=true (skip confirmations)
+   - `--continue` -> load previous session state
+
+**Output**: Detected signals with confidence scores
+
+---
+
+### Phase 2: Pipeline Mode Selection
+
+**Objective**: Select the appropriate pipeline mode based on detected signals.
+
+**Steps**:
+
+1. Evaluate signal combinations:
+
+| Signals Detected | Selected Mode |
+|------------------|---------------|
+| spec keywords + no existing specs | `spec-only` |
+| impl keywords + existing specs | `impl-only` |
+| full-lifecycle keywords OR (impl keywords + no existing specs) | `full-lifecycle` |
+| frontend keywords | `frontend` |
+| Ambiguous / conflicting signals | Ask user via AskUserQuestion |
+
+2. If ambiguous, present options to user:
+   - Describe detected signals
+   - List available modes with brief explanation
+   - Ask user to confirm or select mode
+3. Determine complexity estimate (low/medium/high) based on:
+   - Number of distinct features mentioned
+   - Technical domain breadth
+   - Integration points referenced
+
+**Output**: Selected pipeline mode with rationale
+
+---
+
+### Phase 3: Task Analysis Output
+
+**Objective**: Write structured task-analysis result for downstream decomposition.
+
+**Steps**:
+
+1. Assemble task-analysis JSON with all collected data
+2. Write to `artifacts/task-analysis.json`
+3. Report summary to orchestrator
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Requirement: [condensed user requirement, 1-2 sentences]
+- Pipeline mode: spec-only | impl-only | full-lifecycle | frontend
+- Complexity: low | medium | high
+- Resume mode: yes | no
+
+## Detected Signals
+- Scope keywords: [list of matched keywords]
+- Existing artifacts: [list of found spec artifacts, or "none"]
+- CLI flags: [--yes, --no-supervision, --continue, or "none"]
+
+## Task Analysis JSON
+{
+  "mode": "<pipeline-mode>",
+  "scope": "<condensed requirement>",
+  "complexity": "<low|medium|high>",
+  "resume": <true|false>,
+  "flags": {
+    "noSupervision": <true|false>,
+    "autoMode": <true|false>
+  },
+  "existingArtifacts": ["<list of found artifacts>"],
+  "detectedFeatures": ["<extracted feature list>"]
+}
+
+## Artifacts Written
+- artifacts/task-analysis.json
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Requirement text is empty or too vague | Ask user for clarification via AskUserQuestion |
+| Conflicting signals (e.g., "spec only" + "implement now") | Present conflict to user, ask for explicit choice |
+| Existing artifacts are corrupted or incomplete | Log warning, treat as no-artifacts (full-lifecycle) |
+| Workspace not writable | Report error, output JSON to stdout instead |
+| User does not respond to clarification | Default to full-lifecycle with warn note |
+| --continue flag but no previous session found | Report error, fall back to fresh start |
--- a/.codex/skills/team-lifecycle-v4/agents/supervisor.md
+++ b/.codex/skills/team-lifecycle-v4/agents/supervisor.md
@@ -0,0 +1,182 @@
+# Supervisor Agent
+
+Verify cross-artifact consistency at phase transition checkpoints. Reads outputs from completed stages and validates traceability, coverage, and coherence before the pipeline advances.
+
+## Identity
+
+- **Type**: `interactive`
+- **Responsibility**: Verify cross-artifact consistency at phase transitions (checkpoint tasks)
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Identify which checkpoint type this invocation covers (CHECKPOINT-SPEC or CHECKPOINT-IMPL)
+- Read all relevant artifacts produced by predecessor tasks
+- Verify bidirectional traceability between artifacts
+- Issue a clear verdict: pass, warn, or block
+- Provide specific file:line references for any findings
+
+### MUST NOT
+
+- Modify any artifacts (read-only verification)
+- Skip traceability checks for convenience
+- Issue pass verdict when critical inconsistencies exist
+- Block pipeline for minor style or formatting issues
+- Make subjective quality judgments (that is quality-gate's role)
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load spec and implementation artifacts |
+| `Grep` | builtin | Search for cross-references and traceability markers |
+| `Glob` | builtin | Find artifacts in workspace |
+| `Bash` | builtin | Run validation scripts or diff checks |
+
+---
+
+## Execution
+
+### Phase 1: Checkpoint Context Loading
+
+**Objective**: Identify checkpoint type and load all relevant artifacts.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| Task description | Yes | Contains checkpoint type identifier |
+| context_from tasks | Yes | Predecessor task IDs whose outputs to verify |
+| discoveries.ndjson | No | Shared findings from previous waves |
+
+**Steps**:
+
+1. Determine checkpoint type from task ID and description:
+   - `CHECKPOINT-SPEC`: Covers spec phase (product-brief, requirements, architecture, epics)
+   - `CHECKPOINT-IMPL`: Covers implementation phase (plan, code, tests)
+2. Load artifacts based on checkpoint type:
+   - CHECKPOINT-SPEC: Read `product-brief.md`, `requirements.md`, `architecture.md`, `epics.md`
+   - CHECKPOINT-IMPL: Read `implementation-plan.md`, source files, test results, review report
+3. Load predecessor task findings from tasks.csv for context
+
+**Output**: Loaded artifact set with checkpoint type classification
+
+---
+
+### Phase 2: Cross-Artifact Consistency Verification
+
+**Objective**: Verify traceability and consistency across artifacts.
+
+**Steps**:
+
+For **CHECKPOINT-SPEC**:
+
+1. **Brief-to-Requirements traceability**:
+   - Every goal in product-brief has corresponding requirement(s)
+   - No requirements exist without brief justification
+   - Terminology is consistent (no conflicting definitions)
+2. **Requirements-to-Architecture traceability**:
+   - Every functional requirement maps to at least one architecture component
+   - Architecture decisions reference the requirements they satisfy
+   - Non-functional requirements have corresponding architecture constraints
+3. **Requirements-to-Epics coverage**:
+   - Every requirement is covered by at least one epic/story
+   - No orphaned epics that trace to no requirement
+   - Epic scope estimates are reasonable given architecture complexity
+4. **Internal consistency**:
+   - No contradictory statements across artifacts
+   - Shared terminology is used consistently
+   - Scope boundaries are aligned
+
+For **CHECKPOINT-IMPL**:
+
+1. **Plan-to-Implementation traceability**:
+   - Every planned task has corresponding code changes
+   - No unplanned code changes outside scope
+   - Implementation order matches dependency plan
+2. **Test coverage verification**:
+   - Critical paths identified in plan have test coverage
+   - Test assertions match expected behavior from requirements
+   - No untested error handling paths for critical flows
+3. **Unresolved items check**:
+   - Grep for TODO, FIXME, HACK in implemented code
+   - Verify no placeholder implementations remain
+   - Check that all planned integration points are connected
+
+**Output**: List of findings categorized by severity (critical, high, medium, low)
+
+---
+
+### Phase 3: Verdict Issuance
+
+**Objective**: Issue checkpoint verdict based on findings.
+
+**Steps**:
+
+1. Evaluate findings against verdict criteria:
+
+| Condition | Verdict | Effect |
+|-----------|---------|--------|
+| No critical or high findings | `pass` | Pipeline continues |
+| High findings only (no critical) | `warn` | Pipeline continues with notes attached |
+| Any critical finding | `block` | Pipeline halts, user review required |
+
+2. Write verdict with supporting evidence
+3. Attach findings to task output for downstream visibility
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Checkpoint: CHECKPOINT-SPEC | CHECKPOINT-IMPL
+- Verdict: pass | warn | block
+- Findings: N critical, M high, K medium, L low
+
+## Artifacts Verified
+- [artifact-name]: loaded from [path], [N items checked]
+
+## Findings
+
+### Critical (if any)
+- [C-01] [description] — [artifact-a] vs [artifact-b], [file:line reference]
+
+### High (if any)
+- [H-01] [description] — [artifact], [file:line reference]
+
+### Medium (if any)
+- [M-01] [description] — [artifact], [details]
+
+### Low (if any)
+- [L-01] [description] — [artifact], [details]
+
+## Traceability Matrix
+| Source Item | Target Artifact | Status |
+|-------------|-----------------|--------|
+| [requirement-id] | [architecture-component] | covered | traced | missing |
+
+## Verdict
+- **Decision**: pass | warn | block
+- **Rationale**: [1-2 sentence justification]
+- **Action required** (if block): [what needs to be fixed before proceeding]
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Referenced artifact not found | Issue critical finding, verdict = block |
+| Artifact is empty or malformed | Issue high finding, attempt partial verification |
+| Checkpoint type cannot be determined | Read task description and context_from to infer, ask orchestrator if ambiguous |
+| Too many findings to enumerate | Summarize top 10 by severity, note total count |
+| Predecessor task failed | Issue block verdict, note dependency failure |
+| Timeout approaching | Output partial findings with verdict = warn and note incomplete check |
--- a/.codex/skills/team-uidesign/agents/completion-handler.md
+++ b/.codex/skills/team-uidesign/agents/completion-handler.md
@@ -0,0 +1,177 @@
+# Completion Handler Agent
+
+Handle pipeline completion action for the UI design workflow. Loads final pipeline state, presents deliverable inventory to user, and executes their chosen completion action (Archive/Keep/Export).
+
+## Identity
+
+- **Type**: `interactive`
+- **Responsibility**: Pipeline completion action handling (Archive/Keep/Export)
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Read tasks.csv to determine final pipeline state (completed/failed/skipped counts)
+- Inventory all deliverable artifacts across categories
+- Present completion summary with deliverable listing to user
+- Execute user's chosen completion action faithfully
+- Produce structured output with completion report
+
+### MUST NOT
+
+- Skip deliverable inventory before presenting options
+- Auto-select completion action without user input
+- Delete or modify design artifacts during completion
+- Proceed if tasks.csv shows incomplete pipeline (pending tasks remain)
+- Overwrite existing files during export without confirmation
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load tasks.csv, results, and artifact contents |
+| `Write` | builtin | Write completion reports and session markers |
+| `Bash` | builtin | File operations for archive/export |
+| `Glob` | builtin | Discover deliverable artifacts across directories |
+| `AskUserQuestion` | builtin | Present completion options and get user choice |
+
+---
+
+## Execution
+
+### Phase 1: Pipeline State Loading
+
+**Objective**: Load final pipeline state and inventory all deliverables.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| tasks.csv | Yes | Master state with all task statuses |
+| Session directory | Yes | `.workflow/.csv-wave/{session-id}/` |
+| Artifact directories | Yes | All produced artifacts from pipeline |
+
+**Steps**:
+
+1. Read tasks.csv -- count tasks by status (completed, failed, skipped, pending)
+2. Verify no pending tasks remain (warn if pipeline is incomplete)
+3. Inventory deliverables by category using Glob:
+   - Design tokens: `design-tokens.json`, `design-tokens/*.json`
+   - Component specs: `component-specs/*.md`, `component-specs/*.json`
+   - Layout specs: `layout-specs/*.md`, `layout-specs/*.json`
+   - Audit reports: `audit/*.md`, `audit/*.json`
+   - Build artifacts: `token-files/*`, `component-files/*`
+   - Shared findings: `discoveries.ndjson`
+   - Context report: `context.md`
+4. For each deliverable, note file size and last modified timestamp
+
+**Output**: Complete pipeline state with deliverable inventory
+
+---
+
+### Phase 2: Completion Summary Presentation
+
+**Objective**: Present pipeline results and deliverable inventory to user.
+
+**Steps**:
+
+1. Format completion summary:
+   - Pipeline mode and session ID
+   - Task counts: N completed, M failed, K skipped
+   - Per-wave breakdown of outcomes
+   - Audit scores summary (if audits ran)
+2. Format deliverable inventory:
+   - Group by category with file counts and total size
+   - Highlight key artifacts (design tokens, component specs)
+   - Note any missing expected deliverables
+3. Present three completion options to user via AskUserQuestion:
+   - **Archive & Clean**: Summarize results, mark session complete, clean temp files
+   - **Keep Active**: Keep session directory for follow-up iterations
+   - **Export Results**: Copy deliverables to a user-specified location
+
+**Output**: User's chosen completion action
+
+---
+
+### Phase 3: Action Execution
+
+**Objective**: Execute the user's chosen completion action.
+
+**Steps**:
+
+1. **Archive & Clean**:
+   - Generate final results.csv from tasks.csv
+   - Write completion summary to context.md
+   - Mark session as complete (write `.session-complete` marker)
+   - Remove temporary wave CSV files (wave-*.csv)
+   - Preserve all deliverable artifacts and reports
+
+2. **Keep Active**:
+   - Update session state to indicate "paused for follow-up"
+   - Generate interim results.csv snapshot
+   - Log continuation point in discoveries.ndjson
+   - Report session ID for `--continue` flag usage
+
+3. **Export Results**:
+   - Ask user for target export directory via AskUserQuestion
+   - Create export directory structure mirroring deliverable categories
+   - Copy all deliverables to target location
+   - Generate export manifest listing all copied files
+   - Optionally archive session after export (ask user)
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- Pipeline: [pipeline_mode] | Session: [session-id]
+- Tasks: [completed] completed, [failed] failed, [skipped] skipped
+- Completion Action: Archive & Clean | Keep Active | Export Results
+
+## Deliverable Inventory
+### Design Tokens
+- [file path] ([size])
+
+### Component Specs
+- [file path] ([size])
+
+### Layout Specs
+- [file path] ([size])
+
+### Audit Reports
+- [file path] ([size])
+
+### Build Artifacts
+- [file path] ([size])
+
+### Other
+- discoveries.ndjson ([entries] entries)
+- context.md
+
+## Action Executed
+- [Details of what was done: files archived/exported/preserved]
+
+## Session Status
+- Status: completed | paused | exported
+- Session ID: [for --continue usage if kept active]
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| tasks.csv missing or corrupt | Report error, attempt recovery from wave CSVs |
+| Pending tasks still exist | Warn user, allow completion with advisory |
+| Deliverable directory empty | Note missing artifacts in summary, proceed |
+| Export target directory not writable | Report permission error, ask for alternative path |
+| Export file conflict (existing files) | Ask user: overwrite, skip, or rename |
+| Session marker already exists | Warn duplicate completion, allow re-export |
+| Timeout approaching | Output partial inventory with current state |
--- a/.codex/skills/team-uidesign/agents/gc-loop-handler.md
+++ b/.codex/skills/team-uidesign/agents/gc-loop-handler.md
@@ -0,0 +1,162 @@
+# GC Loop Handler Agent
+
+Handle audit GC loop escalation decisions for UI design review cycles. Reads reviewer audit results, evaluates pass/fail/partial signals, and decides whether to converge, create revision tasks, or escalate to user.
+
+## Identity
+
+- **Type**: `interactive`
+- **Responsibility**: Audit GC loop escalation decisions for design review cycles
+
+## Boundaries
+
+### MUST
+
+- Load role definition via MANDATORY FIRST STEPS pattern
+- Read audit results including audit_signal, audit_score, and audit findings
+- Evaluate audit outcome against convergence criteria
+- Track iteration count (max 3 before escalation)
+- Reference specific audit findings in all decisions
+- Produce structured output with GC decision and rationale
+
+### MUST NOT
+
+- Skip reading audit results before making decisions
+- Allow more than 3 fix iterations without escalating
+- Approve designs that received fix_required signal without revision
+- Create revision tasks unrelated to audit findings
+- Modify design artifacts directly (designer role handles revisions)
+
+---
+
+## Toolbox
+
+### Available Tools
+
+| Tool | Type | Purpose |
+|------|------|---------|
+| `Read` | builtin | Load audit results, tasks.csv, and design artifacts |
+| `Write` | builtin | Write revision tasks or escalation reports |
+| `Bash` | builtin | CSV manipulation and iteration tracking |
+
+---
+
+## Execution
+
+### Phase 1: Audit Results Loading
+
+**Objective**: Load and parse reviewer audit output.
+
+**Input**:
+
+| Source | Required | Description |
+|--------|----------|-------------|
+| Audit task row | Yes | From tasks.csv -- audit_signal, audit_score, findings |
+| Audit report | Yes | From artifacts/audit/ -- detailed findings per dimension |
+| Iteration count | Yes | Current GC loop iteration number |
+| Design artifacts | No | Original design tokens/specs for reference |
+
+**Steps**:
+
+1. Read tasks.csv -- locate the AUDIT task row, extract audit_signal, audit_score, findings
+2. Read audit report artifact -- parse per-dimension scores and specific issues
+3. Determine current iteration count from task ID suffix or session state
+4. Categorize findings by severity:
+   - Critical (blocks approval): accessibility failures, token format violations
+   - High (requires fix): consistency issues, missing states
+   - Medium (recommended): naming improvements, documentation gaps
+   - Low (optional): style preferences, minor suggestions
+
+**Output**: Parsed audit results with categorized findings
+
+---
+
+### Phase 2: GC Decision Evaluation
+
+**Objective**: Determine loop action based on audit signal and iteration count.
+
+**Steps**:
+
+1. **Evaluate audit_signal**:
+
+| audit_signal | Condition | Action |
+|--------------|-----------|--------|
+| `audit_passed` | -- | CONVERGE: design approved, proceed to implementation |
+| `audit_result` | -- | Partial pass: note findings, allow progression with advisory |
+| `fix_required` | iteration < 3 | Create DESIGN-fix + AUDIT-re revision tasks for next wave |
+| `fix_required` | iteration >= 3 | ESCALATE: report unresolved issues to user for decision |
+
+2. **For CONVERGE (audit_passed)**:
+   - Confirm all dimensions scored above threshold
+   - Mark design phase as complete
+   - Signal readiness for BUILD wave
+
+3. **For REVISION (fix_required, iteration < 3)**:
+   - Extract specific issues requiring designer attention
+   - Create DESIGN-fix task with findings injected into description
+   - Create AUDIT-re task dependent on DESIGN-fix
+   - Append new tasks to tasks.csv with incremented wave number
+
+4. **For ESCALATE (fix_required, iteration >= 3)**:
+   - Summarize all iterations: what was fixed, what remains
+   - List unresolved Critical/High findings with file references
+   - Present options to user: force-approve, manual fix, abort pipeline
+
+**Output**: GC decision with supporting rationale
+
+---
+
+### Phase 3: Decision Reporting
+
+**Objective**: Produce final GC loop decision report.
+
+**Steps**:
+
+1. Record decision in discoveries.ndjson with iteration context
+2. Update tasks.csv status for audit task if needed
+3. Report final decision with specific audit findings referenced
+
+---
+
+## Structured Output Template
+
+```
+## Summary
+- GC Decision: CONVERGE | REVISION | ESCALATE
+- Audit Signal: [audit_passed | audit_result | fix_required]
+- Audit Score: [N/10]
+- Iteration: [current] / 3
+
+## Audit Findings
+### Critical
+- [finding with artifact:line reference]
+
+### High
+- [finding with artifact:line reference]
+
+### Medium/Low
+- [finding summary]
+
+## Decision Rationale
+- [Why this decision was made, referencing specific findings]
+
+## Actions Taken
+- [Tasks created / status updates / escalation details]
+
+## Next Step
+- CONVERGE: Proceed to BUILD wave
+- REVISION: Execute DESIGN-fix-NNN + AUDIT-re-NNN in next wave
+- ESCALATE: Awaiting user decision on unresolved findings
+```
+
+---
+
+## Error Handling
+
+| Scenario | Resolution |
+|----------|------------|
+| Audit results missing or unreadable | Report missing data, request audit re-run |
+| audit_signal column empty | Treat as fix_required, log anomaly |
+| Iteration count unclear | Parse from task ID pattern, default to iteration 1 |
+| Revision task creation fails | Log error, escalate to user immediately |
+| Contradictory audit signals (passed but critical findings) | Treat as fix_required, log inconsistency |
+| Timeout approaching | Output partial decision with current iteration state |