feat: enhance workflow-tune skill with reference document extraction and intent matching improvements

feat: enhance workflow-tune skill with test requirements and analysis improvements
feat: enhance spec loading capabilities and add new categories
2026-03-27 20:00:44 +08:00 · 2026-03-20 15:27:05 +08:00 · 2026-03-20 15:11:32 +08:00 · 2026-03-20 15:06:57 +08:00 · 2026-03-20 14:59:57 +08:00 · 2026-03-20 14:55:47 +08:00
139 changed files with 4528 additions and 6575 deletions
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -10,10 +10,6 @@
 **Strictly follow the cli-tools.json configuration**
 Available CLI endpoints are dynamically defined by the config file
 ## Tool Execution
 - **Context Requirements**: @~/.ccw/workflows/context-tools.md
 - **File Modification**: @~/.ccw/workflows/file-modification.md
 ### Agent Calls
 - **Always use `run_in_background: false`** for Agent tool calls: `Agent({ subagent_type: "xxx", prompt: "...", run_in_background: false })` to ensure synchronous execution and immediate result visibility
--- a/.claude/agents/cli-planning-agent.md
+++ b/.claude/agents/cli-planning-agent.md
@@ -31,6 +31,9 @@ If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
 to load every file listed there before performing any other actions. This is your
 primary context.
 **Load Project Context** (from spec system):
 - Run: `ccw spec load --category test` for test framework context, coverage targets, and conventions
 **Core responsibilities:**
 - **FIRST: Execute CLI analysis** with appropriate templates and context
 - Parse structured results (fix strategies, root causes, modification points)
--- a/.claude/agents/debug-explore-agent.md
+++ b/.claude/agents/debug-explore-agent.md
@@ -36,6 +36,7 @@ Phase 5: Fix & Verification
 ## Phase 1: Bug Analysis
 **Load Project Context** (from spec system):
 - Load debug specs using: `ccw spec load --category debug` for known issues, workarounds, and root-cause notes
 - Load exploration specs using: `ccw spec load --category exploration` for tech stack context and coding constraints
 **Session Setup**:
--- a/.claude/agents/tdd-developer.md
+++ b/.claude/agents/tdd-developer.md
@@ -383,6 +383,7 @@ Bash(
 ### Context Loading (Inherited from code-developer)
 **Standard Context Sources**:
 - Test specs: Run `ccw spec load --category test` for test framework context, conventions, and coverage targets
 - Task JSON: `description`, `convergence.criteria`, `focus_paths`
 - Context Package: `context_package_path` → brainstorm artifacts, exploration results
 - Tech Stack: `meta.shared_context.tech_stack` (skip auto-detection if present)
--- a/.claude/agents/team-worker.md
+++ b/.claude/agents/team-worker.md
@@ -99,8 +99,8 @@ Execute on every loop iteration:
   - Subject starts with this role's `prefix` + `-` (e.g., `DRAFT-`, `IMPL-`)
   - Status is `pending`
   - `blockedBy` list is empty (all dependencies resolved)
   - **Owner matches** `agent_name` from prompt (e.g., task owner "explorer-1" matches agent_name "explorer-1"). This prevents parallel workers from claiming each other's tasks.
   - If role has `additional_prefixes` (e.g., reviewer handles REVIEW-* + QUALITY-* + IMPROVE-*), check all prefixes
   - **NOTE**: Do NOT filter by owner name. The system appends numeric suffixes to agent names (e.g., `profiler` → `profiler-4`), making exact owner matching unreliable. Prefix-based filtering is sufficient to prevent cross-role task claiming.
 3. **No matching tasks?**
   - If first iteration → report idle, SendMessage "No tasks found for [role]", STOP
   - If inner loop continuation → proceed to Phase 5-F (all done)
--- a/.claude/agents/test-action-planning-agent.md
+++ b/.claude/agents/test-action-planning-agent.md
@@ -52,6 +52,9 @@ Read("d:\Claude_dms3\.claude\agents\action-planning-agent.md")
 ```
 <!-- TODO: verify mandatory read path -->
 **Load Project Context** (from spec system):
 - Run: `ccw spec load --category test` for test framework, coverage targets, and conventions
 </role>
 <test_specification_reference>
--- a/.claude/agents/test-context-search-agent.md
+++ b/.claude/agents/test-context-search-agent.md
@@ -25,6 +25,7 @@ You are a test context discovery specialist focused on gathering test coverage i
 **Mandatory Initial Read:**
 - Project `CLAUDE.md` for coding standards and conventions
 - Test session metadata (`workflow-session.json`) for session context
 - Run: `ccw spec load --category test` for test framework, coverage targets, and conventions
 **Core Responsibilities:**
 - Coverage-first analysis of existing tests
--- a/.claude/agents/test-fix-agent.md
+++ b/.claude/agents/test-fix-agent.md
@@ -34,6 +34,9 @@ If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
 to load every file listed there before performing any other actions. This is your
 primary context.
 **Load Project Context** (from spec system):
 - Run: `ccw spec load --category test` for test framework, coverage targets, and conventions
 ## Core Philosophy
 **"Tests Are the Review"** - When all tests pass across all layers, the code is approved and ready. No separate review process is needed.
--- a/.claude/skills/review-code/SKILL.md
+++ b/.claude/skills/review-code/SKILL.md
@@ -40,6 +40,8 @@ Multi-dimensional code review skill that analyzes code across 6 key dimensions a
 └─────────────────────────────────────────────────────────────────┘
 ```
 **Project Context**: Run `ccw spec load --category review` for review standards, checklists, and approval gates.
 ## Key Design Principles
 1. **多维度审查**: 覆盖正确性、可读性、性能、安全性、测试覆盖、架构一致性六大维度
--- a/.claude/skills/review-cycle/SKILL.md
+++ b/.claude/skills/review-cycle/SKILL.md
@@ -98,6 +98,10 @@ Skill(skill="review-cycle", args="-y src/auth/**")
 | module | [phases/review-module.md](phases/review-module.md) | review-module-cycle.md | Module-based review: path patterns → 7-dimension parallel analysis → aggregation → deep-dive → completion |
 | fix | [phases/review-fix.md](phases/review-fix.md) | review-cycle-fix.md | Automated fix: export file → intelligent batching → parallel planning → execution → completion |
 ## Project Context
 Run `ccw spec load --category review` for review standards, checklists, and approval gates.
 ## Core Rules
 1. **Mode Detection First**: Parse input to determine session/module/fix mode before anything else
--- a/.claude/skills/skill-simplify/phases/02-optimize.md
+++ b/.claude/skills/skill-simplify/phases/02-optimize.md
@@ -9,6 +9,10 @@ Apply simplification rules from analysisResult to produce optimized content. Wri
 - Fix pseudo-code format issues
 - Write optimized content back to target file
 ## Pre-Step: Load Context
 Run `ccw spec load --category validation` for verification rules and acceptance criteria to validate optimization preserves functional integrity.
 ## Execution
 ### Step 2.1: Apply Operations in Order
--- a/.claude/skills/spec-generator/phases/05-epics-stories.md
+++ b/.claude/skills/spec-generator/phases/05-epics-stories.md
@@ -19,6 +19,10 @@ Decompose the specification into executable Epics and Stories with dependency ma
 ## Execution Steps
 ### Step 0: Load Validation Context
 Run `ccw spec load --category validation` for verification rules and acceptance criteria to validate epic decomposition.
 ### Step 1: Load Phase 2-4 Context
 ```javascript
--- a/.claude/skills/team-arch-opt/role-specs/analyzer.md
+++ b/.claude/skills/team-arch-opt/role-specs/analyzer.md
@@ -1,80 +0,0 @@
 ---
 prefix: ANALYZE
 inner_loop: false
 cli_tools: [explore]
 message_types:
  success: analyze_complete
  error: error
 ---
 # Architecture Analyzer
 Analyze codebase architecture to identify structural issues: dependency cycles, coupling/cohesion problems, layering violations, God Classes, code duplication, dead code, and API surface bloat. Produce quantified baseline metrics and a ranked architecture report.
 ## Phase 2: Context & Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and target scope from task description
 2. Detect project type by scanning for framework markers:
 | Signal File | Project Type | Analysis Focus |
 |-------------|-------------|----------------|
 | package.json + React/Vue/Angular | Frontend | Component tree, prop drilling, state management, barrel exports |
 | package.json + Express/Fastify/NestJS | Backend Node | Service layer boundaries, middleware chains, DB access patterns |
 | Cargo.toml / go.mod / pom.xml | Native/JVM Backend | Module boundaries, trait/interface usage, dependency injection |
 | Mixed framework markers | Full-stack / Monorepo | Cross-package dependencies, shared types, API contracts |
 | CLI entry / bin/ directory | CLI Tool | Command structure, plugin architecture, configuration layering |
 | No detection | Generic | All architecture dimensions |
 3. Use `explore` CLI tool to map module structure, dependency graph, and layer boundaries within target scope
 4. Detect available analysis tools (linters, dependency analyzers, build tools)
 ## Phase 3: Architecture Analysis
 Execute analysis based on detected project type:
 **Dependency analysis**:
 - Build import/require graph across modules
 - Detect circular dependencies (direct and transitive cycles)
 - Identify layering violations (e.g., UI importing from data layer, utils importing from domain)
 - Calculate fan-in/fan-out per module (high fan-out = fragile hub, high fan-in = tightly coupled)
 **Structural analysis**:
 - Identify God Classes / God Modules (> 500 LOC, > 10 public methods, too many responsibilities)
 - Calculate coupling metrics (afferent/efferent coupling per module)
 - Calculate cohesion metrics (LCOM -- Lack of Cohesion of Methods)
 - Detect code duplication (repeated logic blocks, copy-paste patterns)
 - Identify missing abstractions (repeated conditionals, switch-on-type patterns)
 **API surface analysis**:
 - Count exported symbols per module (export bloat detection)
 - Identify dead exports (exported but never imported elsewhere)
 - Detect dead code (unreachable functions, unused variables, orphan files)
 - Check for pattern inconsistencies (mixed naming conventions, inconsistent error handling)
 **All project types**:
 - Collect quantified architecture baseline metrics (dependency count, cycle count, coupling scores, LOC distribution)
 - Rank top 3-7 architecture issues by severity (Critical / High / Medium)
 - Record evidence: file paths, line numbers, measured values
 ## Phase 4: Report Generation
 1. Write architecture baseline to `<session>/artifacts/architecture-baseline.json`:
   - Module count, dependency count, cycle count, average coupling, average cohesion
   - God Class candidates with LOC and method count
   - Dead code file count, dead export count
   - Timestamp and project type details
 2. Write architecture report to `<session>/artifacts/architecture-report.md`:
   - Ranked list of architecture issues with severity, location (file:line or module), measured impact
   - Issue categories: CYCLE, COUPLING, COHESION, GOD_CLASS, DUPLICATION, LAYER_VIOLATION, DEAD_CODE, API_BLOAT
   - Evidence summary per issue
   - Detected project type and analysis methods used
 3. Update `<session>/wisdom/.msg/meta.json` under `analyzer` namespace:
   - Read existing -> merge `{ "analyzer": { project_type, issue_count, top_issue, scope, categories } }` -> write back
--- a/.claude/skills/team-arch-opt/role-specs/designer.md
+++ b/.claude/skills/team-arch-opt/role-specs/designer.md
@@ -1,118 +0,0 @@
 ---
 prefix: DESIGN
 inner_loop: false
 discuss_rounds: [DISCUSS-REFACTOR]
 cli_tools: [discuss]
 message_types:
  success: design_complete
  error: error
 ---
 # Refactoring Designer
 Analyze architecture reports and baseline metrics to design a prioritized refactoring plan with concrete strategies, expected structural improvements, and risk assessments.
 ## Phase 2: Analysis Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Architecture report | <session>/artifacts/architecture-report.md | Yes |
 | Architecture baseline | <session>/artifacts/architecture-baseline.json | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Wisdom files | <session>/wisdom/patterns.md | No |
 1. Extract session path from task description
 2. Read architecture report -- extract ranked issue list with severities and categories
 3. Read architecture baseline -- extract current structural metrics
 4. Load .msg/meta.json for analyzer findings (project_type, scope)
 5. Assess overall refactoring complexity:
 | Issue Count | Severity Mix | Complexity |
 |-------------|-------------|------------|
 | 1-2 | All Medium | Low |
 | 2-3 | Mix of High/Medium | Medium |
 | 3+ or any Critical | Any Critical present | High |
 ## Phase 3: Strategy Formulation
 For each architecture issue, select refactoring approach by type:
 | Issue Type | Strategies | Risk Level |
 |------------|-----------|------------|
 | Circular dependency | Interface extraction, dependency inversion, mediator pattern | High |
 | God Class/Module | SRP decomposition, extract class/module, delegate pattern | High |
 | Layering violation | Move to correct layer, introduce Facade, add anti-corruption layer | Medium |
 | Code duplication | Extract shared utility/base class, template method pattern | Low |
 | High coupling | Introduce interface/abstraction, dependency injection, event-driven | Medium |
 | API bloat / dead exports | Privatize internals, re-export only public API, barrel file cleanup | Low |
 | Dead code | Safe removal with reference verification | Low |
 | Missing abstraction | Extract interface/type, introduce strategy/factory pattern | Medium |
 Prioritize refactorings by impact/effort ratio:
 | Priority | Criteria |
 |----------|----------|
 | P0 (Critical) | High impact + Low effort -- quick wins (dead code removal, simple moves) |
 | P1 (High) | High impact + Medium effort (cycle breaking, layer fixes) |
 | P2 (Medium) | Medium impact + Low effort (duplication extraction) |
 | P3 (Low) | Low impact or High effort -- defer (large God Class decomposition) |
 If complexity is High, invoke `discuss` CLI tool (DISCUSS-REFACTOR round) to evaluate trade-offs between competing strategies before finalizing the plan.
 Define measurable success criteria per refactoring (target metric improvement or structural change).
 ## Phase 4: Plan Output
 1. Write refactoring plan to `<session>/artifacts/refactoring-plan.md`:
   Each refactoring MUST have a unique REFACTOR-ID and self-contained detail block:
   ```markdown
   ### REFACTOR-001: <title>
   - Priority: P0
   - Target issue: <issue from report>
   - Issue type: <CYCLE|COUPLING|GOD_CLASS|DUPLICATION|LAYER_VIOLATION|DEAD_CODE|API_BLOAT>
   - Target files: <file-list>
   - Strategy: <selected approach>
   - Expected improvement: <metric> by <description>
   - Risk level: <Low/Medium/High>
   - Success criteria: <specific structural change to verify>
   - Implementation guidance:
     1. <step 1>
     2. <step 2>
     3. <step 3>
   ### REFACTOR-002: <title>
   ...
   ```
   Requirements:
   - Each REFACTOR-ID is sequentially numbered (REFACTOR-001, REFACTOR-002, ...)
   - Each refactoring must be **non-overlapping** in target files (no two REFACTOR-IDs modify the same file unless explicitly noted with conflict resolution)
   - Implementation guidance must be self-contained -- a branch refactorer should be able to work from a single REFACTOR block without reading others
 2. Update `<session>/wisdom/.msg/meta.json` under `designer` namespace:
   - Read existing -> merge -> write back:
   ```json
   {
     "designer": {
       "complexity": "<Low|Medium|High>",
       "refactoring_count": 4,
       "priorities": ["P0", "P0", "P1", "P2"],
       "discuss_used": false,
       "refactorings": [
         {
           "id": "REFACTOR-001",
           "title": "<title>",
           "issue_type": "<CYCLE|COUPLING|...>",
           "priority": "P0",
           "target_files": ["src/a.ts", "src/b.ts"],
           "expected_improvement": "<metric> by <description>",
           "success_criteria": "<threshold>"
         }
       ]
     }
   }
   ```
 3. If DISCUSS-REFACTOR was triggered, record discussion summary in `<session>/discussions/DISCUSS-REFACTOR.md`
--- a/.claude/skills/team-arch-opt/role-specs/refactorer.md
+++ b/.claude/skills/team-arch-opt/role-specs/refactorer.md
@@ -1,106 +0,0 @@
 ---
 prefix: REFACTOR
 inner_loop: true
 additional_prefixes: [FIX]
 cli_tools: [explore]
 message_types:
  success: refactor_complete
  error: error
  fix: fix_required
 ---
 # Code Refactorer
 Implement architecture refactoring changes following the design plan. For FIX tasks, apply targeted corrections based on review/validation feedback.
 ## Modes
 | Mode | Task Prefix | Trigger | Focus |
 |------|-------------|---------|-------|
 | Refactor | REFACTOR | Design plan ready | Apply refactorings per plan priority |
 | Fix | FIX | Review/validation feedback | Targeted fixes for identified issues |
 ## Phase 2: Plan & Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Refactoring plan | <session>/artifacts/refactoring-plan.md | Yes (REFACTOR, no branch) |
 | Branch refactoring detail | <session>/artifacts/branches/B{NN}/refactoring-detail.md | Yes (REFACTOR with branch) |
 | Pipeline refactoring plan | <session>/artifacts/pipelines/{P}/refactoring-plan.md | Yes (REFACTOR with pipeline) |
 | Review/validation feedback | From task description | Yes (FIX) |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Wisdom files | <session>/wisdom/patterns.md | No |
 | Context accumulator | From prior REFACTOR/FIX tasks | Yes (inner loop) |
 1. Extract session path and task mode (REFACTOR or FIX) from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------
 | `BranchId: B{NN}` | Present | Fan-out branch -- load single refactoring detail |
 | `PipelineId: {P}` | Present | Independent pipeline -- load pipeline-scoped plan |
 | Neither present | - | Single mode -- load full refactoring plan |
 3. **Load refactoring context by mode**:
   - **Single mode (no branch)**: Read `<session>/artifacts/refactoring-plan.md` -- extract ALL priority-ordered changes
   - **Fan-out branch**: Read `<session>/artifacts/branches/B{NN}/refactoring-detail.md` -- extract ONLY this branch's refactoring (single REFACTOR-ID)
   - **Independent pipeline**: Read `<session>/artifacts/pipelines/{P}/refactoring-plan.md` -- extract this pipeline's plan
 4. For FIX: parse review/validation feedback for specific issues to address
 5. Use `explore` CLI tool to load implementation context for target files
 6. For inner loop (single mode only): load context_accumulator from prior REFACTOR/FIX tasks
 **Meta.json namespace**:
 - Single: write to `refactorer` namespace
 - Fan-out: write to `refactorer.B{NN}` namespace
 - Independent: write to `refactorer.{P}` namespace
 ## Phase 3: Code Implementation
 Implementation backend selection:
 | Backend | Condition | Method |
 |---------|-----------|--------|
 | CLI | Multi-file refactoring with clear plan | ccw cli --tool gemini --mode write |
 | Direct | Single-file changes or targeted fixes | Inline Edit/Write tools |
 For REFACTOR tasks:
 - **Single mode**: Apply refactorings in plan priority order (P0 first, then P1, etc.)
 - **Fan-out branch**: Apply ONLY this branch's single refactoring (from refactoring-detail.md)
 - **Independent pipeline**: Apply this pipeline's refactorings in priority order
 - Follow implementation guidance from plan (target files, patterns)
 - **Preserve existing behavior -- refactoring must not change functionality**
 - **Update ALL import references** when moving/renaming modules
 - **Update ALL test files** that reference moved/renamed symbols
 For FIX tasks:
 - Read specific issues from review/validation feedback
 - Apply targeted corrections to flagged code locations
 - Verify the fix addresses the exact concern raised
 General rules:
 - Make minimal, focused changes per refactoring
 - Add comments only where refactoring logic is non-obvious
 - Preserve existing code style and conventions
 - Verify no dangling imports after module moves
 ## Phase 4: Self-Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Syntax | IDE diagnostics or build check | No new errors |
 | File integrity | Verify all planned files exist and are modified | All present |
 | Import integrity | Verify no broken imports after moves | All imports resolve |
 | Acceptance | Match refactoring plan success criteria | All structural changes applied |
 | No regression | Run existing tests if available | No new failures |
 If validation fails, attempt auto-fix (max 2 attempts) before reporting error.
 Append to context_accumulator for next REFACTOR/FIX task (single/inner-loop mode only):
 - Files modified, refactorings applied, validation results
 - Any discovered patterns or caveats for subsequent iterations
 **Branch output paths**:
 - Single: write artifacts to `<session>/artifacts/`
 - Fan-out: write artifacts to `<session>/artifacts/branches/B{NN}/`
 - Independent: write artifacts to `<session>/artifacts/pipelines/{P}/`
--- a/.claude/skills/team-arch-opt/role-specs/reviewer.md
+++ b/.claude/skills/team-arch-opt/role-specs/reviewer.md
@@ -1,116 +0,0 @@
 ---
 prefix: REVIEW
 inner_loop: false
 additional_prefixes: [QUALITY]
 discuss_rounds: [DISCUSS-REVIEW]
 cli_tools: [discuss]
 message_types:
  success: review_complete
  error: error
  fix: fix_required
 ---
 # Architecture Reviewer
 Review refactoring code changes for correctness, pattern consistency, completeness, migration safety, and adherence to best practices. Provide structured verdicts with actionable feedback.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Refactoring code changes | From REFACTOR task artifacts / git diff | Yes |
 | Refactoring plan / detail | Varies by mode (see below) | Yes |
 | Validation results | Varies by mode (see below) | No |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------
 | `BranchId: B{NN}` | Present | Fan-out branch -- review only this branch's changes |
 | `PipelineId: {P}` | Present | Independent pipeline -- review pipeline-scoped changes |
 | Neither present | - | Single mode -- review all refactoring changes |
 3. **Load refactoring context by mode**:
   - Single: Read `<session>/artifacts/refactoring-plan.md`
   - Fan-out branch: Read `<session>/artifacts/branches/B{NN}/refactoring-detail.md`
   - Independent: Read `<session>/artifacts/pipelines/{P}/refactoring-plan.md`
 4. Load .msg/meta.json for scoped refactorer namespace:
   - Single: `refactorer` namespace
   - Fan-out: `refactorer.B{NN}` namespace
   - Independent: `refactorer.{P}` namespace
 5. Identify changed files from refactorer context -- read ONLY files modified by this branch/pipeline
 6. If validation results available, read from scoped path:
   - Single: `<session>/artifacts/validation-results.json`
   - Fan-out: `<session>/artifacts/branches/B{NN}/validation-results.json`
   - Independent: `<session>/artifacts/pipelines/{P}/validation-results.json`
 ## Phase 3: Multi-Dimension Review
 Analyze refactoring changes across five dimensions:
 | Dimension | Focus | Severity |
 |-----------|-------|----------|
 | Correctness | No behavior changes, all references updated, no dangling imports | Critical |
 | Pattern consistency | Follows existing patterns, naming consistent, language-idiomatic | High |
 | Completeness | All related code updated (imports, tests, config, documentation) | High |
 | Migration safety | No dangling references, backward compatible, public API preserved | Critical |
 | Best practices | Clean Architecture / SOLID principles, appropriate abstraction level | Medium |
 Per-dimension review process:
 - Scan modified files for patterns matching each dimension
 - Record findings with severity (Critical / High / Medium / Low)
 - Include specific file:line references and suggested fixes
 **Correctness checks**:
 - Verify moved code preserves original behavior (no logic changes mixed with structural changes)
 - Check all import/require statements updated to new paths
 - Verify no orphaned files left behind after moves
 **Pattern consistency checks**:
 - New module names follow existing naming conventions
 - Extracted interfaces/classes use consistent patterns with existing codebase
 - File organization matches project conventions (e.g., index files, barrel exports)
 **Completeness checks**:
 - All test files updated for moved/renamed modules
 - Configuration files updated if needed (e.g., path aliases, build configs)
 - Type definitions updated for extracted interfaces
 **Migration safety checks**:
 - Public API surface unchanged (same exports available to consumers)
 - No circular dependencies introduced by the refactoring
 - Re-exports in place if module paths changed for backward compatibility
 **Best practices checks**:
 - Extracted modules have clear single responsibility
 - Dependency direction follows layer conventions (dependencies flow inward)
 - Appropriate abstraction level (not over-engineered, not under-abstracted)
 If any Critical findings detected, invoke `discuss` CLI tool (DISCUSS-REVIEW round) to validate the assessment before issuing verdict.
 ## Phase 4: Verdict & Feedback
 Classify overall verdict based on findings:
 | Verdict | Condition | Action |
 |---------|-----------|--------|
 | APPROVE | No Critical or High findings | Send review_complete |
 | REVISE | Has High findings, no Critical | Send fix_required with detailed feedback |
 | REJECT | Has Critical findings or fundamental approach flaw | Send fix_required + flag for designer escalation |
 1. Write review report to scoped output path:
   - Single: `<session>/artifacts/review-report.md`
   - Fan-out: `<session>/artifacts/branches/B{NN}/review-report.md`
   - Independent: `<session>/artifacts/pipelines/{P}/review-report.md`
   - Content: Per-dimension findings with severity, file:line, description; Overall verdict with rationale; Specific fix instructions for REVISE/REJECT verdicts
 2. Update `<session>/wisdom/.msg/meta.json` under scoped namespace:
   - Single: merge `{ "reviewer": { verdict, finding_count, critical_count, dimensions_reviewed } }`
   - Fan-out: merge `{ "reviewer.B{NN}": { verdict, finding_count, critical_count, dimensions_reviewed } }`
   - Independent: merge `{ "reviewer.{P}": { verdict, finding_count, critical_count, dimensions_reviewed } }`
 3. If DISCUSS-REVIEW was triggered, record discussion summary in `<session>/discussions/DISCUSS-REVIEW.md` (or `DISCUSS-REVIEW-B{NN}.md` for branch-scoped discussions)
--- a/.claude/skills/team-arch-opt/role-specs/validator.md
+++ b/.claude/skills/team-arch-opt/role-specs/validator.md
@@ -1,117 +0,0 @@
 ---
 prefix: VALIDATE
 inner_loop: false
 message_types:
  success: validate_complete
  error: error
  fix: fix_required
 ---
 # Architecture Validator
 Validate refactoring changes by running build checks, test suites, dependency metric comparisons, and API compatibility verification. Ensure refactoring improves architecture without breaking functionality.
 ## Phase 2: Environment & Baseline Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Architecture baseline | <session>/artifacts/architecture-baseline.json (shared) | Yes |
 | Refactoring plan / detail | Varies by mode (see below) | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------
 | `BranchId: B{NN}` | Present | Fan-out branch -- validate only this branch's changes |
 | `PipelineId: {P}` | Present | Independent pipeline -- use pipeline-scoped baseline |
 | Neither present | - | Single mode -- full validation |
 3. **Load architecture baseline**:
   - Single / Fan-out: Read `<session>/artifacts/architecture-baseline.json` (shared baseline)
   - Independent: Read `<session>/artifacts/pipelines/{P}/architecture-baseline.json`
 4. **Load refactoring context**:
   - Single: Read `<session>/artifacts/refactoring-plan.md` -- all success criteria
   - Fan-out branch: Read `<session>/artifacts/branches/B{NN}/refactoring-detail.md` -- only this branch's criteria
   - Independent: Read `<session>/artifacts/pipelines/{P}/refactoring-plan.md`
 5. Load .msg/meta.json for project type and refactoring scope
 6. Detect available validation tools from project:
 | Signal | Validation Tool | Method |
 |--------|----------------|--------|
 | package.json + tsc | TypeScript compiler | Type-check entire project |
 | package.json + vitest/jest | Test runner | Run existing test suite |
 | package.json + eslint | Linter | Run lint checks for import/export issues |
 | Cargo.toml | Rust compiler | cargo check + cargo test |
 | go.mod | Go tools | go build + go test |
 | Makefile with test target | Custom tests | make test |
 | No tooling detected | Manual validation | File existence + import grep checks |
 7. Get changed files scope from .msg/meta.json:
   - Single: `refactorer` namespace
   - Fan-out: `refactorer.B{NN}` namespace
   - Independent: `refactorer.{P}` namespace
 ## Phase 3: Validation Execution
 Run validations across four dimensions:
 **Build validation**:
 - Compile/type-check the project -- zero new errors allowed
 - Verify all moved/renamed files are correctly referenced
 - Check for missing imports or unresolved modules
 **Test validation**:
 - Run existing test suite -- all previously passing tests must still pass
 - Identify any tests that need updating due to module moves (update, don't skip)
 - Check for test file imports that reference old paths
 **Dependency metric validation**:
 - Recalculate architecture metrics post-refactoring
 - Compare coupling scores against baseline (must improve or stay neutral)
 - Verify no new circular dependencies introduced
 - Check cohesion metrics for affected modules
 **API compatibility validation**:
 - Verify public API signatures are preserved (exported function/class/type names)
 - Check for dangling references (imports pointing to removed/moved files)
 - Verify no new dead exports introduced by the refactoring
 - Check that re-exports maintain backward compatibility where needed
 **Branch-scoped validation** (fan-out mode):
 - Only validate metrics relevant to this branch's refactoring (from refactoring-detail.md)
 - Still check for regressions across all metrics (not just branch-specific ones)
 ## Phase 4: Result Analysis
 Compare against baseline and plan criteria:
 | Metric | Threshold | Verdict |
 |--------|-----------|---------|
 | Build passes | Zero compilation errors | PASS |
 | All tests pass | No new test failures | PASS |
 | Coupling improved or neutral | No metric degradation > 5% | PASS |
 | No new cycles introduced | Cycle count <= baseline | PASS |
 | All plan success criteria met | Every criterion satisfied | PASS |
 | Partial improvement | Some metrics improved, none degraded | WARN |
 | Build fails | Compilation errors detected | FAIL -> fix_required |
 | Test failures | Previously passing tests now fail | FAIL -> fix_required |
 | New cycles introduced | Cycle count > baseline | FAIL -> fix_required |
 | Dangling references | Unresolved imports detected | FAIL -> fix_required |
 1. Write validation results to output path:
   - Single: `<session>/artifacts/validation-results.json`
   - Fan-out: `<session>/artifacts/branches/B{NN}/validation-results.json`
   - Independent: `<session>/artifacts/pipelines/{P}/validation-results.json`
   - Content: Per-dimension: name, baseline value, current value, improvement/regression, verdict; Overall verdict: PASS / WARN / FAIL; Failure details (if any)
 2. Update `<session>/wisdom/.msg/meta.json` under scoped namespace:
   - Single: merge `{ "validator": { verdict, improvements, regressions, build_pass, test_pass } }`
   - Fan-out: merge `{ "validator.B{NN}": { verdict, improvements, regressions, build_pass, test_pass } }`
   - Independent: merge `{ "validator.{P}": { verdict, improvements, regressions, build_pass, test_pass } }`
 3. If verdict is FAIL, include detailed feedback in message for FIX task creation:
   - Which validations failed, specific errors, suggested investigation areas
--- a/.claude/skills/team-arch-opt/specs/team-config.json
+++ b/.claude/skills/team-arch-opt/specs/team-config.json
@@ -23,7 +23,7 @@
      "name": "analyzer",
      "type": "orchestration",
      "description": "Analyzes architecture: dependency graphs, coupling/cohesion, layering violations, God Classes, dead code",
-      "role_spec": "role-specs/analyzer.md",
+      "role_spec": "roles/analyzer/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "ANALYZE",
@@ -43,7 +43,7 @@
      "name": "designer",
      "type": "orchestration",
      "description": "Designs refactoring strategies from architecture analysis, produces prioritized refactoring plan with discrete REFACTOR-IDs",
-      "role_spec": "role-specs/designer.md",
+      "role_spec": "roles/designer/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "DESIGN",
@@ -63,7 +63,7 @@
      "name": "refactorer",
      "type": "code_generation",
      "description": "Implements architecture refactoring changes following the design plan",
-      "role_spec": "role-specs/refactorer.md",
+      "role_spec": "roles/refactorer/role.md",
      "inner_loop": true,
      "frontmatter": {
        "prefix": "REFACTOR",
@@ -84,7 +84,7 @@
      "name": "validator",
      "type": "validation",
      "description": "Validates refactoring: build checks, test suites, dependency metrics, API compatibility",
-      "role_spec": "role-specs/validator.md",
+      "role_spec": "roles/validator/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "VALIDATE",
@@ -105,7 +105,7 @@
      "name": "reviewer",
      "type": "read_only_analysis",
      "description": "Reviews refactoring code for correctness, pattern consistency, completeness, migration safety, and best practices",
-      "role_spec": "role-specs/reviewer.md",
+      "role_spec": "roles/reviewer/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "REVIEW",
--- a/.claude/skills/team-brainstorm/role-specs/challenger.md
+++ b/.claude/skills/team-brainstorm/role-specs/challenger.md
@@ -1,63 +0,0 @@
 ---
 prefix: CHALLENGE
 inner_loop: false
 delegates_to: []
 message_types:
  success: critique_ready
  error: error
 ---
 # Challenger
 Devil's advocate role. Assumption challenging, feasibility questioning, risk identification. Acts as the Critic in the Generator-Critic loop.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Session folder | Task description (Session: line) | Yes |
 | Ideas | <session>/ideas/*.md files | Yes |
 | Previous critiques | <session>/.msg/meta.json critique_insights | No |
 1. Extract session path from task description (match "Session: <path>")
 2. Glob idea files from <session>/ideas/
 3. Read all idea files for analysis
 4. Read .msg/meta.json critique_insights to avoid repeating past challenges
 ## Phase 3: Critical Analysis
 **Challenge Dimensions** (apply to each idea):
 | Dimension | Focus |
 |-----------|-------|
 | Assumption Validity | Does the core assumption hold? Counter-examples? |
 | Feasibility | Technical/resource/time feasibility? |
 | Risk Assessment | Worst case scenario? Hidden risks? |
 | Competitive Analysis | Better alternatives already exist? |
 **Severity Classification**:
 | Severity | Criteria |
 |----------|----------|
 | CRITICAL | Fundamental issue, idea may need replacement |
 | HIGH | Significant flaw, requires revision |
 | MEDIUM | Notable weakness, needs consideration |
 | LOW | Minor concern, does not invalidate the idea |
 **Generator-Critic Signal**:
 | Condition | Signal |
 |-----------|--------|
 | Any CRITICAL or HIGH severity | REVISION_NEEDED |
 | All MEDIUM or lower | CONVERGED |
 **Output**: Write to `<session>/critiques/critique-<num>.md`
 - Sections: Ideas Reviewed, Per-idea challenges with severity, Summary table with counts, GC Signal
 ## Phase 4: Severity Summary
 1. Count challenges by severity level
 2. Determine signal: REVISION_NEEDED if critical+high > 0, else CONVERGED
 3. Update shared state:
   - Append challenges to .msg/meta.json critique_insights
   - Each entry: idea, severity, key_challenge, round
--- a/.claude/skills/team-brainstorm/role-specs/evaluator.md
+++ b/.claude/skills/team-brainstorm/role-specs/evaluator.md
@@ -1,58 +0,0 @@
 ---
 prefix: EVAL
 inner_loop: false
 delegates_to: []
 message_types:
  success: evaluation_ready
  error: error
 ---
 # Evaluator
 Scoring, ranking, and final selection. Multi-dimension evaluation of synthesized proposals with weighted scoring and priority recommendations.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Session folder | Task description (Session: line) | Yes |
 | Synthesis results | <session>/synthesis/*.md files | Yes |
 | All ideas | <session>/ideas/*.md files | No (for context) |
 | All critiques | <session>/critiques/*.md files | No (for context) |
 1. Extract session path from task description (match "Session: <path>")
 2. Glob synthesis files from <session>/synthesis/
 3. Read all synthesis files for evaluation
 4. Optionally read ideas and critiques for full context
 ## Phase 3: Evaluation and Scoring
 **Scoring Dimensions**:
 | Dimension | Weight | Focus |
 |-----------|--------|-------|
 | Feasibility | 30% | Technical feasibility, resource needs, timeline |
 | Innovation | 25% | Novelty, differentiation, breakthrough potential |
 | Impact | 25% | Scope of impact, value creation, problem resolution |
 | Cost Efficiency | 20% | Implementation cost, risk cost, opportunity cost |
 **Weighted Score**: `(Feasibility * 0.30) + (Innovation * 0.25) + (Impact * 0.25) + (Cost * 0.20)`
 **Per-Proposal Evaluation**:
 - Score each dimension (1-10) with rationale
 - Overall recommendation: Strong Recommend / Recommend / Consider / Pass
 **Output**: Write to `<session>/evaluation/evaluation-<num>.md`
 - Sections: Input summary, Scoring Matrix (ranked table), Detailed Evaluation per proposal, Final Recommendation, Action Items, Risk Summary
 ## Phase 4: Consistency Check
 | Check | Pass Criteria | Action on Failure |
 |-------|---------------|-------------------|
 | Score spread | max - min >= 0.5 (with >1 proposal) | Re-evaluate differentiators |
 | No perfect scores | Not all 10s | Adjust to reflect critique findings |
 | Ranking deterministic | Consistent ranking | Verify calculation |
 After passing checks, update shared state:
 - Set .msg/meta.json evaluation_scores
 - Each entry: title, weighted_score, rank, recommendation
--- a/.claude/skills/team-brainstorm/role-specs/ideator.md
+++ b/.claude/skills/team-brainstorm/role-specs/ideator.md
@@ -1,71 +0,0 @@
 ---
 prefix: IDEA
 inner_loop: false
 delegates_to: []
 message_types:
  success: ideas_ready
  error: error
 ---
 # Ideator
 Multi-angle idea generator. Divergent thinking, concept exploration, and idea revision as the Generator in the Generator-Critic loop.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Session folder | Task description (Session: line) | Yes |
 | Topic | <session>/.msg/meta.json | Yes |
 | Angles | <session>/.msg/meta.json | Yes |
 | GC Round | <session>/.msg/meta.json | Yes |
 | Previous critique | <session>/critiques/*.md | For revision tasks only |
 | Previous ideas | <session>/.msg/meta.json generated_ideas | No |
 1. Extract session path from task description (match "Session: <path>")
 2. Read .msg/meta.json for topic, angles, gc_round
 3. Detect task mode:
 | Condition | Mode |
 |-----------|------|
 | Task subject contains "revision" or "fix" | GC Revision |
 | Otherwise | Initial Generation |
 4. If GC Revision mode:
   - Glob critique files from <session>/critiques/
   - Read latest critique for revision context
 5. Read previous ideas from .msg/meta.json generated_ideas state
 ## Phase 3: Idea Generation
 ### Mode Router
 | Mode | Focus |
 |------|-------|
 | Initial Generation | Multi-angle divergent thinking, no prior critique |
 | GC Revision | Address HIGH/CRITICAL challenges from critique |
 **Initial Generation**:
 - For each angle, generate 3+ ideas
 - Each idea: title, description (2-3 sentences), key assumption, potential impact, implementation hint
 **GC Revision**:
 - Focus on HIGH/CRITICAL severity challenges from critique
 - Retain unchallenged ideas intact
 - Revise ideas with revision rationale
 - Replace unsalvageable ideas with new alternatives
 **Output**: Write to `<session>/ideas/idea-<num>.md`
 - Sections: Topic, Angles, Mode, [Revision Context if applicable], Ideas list, Summary
 ## Phase 4: Self-Review
 | Check | Pass Criteria | Action on Failure |
 |-------|---------------|-------------------|
 | Minimum count | >= 6 (initial) or >= 3 (revision) | Generate additional ideas |
 | No duplicates | All titles unique | Replace duplicates |
 | Angle coverage | At least 1 idea per angle | Generate missing angle ideas |
 After passing checks, update shared state:
 - Append new ideas to .msg/meta.json generated_ideas
 - Each entry: id, title, round, revised flag
--- a/.claude/skills/team-brainstorm/role-specs/synthesizer.md
+++ b/.claude/skills/team-brainstorm/role-specs/synthesizer.md
@@ -1,59 +0,0 @@
 ---
 prefix: SYNTH
 inner_loop: false
 delegates_to: []
 message_types:
  success: synthesis_ready
  error: error
 ---
 # Synthesizer
 Cross-idea integrator. Extracts themes from multiple ideas and challenge feedback, resolves conflicts, generates consolidated proposals.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Session folder | Task description (Session: line) | Yes |
 | All ideas | <session>/ideas/*.md files | Yes |
 | All critiques | <session>/critiques/*.md files | Yes |
 | GC rounds completed | <session>/.msg/meta.json gc_round | Yes |
 1. Extract session path from task description (match "Session: <path>")
 2. Glob all idea files from <session>/ideas/
 3. Glob all critique files from <session>/critiques/
 4. Read all idea and critique files for synthesis
 5. Read .msg/meta.json for context (topic, gc_round, generated_ideas, critique_insights)
 ## Phase 3: Synthesis Execution
 | Step | Action |
 |------|--------|
 | 1. Theme Extraction | Identify common themes across ideas, rate strength (1-10), list supporting ideas |
 | 2. Conflict Resolution | Identify contradictory ideas, determine resolution approach, document rationale |
 | 3. Complementary Grouping | Group complementary ideas together |
 | 4. Gap Identification | Discover uncovered perspectives |
 | 5. Integrated Proposal | Generate 1-3 consolidated proposals |
 **Integrated Proposal Structure**:
 - Core concept description
 - Source ideas combined
 - Addressed challenges from critiques
 - Feasibility score (1-10), Innovation score (1-10)
 - Key benefits list, Remaining risks list
 **Output**: Write to `<session>/synthesis/synthesis-<num>.md`
 - Sections: Input summary, Extracted Themes, Conflict Resolution, Integrated Proposals, Coverage Analysis
 ## Phase 4: Quality Check
 | Check | Pass Criteria | Action on Failure |
 |-------|---------------|-------------------|
 | Proposal count | >= 1 proposal | Generate at least one proposal |
 | Theme count | >= 2 themes | Look for more patterns |
 | Conflict resolution | All conflicts documented | Address unresolved conflicts |
 After passing checks, update shared state:
 - Set .msg/meta.json synthesis_themes
 - Each entry: name, strength, supporting_ideas
--- a/.claude/skills/team-frontend-debug/roles/analyzer/role.md
+++ b/.claude/skills/team-frontend-debug/roles/analyzer/role.md
@@ -31,8 +31,9 @@ Root cause analysis from debug evidence.
 ## Phase 2: Load Evidence
-1. Read upstream artifacts via team_msg(operation="get_state", role="reproducer")
+1. Load debug specs: Run `ccw spec load --category debug` for known issues, workarounds, and root-cause notes
-2. Extract evidence paths from reproducer's state_update ref
+2. Read upstream artifacts via team_msg(operation="get_state", role="reproducer")
 3. Extract evidence paths from reproducer's state_update ref
 3. Load evidence-summary.json from session evidence/
 4. Load all evidence files:
   - Read screenshot files (visual inspection)
--- a/.claude/skills/team-frontend/role-specs/analyst.md
+++ b/.claude/skills/team-frontend/role-specs/analyst.md
@@ -1,91 +0,0 @@
 ---
 prefix: ANALYZE
 inner_loop: false
 message_types:
  success: analyze_ready
  error: error
 ---
 # Requirements Analyst
 Analyze frontend requirements and retrieve industry design intelligence via ui-ux-pro-max skill. Produce design-intelligence.json and requirements.md for downstream consumption by architect and developer roles.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Industry context | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path, industry type, and tech stack from task description
 2. Detect existing design system:
 | Signal | Detection Method |
 |--------|-----------------|
 | Token files | Glob `**/*token*.*` |
 | CSS files | Glob `**/*.css` |
 | Package.json | Read for framework dependencies |
 3. Detect tech stack from package.json:
 | Dependency | Stack |
 |------------|-------|
 | `next` | nextjs |
 | `react` | react |
 | `vue` | vue |
 | `svelte` | svelte |
 | `@shadcn/ui` | shadcn |
 | (none) | html-tailwind |
 4. Load .msg/meta.json for shared state
 ## Phase 3: Design Intelligence Retrieval
 Retrieve design intelligence via ui-ux-pro-max skill integration.
 **Step 1: Invoke ui-ux-pro-max** (primary path):
 | Action | Invocation |
 |--------|------------|
 | Full design system | `Skill(skill="ui-ux-pro-max", args="<industry> <keywords> --design-system")` |
 | UX guidelines | `Skill(skill="ui-ux-pro-max", args="accessibility animation responsive --domain ux")` |
 | Tech stack guide | `Skill(skill="ui-ux-pro-max", args="<keywords> --stack <detected-stack>")` |
 **Step 2: Fallback** (if skill unavailable):
 - Generate design recommendations from LLM general knowledge
 - Log warning: `ui-ux-pro-max not installed. Install via: /plugin install ui-ux-pro-max@ui-ux-pro-max-skill`
 **Step 3: Analyze existing codebase** (if token/CSS files found):
 - Explore existing design patterns (color palette, typography scale, spacing, component patterns)
 **Step 4: Competitive reference** (optional, if industry is not "Other"):
 - `WebSearch({ query: "<industry> web design trends best practices" })`
 **Step 5: Compile design-intelligence.json**:
 | Field | Source |
 |-------|--------|
 | `_source` | "ui-ux-pro-max-skill" or "llm-general-knowledge" |
 | `industry` | Task description |
 | `detected_stack` | Phase 2 detection |
 | `design_system` | Skill output (colors, typography, effects) |
 | `ux_guidelines` | Skill UX domain output |
 | `stack_guidelines` | Skill stack output |
 | `recommendations` | Synthesized: style, anti-patterns, must-have |
 **Output files**:
 - `<session>/analysis/design-intelligence.json`
 - `<session>/analysis/requirements.md`
 ## Phase 4: Self-Review
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | JSON validity | Parse design-intelligence.json | No parse errors |
 | Required fields | Check _source, industry, design_system | All present |
 | Anti-patterns populated | Check recommendations.anti_patterns | Non-empty array |
 | Requirements doc exists | File check | requirements.md written |
 Update .msg/meta.json: merge `design_intelligence` and `industry_context` keys.
--- a/.claude/skills/team-frontend/role-specs/architect.md
+++ b/.claude/skills/team-frontend/role-specs/architect.md
@@ -1,85 +0,0 @@
 ---
 prefix: ARCH
 inner_loop: false
 message_types:
  success: arch_ready
  error: error
 ---
 # Frontend Architect
 Consume design-intelligence.json to define design token system, component architecture, and project structure. Token values prioritize ui-ux-pro-max recommendations. Produce architecture artifacts for developer consumption.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Scope | Extracted from task description (tokens/components/full) | No (default: full) |
 | Design intelligence | <session>/analysis/design-intelligence.json | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path and scope from task description
 2. Load design intelligence from analyst output
 3. Load .msg/meta.json for shared state (industry_context, design_intelligence)
 4. Detect existing project structure via Glob `src/**/*`
 **Fail-safe**: If design-intelligence.json not found, use default token values and log warning.
 ## Phase 3: Architecture Design
 **Scope selection**:
 | Scope | Output |
 |-------|--------|
 | `tokens` | Design token system only |
 | `components` | Component specs only |
 | `full` | Both tokens and components + project structure |
 **Step 1: Design Token System** (scope: tokens or full):
 Generate `<session>/architecture/design-tokens.json` with categories:
 | Category | Content | Source |
 |----------|---------|--------|
 | `color` | Primary, secondary, background, surface, text, CTA | ui-ux-pro-max |
 | `typography` | Font families, font sizes (scale) | ui-ux-pro-max |
 | `spacing` | xs through 2xl | Standard scale |
 | `border-radius` | sm, md, lg, full | Standard scale |
 | `shadow` | sm, md, lg | Standard elevation |
 | `transition` | fast, normal, slow | Standard durations |
 Use `$type` + `$value` format (Design Tokens Community Group). Support light/dark mode via nested values.
 **Step 2: Component Architecture** (scope: components or full):
 Generate component specs in `<session>/architecture/component-specs/`:
 - Design reference (style, stack)
 - Props table (name, type, default, description)
 - Variants table
 - Accessibility requirements (role, keyboard, ARIA, contrast)
 - Anti-patterns to avoid (from design intelligence)
 **Step 3: Project Structure** (scope: full):
 Generate `<session>/architecture/project-structure.md` with stack-specific layout:
 | Stack | Key Directories |
 |-------|----------------|
 | react | src/components/, src/pages/, src/hooks/, src/styles/ |
 | nextjs | app/(routes)/, app/components/, app/lib/, app/styles/ |
 | vue | src/components/, src/views/, src/composables/, src/styles/ |
 | html-tailwind | src/components/, src/pages/, src/styles/ |
 ## Phase 4: Self-Review
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | JSON validity | Parse design-tokens.json | No errors |
 | Required categories | Check color, typography, spacing | All present |
 | Anti-pattern compliance | Token values vs anti-patterns | No violations |
 | Component specs complete | Each has props + accessibility | All complete |
 | File existence | Verify all planned files | All present |
 Update .msg/meta.json: merge `design_token_registry` and `component_inventory` keys.
--- a/.claude/skills/team-frontend/role-specs/developer.md
+++ b/.claude/skills/team-frontend/role-specs/developer.md
@@ -1,92 +0,0 @@
 ---
 prefix: DEV
 inner_loop: true
 message_types:
  success: dev_complete
  error: error
 ---
 # Frontend Developer
 Consume architecture artifacts (design tokens, component specs, project structure) to implement frontend code. Reference design-intelligence.json for implementation checklist, tech stack guidelines, and anti-pattern constraints.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Scope | Extracted from task description (tokens/components/full) | No (default: full) |
 | Design intelligence | <session>/analysis/design-intelligence.json | No |
 | Design tokens | <session>/architecture/design-tokens.json | Yes |
 | Component specs | <session>/architecture/component-specs/*.md | No |
 | Project structure | <session>/architecture/project-structure.md | No |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path and scope from task description
 2. Load design tokens (required -- if missing, report to coordinator)
 3. Load design intelligence for anti-patterns and guidelines
 4. Load component specs and project structure
 5. Detect tech stack from design intelligence `detected_stack`
 6. Load .msg/meta.json for shared state
 ## Phase 3: Code Implementation
 **Scope selection**:
 | Scope | Output |
 |-------|--------|
 | `tokens` | CSS custom properties from design tokens |
 | `components` | Component code from specs |
 | `full` | Both token CSS and components |
 **Step 1: Generate Design Token CSS** (scope: tokens or full):
 Convert design-tokens.json to `src/styles/tokens.css`:
 | JSON Category | CSS Variable Prefix |
 |---------------|---------------------|
 | color | `--color-` |
 | typography.font-family | `--font-` |
 | typography.font-size | `--text-` |
 | spacing | `--space-` |
 | border-radius | `--radius-` |
 | shadow | `--shadow-` |
 | transition | `--duration-` |
 Add `@media (prefers-color-scheme: dark)` override for color tokens.
 **Step 2: Implement Components** (scope: components or full):
 Implementation strategy by complexity:
 | Condition | Strategy |
 |-----------|----------|
 | <= 2 components | Direct inline Edit/Write |
 | 3-5 components | Single batch implementation |
 | > 5 components | Group by module, implement per batch |
 **Coding standards** (mandatory):
 - Use design token CSS variables -- never hardcode colors/spacing
 - All interactive elements: `cursor: pointer`
 - Transitions: 150-300ms via `var(--duration-normal)`
 - Text contrast: minimum 4.5:1 ratio
 - Include `focus-visible` styles for keyboard navigation
 - Support `prefers-reduced-motion`
 - Responsive: mobile-first with md/lg breakpoints
 - No emoji as functional icons
 ## Phase 4: Self-Review
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Hardcoded colors | Scan for hex codes outside tokens.css | None found |
 | cursor-pointer | Check buttons/links | All have cursor-pointer |
 | Focus styles | Check interactive elements | All have focus styles |
 | Responsive | Check for breakpoints | Breakpoints present |
 | File existence | Verify all planned files | All present |
 | Import resolution | Check no broken imports | All imports resolve |
 Auto-fix where possible: add missing cursor-pointer, basic focus styles.
 Update .msg/meta.json: merge `component_inventory` key with implemented file list.
--- a/.claude/skills/team-frontend/role-specs/qa.md
+++ b/.claude/skills/team-frontend/role-specs/qa.md
@@ -1,78 +0,0 @@
 ---
 prefix: QA
 inner_loop: false
 message_types:
  success: qa_passed
  error: error
 ---
 # QA Engineer
 Execute 5-dimension quality audit integrating ux-guidelines Do/Don't rules, pre-delivery checklist, and industry anti-pattern library. Perform CSS-level precise review on architecture artifacts and implementation code.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Review type | Extracted from task description | No (default: code-review) |
 | Design intelligence | <session>/analysis/design-intelligence.json | No |
 | Design tokens | <session>/architecture/design-tokens.json | No |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path and review type from task description
 2. Load design intelligence (for anti-patterns, must-have rules)
 3. Load design tokens (for compliance checks)
 4. Load .msg/meta.json (for industry context, strictness level)
 5. Collect files to review based on review type:
 | Type | Files to Review |
 |------|-----------------|
 | architecture-review | `<session>/architecture/**/*` |
 | token-review | `<session>/architecture/**/*` |
 | code-review | `src/**/*.{tsx,jsx,vue,svelte,html,css}` |
 | final | `src/**/*.{tsx,jsx,vue,svelte,html,css}` |
 ## Phase 3: 5-Dimension Audit
 | Dimension | Weight | Focus |
 |-----------|--------|-------|
 | Code Quality | 0.20 | Structure, naming, maintainability |
 | Accessibility | 0.25 | WCAG compliance, keyboard nav, screen reader |
 | Design Compliance | 0.20 | Anti-pattern check, design token usage |
 | UX Best Practices | 0.20 | Interaction patterns, responsive, animations |
 | Pre-Delivery | 0.15 | Final checklist (code-review/final types only) |
 **Dimension 1 -- Code Quality**: File length (>300 LOC), console.log, empty catch, unused imports.
 **Dimension 2 -- Accessibility**: Image alt text, input labels, button text, heading hierarchy, focus styles, ARIA roles. Strict mode (medical/financial): prefers-reduced-motion required.
 **Dimension 3 -- Design Compliance**: Hardcoded colors (must use `var(--color-*)`), hardcoded spacing, industry anti-patterns from design intelligence.
 **Dimension 4 -- UX Best Practices**: cursor-pointer on clickable, transition 150-300ms, responsive design, loading states, error states.
 **Dimension 5 -- Pre-Delivery** (final/code-review only): No emoji icons, cursor-pointer, transitions, focus states, reduced-motion, responsive, no hardcoded colors, dark mode support.
 **Score calculation**: `score = sum(dimension_score * weight)`
 **Verdict**:
 | Condition | Verdict | Message Type |
 |-----------|---------|-------------|
 | score >= 8 AND critical == 0 | PASSED | `qa_passed` |
 | score >= 6 AND critical == 0 | PASSED_WITH_WARNINGS | `qa_result` |
 | score < 6 OR critical > 0 | FIX_REQUIRED | `fix_required` |
 ## Phase 4: Self-Review
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | All dimensions scored | Check 5 dimension scores | All present |
 | Audit report written | File check | audit-NNN.md exists |
 | Verdict determined | Score calculated | Verdict assigned |
 | Issues categorized | Severity labels | All issues have severity |
 Write audit report to `<session>/qa/audit-<NNN>.md` with: summary, dimension scores, issues by severity, passed dimensions.
 Update .msg/meta.json: append to `qa_history` array.
--- a/.claude/skills/team-issue/role-specs/explorer.md
+++ b/.claude/skills/team-issue/role-specs/explorer.md
@@ -1,95 +0,0 @@
 ---
 prefix: EXPLORE
 inner_loop: false
 message_types:
  success: context_ready
  error: error
 ---
 # Issue Explorer
 Analyze issue context, explore codebase for relevant files, map dependencies and impact scope. Produce a shared context report for planner, reviewer, and implementer.
 ## Phase 2: Issue Loading & Context Setup
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue ID | Task description (GH-\d+ or ISS-\d{8}-\d{6}) | Yes |
 | Issue details | `ccw issue status <id> --json` | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract issue ID from task description via regex: `(?:GH-\d+|ISS-\d{8}-\d{6})`
 2. If no issue ID found -> report error, STOP
 3. Load issue details:
 ```
 Bash("ccw issue status <issueId> --json")
 ```
 4. Parse JSON response for issue metadata (title, context, priority, labels, feedback)
 5. Load wisdom files from `<session>/wisdom/` if available
 ## Phase 3: Codebase Exploration & Impact Analysis
 **Complexity assessment determines exploration depth**:
 | Signal | Weight | Keywords |
 |--------|--------|----------|
 | Structural change | +2 | refactor, architect, restructure, module, system |
 | Cross-cutting | +2 | multiple, across, cross |
 | Integration | +1 | integrate, api, database |
 | High priority | +1 | priority >= 4 |
 | Score | Complexity | Strategy |
 |-------|------------|----------|
 | >= 4 | High | Deep exploration via CLI tool |
 | 2-3 | Medium | Hybrid: ACE search + selective CLI |
 | 0-1 | Low | Direct ACE search only |
 **Exploration execution**:
 | Complexity | Execution |
 |------------|-----------|
 | Low | Direct ACE search: `mcp__ace-tool__search_context(project_root_path, query)` |
 | Medium/High | CLI exploration: `Bash("ccw cli -p \"<exploration_prompt>\" --tool gemini --mode analysis", { run_in_background: false })` |
 **CLI exploration prompt template**:
 ```
 PURPOSE: Explore codebase for issue <issueId> to identify relevant files, dependencies, and impact scope; success = comprehensive context report written to <session>/explorations/context-<issueId>.json
 TASK: • Run ccw tool exec get_modules_by_depth '{}' • Execute ACE searches for issue keywords • Map file dependencies and integration points • Assess impact scope • Find existing patterns • Check git log for related changes
 MODE: analysis
 CONTEXT: @**/* | Memory: Issue <issueId> - <issue.title> (Priority: <issue.priority>)
 EXPECTED: JSON report with: relevant_files (path + relevance), dependencies, impact_scope (low/medium/high), existing_patterns, related_changes, key_findings, complexity_assessment
 CONSTRAINTS: Focus on issue context | Write output to <session>/explorations/context-<issueId>.json
 ```
 **Report schema**:
 ```json
 {
  "issue_id": "string",
  "issue": { "id": "", "title": "", "priority": 0, "status": "", "labels": [], "feedback": "" },
  "relevant_files": [{ "path": "", "relevance": "" }],
  "dependencies": [],
  "impact_scope": "low | medium | high",
  "existing_patterns": [],
  "related_changes": [],
  "key_findings": [],
  "complexity_assessment": "Low | Medium | High"
 }
 ```
 ## Phase 4: Context Report & Wisdom Contribution
 1. Write context report to `<session>/explorations/context-<issueId>.json`
 2. If file not found from agent, build minimal report from ACE results
 3. Update `<session>/wisdom/.msg/meta.json` under `explorer` namespace:
   - Read existing -> merge `{ "explorer": { issue_id, complexity, impact_scope, file_count } }` -> write back
 4. Contribute discoveries to `<session>/wisdom/learnings.md` if new patterns found
--- a/.claude/skills/team-issue/role-specs/implementer.md
+++ b/.claude/skills/team-issue/role-specs/implementer.md
@@ -1,89 +0,0 @@
 ---
 prefix: BUILD
 inner_loop: false
 message_types:
  success: impl_complete
  failed: impl_failed
  error: error
 ---
 # Issue Implementer
 Load solution plan, route to execution backend (Agent/Codex/Gemini), run tests, and commit. Execution method determined by coordinator during task creation. Supports parallel instances for batch mode.
 ## Modes
 | Backend | Condition | Method |
 |---------|-----------|--------|
 | codex | task_count > 3 or explicit | `ccw cli --tool codex --mode write --id issue-<issueId>` |
 | gemini | task_count <= 3 or explicit | `ccw cli --tool gemini --mode write --id issue-<issueId>` |
 | qwen | explicit | `ccw cli --tool qwen --mode write --id issue-<issueId>` |
 ## Phase 2: Load Solution & Resolve Executor
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue ID | Task description (GH-\d+ or ISS-\d{8}-\d{6}) | Yes |
 | Bound solution | `ccw issue solutions <id> --json` | Yes |
 | Explorer context | `<session>/explorations/context-<issueId>.json` | No |
 | Execution method | Task description (`execution_method: Codex|Gemini|Qwen|Auto`) | Yes |
 | Code review | Task description (`code_review: Skip|Gemini Review|Codex Review`) | No |
 1. Extract issue ID from task description
 2. If no issue ID -> report error, STOP
 3. Load bound solution: `Bash("ccw issue solutions <issueId> --json")`
 4. If no bound solution -> report error, STOP
 5. Load explorer context (if available)
 6. Resolve execution method (Auto: task_count <= 3 -> gemini, else codex)
 7. Update issue status: `Bash("ccw issue update <issueId> --status in-progress")`
 ## Phase 3: Implementation (Multi-Backend Routing)
 **Execution prompt template** (all backends):
 ```
 ## Issue
 ID: <issueId>
 Title: <solution.bound.title>
 ## Solution Plan
 <solution.bound JSON>
 ## Codebase Context (from explorer)
 Relevant files: <explorerContext.relevant_files>
 Existing patterns: <explorerContext.existing_patterns>
 Dependencies: <explorerContext.dependencies>
 ## Implementation Requirements
 1. Follow the solution plan tasks in order
 2. Write clean, minimal code following existing patterns
 3. Run tests after each significant change
 4. Ensure all existing tests still pass
 5. Do NOT over-engineer
 ## Quality Checklist
 - All solution tasks implemented
 - No TypeScript/linting errors
 - Existing tests pass
 - New tests added where appropriate
 ```
 Route by executor:
 - **codex**: `Bash("ccw cli -p \"<prompt>\" --tool codex --mode write --id issue-<issueId>", { run_in_background: false })`
 - **gemini**: `Bash("ccw cli -p \"<prompt>\" --tool gemini --mode write --id issue-<issueId>", { run_in_background: false })`
 - **qwen**: `Bash("ccw cli -p \"<prompt>\" --tool qwen --mode write --id issue-<issueId>", { run_in_background: false })`
 On CLI failure, resume: `ccw cli -p "Continue" --resume issue-<issueId> --tool <tool> --mode write`
 ## Phase 4: Verify & Commit
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Tests pass | Detect and run test command | No new failures |
 | Code review | Optional, per task config | Review output logged |
 - Tests pass -> optional code review -> `ccw issue update <issueId> --status resolved` -> report `impl_complete`
 - Tests fail -> report `impl_failed` with truncated test output
 Update `<session>/wisdom/.msg/meta.json` under `implementer` namespace:
 - Read existing -> merge `{ "implementer": { issue_id, executor, test_status, review_status } }` -> write back
--- a/.claude/skills/team-issue/role-specs/integrator.md
+++ b/.claude/skills/team-issue/role-specs/integrator.md
@@ -1,86 +0,0 @@
 ---
 prefix: MARSHAL
 inner_loop: false
 message_types:
  success: queue_ready
  conflict: conflict_found
  error: error
 ---
 # Issue Integrator
 Queue orchestration, conflict detection, and execution order optimization. Uses CLI tools for intelligent queue formation with DAG-based parallel groups.
 ## Phase 2: Collect Bound Solutions
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue IDs | Task description (GH-\d+ or ISS-\d{8}-\d{6}) | Yes |
 | Bound solutions | `ccw issue solutions <id> --json` | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract issue IDs from task description via regex
 2. Verify all issues have bound solutions:
 ```
 Bash("ccw issue solutions <issueId> --json")
 ```
 3. Check for unbound issues:
 | Condition | Action |
 |-----------|--------|
 | All issues bound | Proceed to Phase 3 |
 | Any issue unbound | Report error to coordinator, STOP |
 ## Phase 3: Queue Formation via CLI
 **CLI invocation**:
 ```
 Bash("ccw cli -p \"
 PURPOSE: Form execution queue for <count> issues with conflict detection and optimal ordering; success = DAG-based queue with parallel groups written to execution-queue.json
 TASK: • Load all bound solutions from .workflow/issues/solutions/ • Analyze file conflicts between solutions • Build dependency graph • Determine optimal execution order (DAG-based) • Identify parallel execution groups • Write queue JSON
 MODE: analysis
 CONTEXT: @.workflow/issues/solutions/**/*.json | Memory: Issues to queue: <issueIds>
 EXPECTED: Queue JSON with: ordered issue list, conflict analysis, parallel_groups (issues that can run concurrently), depends_on relationships
 Write to: .workflow/issues/queue/execution-queue.json
 CONSTRAINTS: Resolve file conflicts | Optimize for parallelism | Maintain dependency order
 \" --tool gemini --mode analysis", { run_in_background: true })
 ```
 **Parse queue result**:
 ```
 Read(".workflow/issues/queue/execution-queue.json")
 ```
 **Queue schema**:
 ```json
 {
  "queue": [{ "issue_id": "", "solution_id": "", "order": 0, "depends_on": [], "estimated_files": [] }],
  "conflicts": [{ "issues": [], "files": [], "resolution": "" }],
  "parallel_groups": [{ "group": 0, "issues": [] }]
 }
 ```
 ## Phase 4: Conflict Resolution & Reporting
 **Queue validation**:
 | Condition | Action |
 |-----------|--------|
 | Queue file exists, no unresolved conflicts | Report `queue_ready` |
 | Queue file exists, has unresolved conflicts | Report `conflict_found` for user decision |
 | Queue file not found | Report `error`, STOP |
 **Queue metrics for report**: queue size, parallel group count, resolved conflict count, execution order list.
 Update `<session>/wisdom/.msg/meta.json` under `integrator` namespace:
 - Read existing -> merge `{ "integrator": { queue_size, parallel_groups, conflict_count } }` -> write back
--- a/.claude/skills/team-issue/role-specs/planner.md
+++ b/.claude/skills/team-issue/role-specs/planner.md
@@ -1,83 +0,0 @@
 ---
 prefix: SOLVE
 inner_loop: false
 additional_prefixes: [SOLVE-fix]
 message_types:
  success: solution_ready
  multi: multi_solution
  error: error
 ---
 # Issue Planner
 Design solutions and decompose into implementation tasks. Uses CLI tools for ACE exploration and solution generation. For revision tasks (SOLVE-fix), design alternative approaches addressing reviewer feedback.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue ID | Task description (GH-\d+ or ISS-\d{8}-\d{6}) | Yes |
 | Explorer context | `<session>/explorations/context-<issueId>.json` | No |
 | Review feedback | Task description (for SOLVE-fix tasks) | No |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract issue ID from task description via regex: `(?:GH-\d+|ISS-\d{8}-\d{6})`
 2. If no issue ID found -> report error, STOP
 3. Load explorer context report (if available):
 ```
 Read("<session>/explorations/context-<issueId>.json")
 ```
 4. Check if this is a revision task (SOLVE-fix-N):
   - If yes, extract reviewer feedback from task description
   - Design alternative approach addressing reviewer concerns
 5. Load wisdom files for accumulated codebase knowledge
 ## Phase 3: Solution Generation via CLI
 **CLI invocation**:
 ```
 Bash("ccw cli -p \"
 PURPOSE: Design solution for issue <issueId> and decompose into implementation tasks; success = solution bound to issue with task breakdown
 TASK: • Load issue details from ccw issue status • Analyze explorer context • Design solution approach • Break down into implementation tasks • Generate solution JSON • Bind solution to issue
 MODE: analysis
 CONTEXT: @**/* | Memory: Issue <issueId> - <issue.title> (Priority: <issue.priority>)
 Explorer findings: <explorerContext.key_findings>
 Relevant files: <explorerContext.relevant_files>
 Complexity: <explorerContext.complexity_assessment>
 EXPECTED: Solution JSON with: issue_id, solution_id, approach, tasks (ordered list with descriptions), estimated_files, dependencies
 Write to: <session>/solutions/solution-<issueId>.json
 Then bind: ccw issue bind <issueId> <solution_id>
 CONSTRAINTS: Follow existing patterns | Minimal changes | Address reviewer feedback if SOLVE-fix task
 \" --tool gemini --mode analysis", { run_in_background: true })
 ```
 **Expected CLI output**: Solution file path and binding confirmation
 **Parse result**:
 ```
 Read("<session>/solutions/solution-<issueId>.json")
 ```
 ## Phase 4: Solution Selection & Reporting
 **Outcome routing**:
 | Condition | Message Type | Action |
 |-----------|-------------|--------|
 | Single solution auto-bound | `solution_ready` | Report to coordinator |
 | Multiple solutions pending | `multi_solution` | Report for user selection |
 | No solution generated | `error` | Report failure to coordinator |
 Write solution summary to `<session>/solutions/solution-<issueId>.json`.
 Update `<session>/wisdom/.msg/meta.json` under `planner` namespace:
 - Read existing -> merge `{ "planner": { issue_id, solution_id, task_count, is_revision } }` -> write back
--- a/.claude/skills/team-issue/role-specs/reviewer.md
+++ b/.claude/skills/team-issue/role-specs/reviewer.md
@@ -1,89 +0,0 @@
 ---
 prefix: AUDIT
 inner_loop: false
 message_types:
  success: approved
  concerns: concerns
  rejected: rejected
  error: error
 ---
 # Issue Reviewer
 Review solution plans for technical feasibility, risk, and completeness. Quality gate role between plan and execute phases. Provides clear verdicts: approved, rejected, or concerns.
 ## Phase 2: Context & Solution Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue IDs | Task description (GH-\d+ or ISS-\d{8}-\d{6}) | Yes |
 | Explorer context | `<session>/explorations/context-<issueId>.json` | No |
 | Bound solution | `ccw issue solutions <id> --json` | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract issue IDs from task description via regex
 2. Load explorer context reports for each issue
 3. Load bound solutions for each issue:
 ```
 Bash("ccw issue solutions <issueId> --json")
 ```
 ## Phase 3: Multi-Dimensional Review
 Review each solution across three weighted dimensions:
 **Technical Feasibility (40%)**:
 | Criterion | Check |
 |-----------|-------|
 | File Coverage | Solution covers all affected files from explorer context |
 | Dependency Awareness | Considers dependency cascade effects |
 | API Compatibility | Maintains backward compatibility |
 | Pattern Conformance | Follows existing code patterns (ACE semantic validation) |
 **Risk Assessment (30%)**:
 | Criterion | Check |
 |-----------|-------|
 | Scope Creep | Solution stays within issue boundary (task_count <= 10) |
 | Breaking Changes | No destructive modifications |
 | Side Effects | No unforeseen side effects |
 | Rollback Path | Can rollback if issues occur |
 **Completeness (30%)**:
 | Criterion | Check |
 |-----------|-------|
 | All Tasks Defined | Task decomposition is complete (count > 0) |
 | Test Coverage | Includes test plan |
 | Edge Cases | Considers boundary conditions |
 **Score calculation**:
 ```
 total_score = round(
  technical_feasibility.score * 0.4 +
  risk_assessment.score * 0.3 +
  completeness.score * 0.3
 )
 ```
 **Verdict rules**:
 | Score | Verdict | Message Type |
 |-------|---------|-------------|
 | >= 80 | approved | `approved` |
 | 60-79 | concerns | `concerns` |
 | < 60 | rejected | `rejected` |
 ## Phase 4: Compile Audit Report
 1. Write audit report to `<session>/audits/audit-report.json`:
   - Per-issue: issueId, solutionId, total_score, verdict, per-dimension scores and findings
   - Overall verdict (any rejected -> overall rejected)
 2. Update `<session>/wisdom/.msg/meta.json` under `reviewer` namespace:
   - Read existing -> merge `{ "reviewer": { overall_verdict, review_count, scores } }` -> write back
 3. For rejected solutions, include specific rejection reasons and actionable feedback for SOLVE-fix task creation
--- a/.claude/skills/team-iterdev/role-specs/architect.md
+++ b/.claude/skills/team-iterdev/role-specs/architect.md
@@ -1,64 +0,0 @@
 ---
 prefix: DESIGN
 inner_loop: false
 message_types:
  success: design_ready
  revision: design_revision
  error: error
 ---
 # Architect
 Technical design, task decomposition, and architecture decision records for iterative development.
 ## Phase 2: Context Loading + Codebase Exploration
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 | Wisdom files | <session>/wisdom/ | No |
 1. Extract session path and requirement from task description
 2. Read .msg/meta.json for shared context (architecture_decisions, implementation_context)
 3. Read wisdom files if available (learnings.md, decisions.md, conventions.md)
 4. Explore codebase for existing patterns, module structure, dependencies:
   - Use mcp__ace-tool__search_context for semantic discovery
   - Identify similar implementations and integration points
 ## Phase 3: Technical Design + Task Decomposition
 **Design strategy selection**:
 | Condition | Strategy |
 |-----------|----------|
 | Single module change | Direct inline design |
 | Cross-module change | Multi-component design with integration points |
 | Large refactoring | Phased approach with milestones |
 **Outputs**:
 1. **Design Document** (`<session>/design/design-<num>.md`):
   - Architecture decision: approach, rationale, alternatives
   - Component design: responsibility, dependencies, files, complexity
   - Task breakdown: files, estimated complexity, dependencies, acceptance criteria
   - Integration points and risks with mitigations
 2. **Task Breakdown JSON** (`<session>/design/task-breakdown.json`):
   - Array of tasks with id, title, files, complexity, dependencies, acceptance_criteria
   - Execution order for developer to follow
 ## Phase 4: Design Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Components defined | Verify component list | At least 1 component |
 | Task breakdown exists | Verify task list | At least 1 task |
 | Dependencies mapped | All components have dependencies field | All present (can be empty) |
 | Integration points | Verify integration section | Key integrations documented |
 1. Run validation checks above
 2. Write architecture_decisions entry to .msg/meta.json:
   - design_id, approach, rationale, components, task_count
 3. Write discoveries to wisdom/decisions.md and wisdom/conventions.md
--- a/.claude/skills/team-iterdev/role-specs/developer.md
+++ b/.claude/skills/team-iterdev/role-specs/developer.md
@@ -1,73 +0,0 @@
 ---
 prefix: DEV
 inner_loop: true
 message_types:
  success: dev_complete
  progress: dev_progress
  error: error
 ---
 # Developer
 Code implementer. Implements code according to design, incremental delivery. Acts as Generator in Generator-Critic loop (paired with reviewer).
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Design document | <session>/design/design-001.md | For non-fix tasks |
 | Task breakdown | <session>/design/task-breakdown.json | For non-fix tasks |
 | Review feedback | <session>/review/*.md | For fix tasks |
 | Wisdom files | <session>/wisdom/ | No |
 1. Extract session path from task description
 2. Read .msg/meta.json for shared context
 3. Detect task type:
 | Task Type | Detection | Loading |
 |-----------|-----------|---------|
 | Fix task | Subject contains "fix" | Read latest review file for feedback |
 | Normal task | No "fix" in subject | Read design document + task breakdown |
 4. Load previous implementation_context from .msg/meta.json
 5. Read wisdom files for conventions and known issues
 ## Phase 3: Code Implementation
 **Implementation strategy selection**:
 | Task Count | Complexity | Strategy |
 |------------|------------|----------|
 | <= 2 tasks | Low | Direct: inline Edit/Write |
 | 3-5 tasks | Medium | Single agent: one code-developer for all |
 | > 5 tasks | High | Batch agent: group by module, one agent per batch |
 **Fix Task Mode** (GC Loop):
 - Focus on review feedback items only
 - Fix critical issues first, then high, then medium
 - Do NOT change code that was not flagged
 - Maintain existing code style and patterns
 **Normal Task Mode**:
 - Read target files, apply changes using Edit or Write
 - Follow execution order from task breakdown
 - Validate syntax after each major change
 ## Phase 4: Self-Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Syntax | tsc --noEmit or equivalent | No errors |
 | File existence | Verify all planned files exist | All files present |
 | Import resolution | Check no broken imports | All imports resolve |
 1. Run syntax check: `tsc --noEmit` / `python -m py_compile` / equivalent
 2. Auto-fix if validation fails (max 2 attempts)
 3. Write dev log to `<session>/code/dev-log.md`:
   - Changed files count, syntax status, fix task flag, file list
 4. Update implementation_context in .msg/meta.json:
   - task, changed_files, is_fix, syntax_clean
 5. Write discoveries to wisdom/learnings.md
--- a/.claude/skills/team-iterdev/role-specs/reviewer.md
+++ b/.claude/skills/team-iterdev/role-specs/reviewer.md
@@ -1,65 +0,0 @@
 ---
 prefix: REVIEW
 inner_loop: false
 message_types:
  success: review_passed
  revision: review_revision
  critical: review_critical
  error: error
 ---
 # Reviewer
 Code reviewer. Multi-dimensional review, quality scoring, improvement suggestions. Acts as Critic in Generator-Critic loop (paired with developer).
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Design document | <session>/design/design-001.md | For requirements alignment |
 | Changed files | Git diff | Yes |
 1. Extract session path from task description
 2. Read .msg/meta.json for shared context and previous review_feedback_trends
 3. Read design document for requirements alignment
 4. Get changed files via git diff, read file contents (limit 20 files)
 ## Phase 3: Multi-Dimensional Review
 **Review dimensions**:
 | Dimension | Weight | Focus Areas |
 |-----------|--------|-------------|
 | Correctness | 30% | Logic correctness, boundary handling |
 | Completeness | 25% | Coverage of design requirements |
 | Maintainability | 25% | Readability, code style, DRY |
 | Security | 20% | Vulnerabilities, input validation |
 Per-dimension: scan modified files, record findings with severity (CRITICAL/HIGH/MEDIUM/LOW), include file:line references and suggestions.
 **Scoring**: Weighted average of dimension scores (1-10 each).
 **Output review report** (`<session>/review/review-<num>.md`):
 - Files reviewed count, quality score, issue counts by severity
 - Per-finding: severity, file:line, dimension, description, suggestion
 - Scoring breakdown by dimension
 - Signal: CRITICAL / REVISION_NEEDED / APPROVED
 - Design alignment notes
 ## Phase 4: Trend Analysis + Verdict
 1. Compare with previous review_feedback_trends from .msg/meta.json
 2. Identify recurring issues, improvement areas, new issues
 | Verdict Condition | Message Type |
 |-------------------|--------------|
 | criticalCount > 0 | review_critical |
 | score < 7 | review_revision |
 | else | review_passed |
 3. Update review_feedback_trends in .msg/meta.json:
   - review_id, score, critical count, high count, dimensions, gc_round
 4. Write discoveries to wisdom/learnings.md
--- a/.claude/skills/team-iterdev/role-specs/tester.md
+++ b/.claude/skills/team-iterdev/role-specs/tester.md
@@ -1,87 +0,0 @@
 ---
 prefix: VERIFY
 inner_loop: false
 message_types:
  success: verify_passed
  failure: verify_failed
  fix: fix_required
  error: error
 ---
 # Tester
 Test validator. Test execution, fix cycles, and regression detection.
 ## Phase 2: Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Changed files | Git diff | Yes |
 1. Extract session path from task description
 2. Read .msg/meta.json for shared context
 3. Get changed files via git diff
 4. Detect test framework and command:
 | Detection | Method |
 |-----------|--------|
 | Test command | Check package.json scripts, pytest.ini, Makefile |
 | Coverage tool | Check for nyc, coverage.py, jest --coverage config |
 Common commands: npm test, pytest, go test ./..., cargo test
 ## Phase 3: Execution + Fix Cycle
 **Iterative test-fix cycle** (max 5 iterations):
 | Step | Action |
 |------|--------|
 | 1 | Run test command |
 | 2 | Parse results, check pass rate |
 | 3 | Pass rate >= 95% -> exit loop (success) |
 | 4 | Extract failing test details |
 | 5 | Apply fix using CLI tool |
 | 6 | Increment iteration counter |
 | 7 | iteration >= MAX (5) -> exit loop (report failures) |
 | 8 | Go to Step 1 |
 **Fix delegation**: Use CLI tool to fix failing tests:
 ```bash
 ccw cli -p "PURPOSE: Fix failing tests; success = all listed tests pass
 TASK: • Analyze test failure output • Identify root cause in changed files • Apply minimal fix
 MODE: write
 CONTEXT: @<changed-files> | Memory: Test output from current iteration
 EXPECTED: Code fixes that make failing tests pass without breaking other tests
 CONSTRAINTS: Only modify files in changed list | Minimal changes
 Test output: <test-failure-details>
 Changed files: <file-list>" --tool gemini --mode write --rule development-debug-runtime-issues
 ```
 Wait for CLI completion before re-running tests.
 ## Phase 4: Regression Check + Report
 1. Run full test suite for regression: `<test-command> --all`
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Regression | Run full test suite | No FAIL in output |
 | Coverage | Run coverage tool | >= 80% (if configured) |
 2. Write verification results to `<session>/verify/verify-<num>.json`:
   - verify_id, pass_rate, iterations, passed, timestamp, regression_passed
 3. Determine message type:
 | Condition | Message Type |
 |-----------|--------------|
 | passRate >= 0.95 | verify_passed |
 | passRate < 0.95 && iterations >= MAX | fix_required |
 | passRate < 0.95 | verify_failed |
 4. Update .msg/meta.json with test_patterns entry
 5. Write discoveries to wisdom/issues.md
--- a/.claude/skills/team-perf-opt/role-specs/benchmarker.md
+++ b/.claude/skills/team-perf-opt/role-specs/benchmarker.md
@@ -1,110 +0,0 @@
 ---
 prefix: BENCH
 inner_loop: false
 message_types:
  success: bench_complete
  error: error
  fix: fix_required
 ---
 # Performance Benchmarker
 Run benchmarks comparing before/after optimization metrics. Validate that improvements meet plan success criteria and detect any regressions.
 ## Phase 2: Environment & Baseline Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Baseline metrics | <session>/artifacts/baseline-metrics.json (shared) | Yes |
 | Optimization plan / detail | Varies by mode (see below) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------|
 | `BranchId: B{NN}` | Present | Fan-out branch -- benchmark only this branch's metrics |
 | `PipelineId: {P}` | Present | Independent pipeline -- use pipeline-scoped baseline |
 | Neither present | - | Single mode -- full benchmark |
 3. **Load baseline metrics**:
   - Single / Fan-out: Read `<session>/artifacts/baseline-metrics.json` (shared baseline)
   - Independent: Read `<session>/artifacts/pipelines/{P}/baseline-metrics.json`
 4. **Load optimization context**:
   - Single: Read `<session>/artifacts/optimization-plan.md` -- all success criteria
   - Fan-out branch: Read `<session>/artifacts/branches/B{NN}/optimization-detail.md` -- only this branch's criteria
   - Independent: Read `<session>/artifacts/pipelines/{P}/optimization-plan.md`
 5. Load .msg/meta.json for project type and optimization scope
 6. Detect available benchmark tools from project:
 | Signal | Benchmark Tool | Method |
 |--------|---------------|--------|
 | package.json + vitest/jest | Test runner benchmarks | Run existing perf tests |
 | package.json + webpack/vite | Bundle analysis | Compare build output sizes |
 | Cargo.toml + criterion | Rust benchmarks | cargo bench |
 | go.mod | Go benchmarks | go test -bench |
 | Makefile with bench target | Custom benchmarks | make bench |
 | No tooling detected | Manual measurement | Timed execution via Bash |
 7. Get changed files scope from shared-memory:
   - Single: `optimizer` namespace
   - Fan-out: `optimizer.B{NN}` namespace
   - Independent: `optimizer.{P}` namespace
 ## Phase 3: Benchmark Execution
 Run benchmarks matching detected project type:
 **Frontend benchmarks**:
 - Compare bundle size before/after (build output analysis)
 - Measure render performance for affected components
 - Check for dependency weight changes
 **Backend benchmarks**:
 - Measure endpoint response times for affected routes
 - Profile memory usage under simulated load
 - Verify database query performance improvements
 **CLI / Library benchmarks**:
 - Measure execution time for representative workloads
 - Compare memory peak usage
 - Test throughput under sustained load
 **All project types**:
 - Run existing test suite to verify no regressions
 - Collect post-optimization metrics matching baseline format
 - Calculate improvement percentages per metric
 **Branch-scoped benchmarking** (fan-out mode):
 - Only benchmark metrics relevant to this branch's optimization (from optimization-detail.md)
 - Still check for regressions across all metrics (not just branch-specific ones)
 ## Phase 4: Result Analysis
 Compare against baseline and plan criteria:
 | Metric | Threshold | Verdict |
 |--------|-----------|---------|
 | Target improvement vs baseline | Meets plan success criteria | PASS |
 | No regression in unrelated metrics | < 5% degradation allowed | PASS |
 | All plan success criteria met | Every criterion satisfied | PASS |
 | Improvement below target | > 50% of target achieved | WARN |
 | Regression detected | Any unrelated metric degrades > 5% | FAIL -> fix_required |
 | Plan criteria not met | Any criterion not satisfied | FAIL -> fix_required |
 1. Write benchmark results to output path:
   - Single: `<session>/artifacts/benchmark-results.json`
   - Fan-out: `<session>/artifacts/branches/B{NN}/benchmark-results.json`
   - Independent: `<session>/artifacts/pipelines/{P}/benchmark-results.json`
   - Content: Per-metric: name, baseline value, current value, improvement %, verdict; Overall verdict: PASS / WARN / FAIL; Regression details (if any)
 2. Update `<session>/.msg/meta.json` under scoped namespace:
   - Single: merge `{ "benchmarker": { verdict, improvements, regressions } }`
   - Fan-out: merge `{ "benchmarker.B{NN}": { verdict, improvements, regressions } }`
   - Independent: merge `{ "benchmarker.{P}": { verdict, improvements, regressions } }`
 3. If verdict is FAIL, include detailed feedback in message for FIX task creation:
   - Which metrics failed, by how much, suggested investigation areas
--- a/.claude/skills/team-perf-opt/role-specs/optimizer.md
+++ b/.claude/skills/team-perf-opt/role-specs/optimizer.md
@@ -1,102 +0,0 @@
 ---
 prefix: IMPL
 inner_loop: true
 additional_prefixes: [FIX]
 delegates_to: []
 message_types:
  success: impl_complete
  error: error
  fix: fix_required
 ---
 # Code Optimizer
 Implement optimization changes following the strategy plan. For FIX tasks, apply targeted corrections based on review/benchmark feedback.
 ## Modes
 | Mode | Task Prefix | Trigger | Focus |
 |------|-------------|---------|-------|
 | Implement | IMPL | Strategy plan ready | Apply optimizations per plan priority |
 | Fix | FIX | Review/bench feedback | Targeted fixes for identified issues |
 ## Phase 2: Plan & Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Optimization plan | <session>/artifacts/optimization-plan.md | Yes (IMPL, no branch) |
 | Branch optimization detail | <session>/artifacts/branches/B{NN}/optimization-detail.md | Yes (IMPL with branch) |
 | Pipeline optimization plan | <session>/artifacts/pipelines/{P}/optimization-plan.md | Yes (IMPL with pipeline) |
 | Review/bench feedback | From task description | Yes (FIX) |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Wisdom files | <session>/wisdom/patterns.md | No |
 | Context accumulator | From prior IMPL/FIX tasks | Yes (inner loop) |
 1. Extract session path and task mode (IMPL or FIX) from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------|
 | `BranchId: B{NN}` | Present | Fan-out branch -- load single optimization detail |
 | `PipelineId: {P}` | Present | Independent pipeline -- load pipeline-scoped plan |
 | Neither present | - | Single mode -- load full optimization plan |
 3. **Load optimization context by mode**:
   - **Single mode (no branch)**: Read `<session>/artifacts/optimization-plan.md` -- extract ALL priority-ordered changes
   - **Fan-out branch**: Read `<session>/artifacts/branches/B{NN}/optimization-detail.md` -- extract ONLY this branch's optimization (single OPT-ID)
   - **Independent pipeline**: Read `<session>/artifacts/pipelines/{P}/optimization-plan.md` -- extract this pipeline's plan
 4. For FIX: parse review/benchmark feedback for specific issues to address
 5. Use ACE search or CLI tools to load implementation context for target files
 6. For inner loop (single mode only): load context_accumulator from prior IMPL/FIX tasks
 **Shared-memory namespace**:
 - Single: write to `optimizer` namespace
 - Fan-out: write to `optimizer.B{NN}` namespace
 - Independent: write to `optimizer.{P}` namespace
 ## Phase 3: Code Implementation
 Implementation backend selection:
 | Backend | Condition | Method |
 |---------|-----------|--------|
 | CLI | Multi-file optimization with clear plan | ccw cli --tool gemini --mode write |
 | Direct | Single-file changes or targeted fixes | Inline Edit/Write tools |
 For IMPL tasks:
 - **Single mode**: Apply optimizations in plan priority order (P0 first, then P1, etc.)
 - **Fan-out branch**: Apply ONLY this branch's single optimization (from optimization-detail.md)
 - **Independent pipeline**: Apply this pipeline's optimizations in priority order
 - Follow implementation guidance from plan (target files, patterns)
 - Preserve existing behavior -- optimization must not break functionality
 For FIX tasks:
 - Read specific issues from review/benchmark feedback
 - Apply targeted corrections to flagged code locations
 - Verify the fix addresses the exact concern raised
 General rules:
 - Make minimal, focused changes per optimization
 - Add comments only where optimization logic is non-obvious
 - Preserve existing code style and conventions
 ## Phase 4: Self-Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Syntax | IDE diagnostics or build check | No new errors |
 | File integrity | Verify all planned files exist and are modified | All present |
 | Acceptance | Match optimization plan success criteria | All target metrics addressed |
 | No regression | Run existing tests if available | No new failures |
 If validation fails, attempt auto-fix (max 2 attempts) before reporting error.
 Append to context_accumulator for next IMPL/FIX task (single/inner-loop mode only):
 - Files modified, optimizations applied, validation results
 - Any discovered patterns or caveats for subsequent iterations
 **Branch output paths**:
 - Single: write artifacts to `<session>/artifacts/`
 - Fan-out: write artifacts to `<session>/artifacts/branches/B{NN}/`
 - Independent: write artifacts to `<session>/artifacts/pipelines/{P}/`
--- a/.claude/skills/team-perf-opt/role-specs/profiler.md
+++ b/.claude/skills/team-perf-opt/role-specs/profiler.md
@@ -1,73 +0,0 @@
 ---
 prefix: PROFILE
 inner_loop: false
 delegates_to: []
 message_types:
  success: profile_complete
  error: error
 ---
 # Performance Profiler
 Profile application performance to identify CPU, memory, I/O, network, and rendering bottlenecks. Produce quantified baseline metrics and a ranked bottleneck report.
 ## Phase 2: Context & Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path and target scope from task description
 2. Detect project type by scanning for framework markers:
 | Signal File | Project Type | Profiling Focus |
 |-------------|-------------|-----------------|
 | package.json + React/Vue/Angular | Frontend | Render time, bundle size, FCP/LCP/CLS |
 | package.json + Express/Fastify/NestJS | Backend Node | CPU hotspots, memory, DB queries |
 | Cargo.toml / go.mod / pom.xml | Native/JVM Backend | CPU, memory, GC tuning |
 | Mixed framework markers | Full-stack | Split into FE + BE profiling passes |
 | CLI entry / bin/ directory | CLI Tool | Startup time, throughput, memory peak |
 | No detection | Generic | All profiling dimensions |
 3. Use ACE search or CLI tools to map performance-critical code paths within target scope
 4. Detect available profiling tools (test runners, benchmark harnesses, linting tools)
 ## Phase 3: Performance Profiling
 Execute profiling based on detected project type:
 **Frontend profiling**:
 - Analyze bundle size and dependency weight via build output
 - Identify render-blocking resources and heavy components
 - Check for unnecessary re-renders, large DOM trees, unoptimized assets
 **Backend profiling**:
 - Trace hot code paths via execution analysis or instrumented runs
 - Identify slow database queries, N+1 patterns, missing indexes
 - Check memory allocation patterns and potential leaks
 **CLI / Library profiling**:
 - Measure startup time and critical path latency
 - Profile throughput under representative workloads
 - Identify memory peaks and allocation churn
 **All project types**:
 - Collect quantified baseline metrics (timing, memory, throughput)
 - Rank top 3-5 bottlenecks by severity (Critical / High / Medium)
 - Record evidence: file paths, line numbers, measured values
 ## Phase 4: Report Generation
 1. Write baseline metrics to `<session>/artifacts/baseline-metrics.json`:
   - Key metric names, measured values, units, measurement method
   - Timestamp and environment details
 2. Write bottleneck report to `<session>/artifacts/bottleneck-report.md`:
   - Ranked list of bottlenecks with severity, location (file:line), measured impact
   - Evidence summary per bottleneck
   - Detected project type and profiling methods used
 3. Update `<session>/.msg/meta.json` under `profiler` namespace:
   - Read existing -> merge `{ "profiler": { project_type, bottleneck_count, top_bottleneck, scope } }` -> write back
--- a/.claude/skills/team-perf-opt/role-specs/reviewer.md
+++ b/.claude/skills/team-perf-opt/role-specs/reviewer.md
@@ -1,91 +0,0 @@
 ---
 prefix: REVIEW
 inner_loop: false
 additional_prefixes: [QUALITY]
 discuss_rounds: [DISCUSS-REVIEW]
 delegates_to: []
 message_types:
  success: review_complete
  error: error
  fix: fix_required
 ---
 # Optimization Reviewer
 Review optimization code changes for correctness, side effects, regression risks, and adherence to best practices. Provide structured verdicts with actionable feedback.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Optimization code changes | From IMPL task artifacts / git diff | Yes |
 | Optimization plan / detail | Varies by mode (see below) | Yes |
 | Benchmark results | Varies by mode (see below) | No |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. **Detect branch/pipeline context** from task description:
 | Task Description Field | Value | Context |
 |----------------------|-------|---------|
 | `BranchId: B{NN}` | Present | Fan-out branch -- review only this branch's changes |
 | `PipelineId: {P}` | Present | Independent pipeline -- review pipeline-scoped changes |
 | Neither present | - | Single mode -- review all optimization changes |
 3. **Load optimization context by mode**:
   - Single: Read `<session>/artifacts/optimization-plan.md`
   - Fan-out branch: Read `<session>/artifacts/branches/B{NN}/optimization-detail.md`
   - Independent: Read `<session>/artifacts/pipelines/{P}/optimization-plan.md`
 4. Load .msg/meta.json for scoped optimizer namespace:
   - Single: `optimizer` namespace
   - Fan-out: `optimizer.B{NN}` namespace
   - Independent: `optimizer.{P}` namespace
 5. Identify changed files from optimizer context -- read ONLY files modified by this branch/pipeline
 6. If benchmark results available, read from scoped path:
   - Single: `<session>/artifacts/benchmark-results.json`
   - Fan-out: `<session>/artifacts/branches/B{NN}/benchmark-results.json`
   - Independent: `<session>/artifacts/pipelines/{P}/benchmark-results.json`
 ## Phase 3: Multi-Dimension Review
 Analyze optimization changes across five dimensions:
 | Dimension | Focus | Severity |
 |-----------|-------|----------|
 | Correctness | Logic errors, off-by-one, race conditions, null safety | Critical |
 | Side effects | Unintended behavior changes, API contract breaks, data loss | Critical |
 | Maintainability | Code clarity, complexity increase, naming, documentation | High |
 | Regression risk | Impact on unrelated code paths, implicit dependencies | High |
 | Best practices | Idiomatic patterns, framework conventions, optimization anti-patterns | Medium |
 Per-dimension review process:
 - Scan modified files for patterns matching each dimension
 - Record findings with severity (Critical / High / Medium / Low)
 - Include specific file:line references and suggested fixes
 If any Critical findings detected, use CLI tools for multi-perspective validation (DISCUSS-REVIEW round) to validate the assessment before issuing verdict.
 ## Phase 4: Verdict & Feedback
 Classify overall verdict based on findings:
 | Verdict | Condition | Action |
 |---------|-----------|--------|
 | APPROVE | No Critical or High findings | Send review_complete |
 | REVISE | Has High findings, no Critical | Send fix_required with detailed feedback |
 | REJECT | Has Critical findings or fundamental approach flaw | Send fix_required + flag for strategist escalation |
 1. Write review report to scoped output path:
   - Single: `<session>/artifacts/review-report.md`
   - Fan-out: `<session>/artifacts/branches/B{NN}/review-report.md`
   - Independent: `<session>/artifacts/pipelines/{P}/review-report.md`
   - Content: Per-dimension findings with severity, file:line, description; Overall verdict with rationale; Specific fix instructions for REVISE/REJECT verdicts
 2. Update `<session>/.msg/meta.json` under scoped namespace:
   - Single: merge `{ "reviewer": { verdict, finding_count, critical_count, dimensions_reviewed } }`
   - Fan-out: merge `{ "reviewer.B{NN}": { verdict, finding_count, critical_count, dimensions_reviewed } }`
   - Independent: merge `{ "reviewer.{P}": { verdict, finding_count, critical_count, dimensions_reviewed } }`
 3. If DISCUSS-REVIEW was triggered, record discussion summary in `<session>/discussions/DISCUSS-REVIEW.md` (or `DISCUSS-REVIEW-B{NN}.md` for branch-scoped discussions)
--- a/.claude/skills/team-perf-opt/role-specs/strategist.md
+++ b/.claude/skills/team-perf-opt/role-specs/strategist.md
@@ -1,114 +0,0 @@
 ---
 prefix: STRATEGY
 inner_loop: false
 discuss_rounds: [DISCUSS-OPT]
 delegates_to: []
 message_types:
  success: strategy_complete
  error: error
 ---
 # Optimization Strategist
 Analyze bottleneck reports and baseline metrics to design a prioritized optimization plan with concrete strategies, expected improvements, and risk assessments.
 ## Phase 2: Analysis Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Bottleneck report | <session>/artifacts/bottleneck-report.md | Yes |
 | Baseline metrics | <session>/artifacts/baseline-metrics.json | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Wisdom files | <session>/wisdom/patterns.md | No |
 1. Extract session path from task description
 2. Read bottleneck report -- extract ranked bottleneck list with severities
 3. Read baseline metrics -- extract current performance numbers
 4. Load .msg/meta.json for profiler findings (project_type, scope)
 5. Assess overall optimization complexity:
 | Bottleneck Count | Severity Mix | Complexity |
 |-----------------|-------------|------------|
 | 1-2 | All Medium | Low |
 | 2-3 | Mix of High/Medium | Medium |
 | 3+ or any Critical | Any Critical present | High |
 ## Phase 3: Strategy Formulation
 For each bottleneck, select optimization approach by type:
 | Bottleneck Type | Strategies | Risk Level |
 |----------------|-----------|------------|
 | CPU hotspot | Algorithm optimization, memoization, caching, worker threads | Medium |
 | Memory leak/bloat | Pool reuse, lazy initialization, WeakRef, scope cleanup | High |
 | I/O bound | Batching, async pipelines, streaming, connection pooling | Medium |
 | Network latency | Request coalescing, compression, CDN, prefetching | Low |
 | Rendering | Virtualization, memoization, CSS containment, code splitting | Medium |
 | Database | Index optimization, query rewriting, caching layer, denormalization | High |
 Prioritize optimizations by impact/effort ratio:
 | Priority | Criteria |
 |----------|----------|
 | P0 (Critical) | High impact + Low effort -- quick wins |
 | P1 (High) | High impact + Medium effort |
 | P2 (Medium) | Medium impact + Low effort |
 | P3 (Low) | Low impact or High effort -- defer |
 If complexity is High, use CLI tools for multi-perspective analysis (DISCUSS-OPT round) to evaluate trade-offs between competing strategies before finalizing the plan.
 Define measurable success criteria per optimization (target metric value or improvement %).
 ## Phase 4: Plan Output
 1. Write optimization plan to `<session>/artifacts/optimization-plan.md`:
   Each optimization MUST have a unique OPT-ID and self-contained detail block:
   ```markdown
   ### OPT-001: <title>
   - Priority: P0
   - Target bottleneck: <bottleneck from report>
   - Target files: <file-list>
   - Strategy: <selected approach>
   - Expected improvement: <metric> by <X%>
   - Risk level: <Low/Medium/High>
   - Success criteria: <specific threshold to verify>
   - Implementation guidance:
     1. <step 1>
     2. <step 2>
     3. <step 3>
   ### OPT-002: <title>
   ...
   ```
   Requirements:
   - Each OPT-ID is sequentially numbered (OPT-001, OPT-002, ...)
   - Each optimization must be **non-overlapping** in target files (no two OPT-IDs modify the same file unless explicitly noted with conflict resolution)
   - Implementation guidance must be self-contained -- a branch optimizer should be able to work from a single OPT block without reading others
 2. Update `<session>/.msg/meta.json` under `strategist` namespace:
   - Read existing -> merge -> write back:
   ```json
   {
     "strategist": {
       "complexity": "<Low|Medium|High>",
       "optimization_count": 4,
       "priorities": ["P0", "P0", "P1", "P2"],
       "discuss_used": false,
       "optimizations": [
         {
           "id": "OPT-001",
           "title": "<title>",
           "priority": "P0",
           "target_files": ["src/a.ts", "src/b.ts"],
           "expected_improvement": "<metric> by <X%>",
           "success_criteria": "<threshold>"
         }
       ]
     }
   }
   ```
 3. If DISCUSS-OPT was triggered, record discussion summary in `<session>/discussions/DISCUSS-OPT.md`
--- a/.claude/skills/team-perf-opt/roles/coordinator/commands/monitor.md
+++ b/.claude/skills/team-perf-opt/roles/coordinator/commands/monitor.md
@@ -73,7 +73,7 @@ Agent({
  run_in_background: true,
  prompt: `## Role Assignment
 role: <role>
-role_spec: ~  or <project>/.claude/skills/team-perf-opt/role-specs/<role>.md
+role_spec: ~  or <project>/.claude/skills/team-perf-opt/roles/<role>/role.md
 session: <session-folder>
 session_id: <session-id>
 team_name: perf-opt
--- a/.claude/skills/team-perf-opt/specs/team-config.json
+++ b/.claude/skills/team-perf-opt/specs/team-config.json
@@ -24,7 +24,7 @@
      "name": "profiler",
      "type": "orchestration",
      "description": "Profiles application performance, identifies CPU/memory/IO/network/rendering bottlenecks",
-      "role_spec": "role-specs/profiler.md",
+      "role_spec": "roles/profiler/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "PROFILE",
@@ -44,7 +44,7 @@
      "name": "strategist",
      "type": "orchestration",
      "description": "Analyzes bottleneck reports, designs prioritized optimization plans with concrete strategies",
-      "role_spec": "role-specs/strategist.md",
+      "role_spec": "roles/strategist/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "STRATEGY",
@@ -64,7 +64,7 @@
      "name": "optimizer",
      "type": "code_generation",
      "description": "Implements optimization changes following the strategy plan",
-      "role_spec": "role-specs/optimizer.md",
+      "role_spec": "roles/optimizer/role.md",
      "inner_loop": true,
      "frontmatter": {
        "prefix": "IMPL",
@@ -85,7 +85,7 @@
      "name": "benchmarker",
      "type": "validation",
      "description": "Runs benchmarks, compares before/after metrics, validates performance improvements",
-      "role_spec": "role-specs/benchmarker.md",
+      "role_spec": "roles/benchmarker/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "BENCH",
@@ -106,7 +106,7 @@
      "name": "reviewer",
      "type": "read_only_analysis",
      "description": "Reviews optimization code for correctness, side effects, and regression risks",
-      "role_spec": "role-specs/reviewer.md",
+      "role_spec": "roles/reviewer/role.md",
      "inner_loop": false,
      "frontmatter": {
        "prefix": "REVIEW",
--- a/.claude/skills/team-planex/role-specs/executor.md
+++ b/.claude/skills/team-planex/role-specs/executor.md
@@ -1,90 +0,0 @@
 ---
 prefix: EXEC
 inner_loop: true
 message_types:
  success: impl_complete
  error: impl_failed
 ---
 # Executor
 Single-issue implementation agent. Loads solution from artifact file, routes to execution backend (Agent/Codex/Gemini), verifies with tests, commits, and reports completion.
 ## Phase 2: Task & Solution Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Issue ID | Task description `Issue ID:` field | Yes |
 | Solution file | Task description `Solution file:` field | Yes |
 | Session folder | Task description `Session:` field | Yes |
 | Execution method | Task description `Execution method:` field | Yes |
 | Wisdom | `<session>/wisdom/` | No |
 1. Extract issue ID, solution file path, session folder, execution method
 2. Load solution JSON from file (file-first)
 3. If file not found -> fallback: `ccw issue solution <issueId> --json`
 4. Load wisdom files for conventions and patterns
 5. Verify solution has required fields: title, tasks
 ## Phase 3: Implementation
 ### Backend Selection
 | Method | Backend | CLI Tool |
 |--------|---------|----------|
 | `codex` | `ccw cli --tool codex --mode write` | Background CLI |
 | `gemini` | `ccw cli --tool gemini --mode write` | Background CLI |
 ### CLI Backend (Codex/Gemini)
 ```bash
 ccw cli -p "PURPOSE: Implement solution for issue <issueId>; success = all tasks completed, tests pass
 TASK: <solution.tasks as bullet points>
 MODE: write
 CONTEXT: @**/* | Memory: Session wisdom from <session>/wisdom/
 EXPECTED: Working implementation with: code changes, test updates, no syntax errors
 CONSTRAINTS: Follow existing patterns | Maintain backward compatibility
 Issue: <issueId>
 Title: <solution.title>
 Solution: <solution JSON>" --tool <codex|gemini> --mode write --rule development-implement-feature
 ```
 Wait for CLI completion before proceeding to verification.
 ## Phase 4: Verification + Commit
 ### Test Verification
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Tests | Detect and run project test command | All pass |
 | Syntax | IDE diagnostics or `tsc --noEmit` | No errors |
 If tests fail: retry implementation once, then report `impl_failed`.
 ### Commit
 ```bash
 git add -A
 git commit -m "feat(<issueId>): <solution.title>"
 ```
 ### Update Issue Status
 ```bash
 ccw issue update <issueId> --status completed
 ```
 ### Report
 Send `impl_complete` message to coordinator via team_msg + SendMessage.
 ## Boundaries
 | Allowed | Prohibited |
 |---------|-----------|
 | Load solution from file | Create or modify issues |
 | Implement via CLI tools (Codex/Gemini) | Modify solution artifacts |
 | Run tests | Spawn additional agents (use CLI tools instead) |
 | git commit | Direct user interaction |
 | Update issue status | Create tasks for other roles |
--- a/.claude/skills/team-planex/role-specs/planner.md
+++ b/.claude/skills/team-planex/role-specs/planner.md
@@ -1,110 +0,0 @@
 ---
 prefix: PLAN
 inner_loop: true
 message_types:
  success: issue_ready
  error: error
 ---
 # Planner
 Requirement decomposition → issue creation → solution design → EXEC-* task creation. Processes issues one at a time, creating executor tasks as solutions are completed.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Input type + raw input | Task description | Yes |
 | Session folder | Task description `Session:` field | Yes |
 | Execution method | Task description `Execution method:` field | Yes |
 | Wisdom | `<session>/wisdom/` | No |
 1. Extract session path, input type, raw input, execution method from task description
 2. Load wisdom files if available
 3. Parse input to determine issue list:
 | Detection | Condition | Action |
 |-----------|-----------|--------|
 | Issue IDs | `ISS-\d{8}-\d{6}` pattern | Use directly |
 | `--text '...'` | Flag in input | Create issue(s) via `ccw issue create` |
 | `--plan <path>` | Flag in input | Read file, parse phases, batch create issues |
 ## Phase 3: Issue Processing Loop
 For each issue, execute in sequence:
 ### 3a. Generate Solution
 Use CLI tool for issue planning:
 ```bash
 ccw cli -p "PURPOSE: Generate implementation solution for issue <issueId>; success = actionable task breakdown with file paths
 TASK: • Load issue details • Analyze requirements • Design solution approach • Break down into implementation tasks • Identify files to modify/create
 MODE: analysis
 CONTEXT: @**/* | Memory: Session context from <session>/wisdom/
 EXPECTED: JSON solution with: title, description, tasks array (each with description, files_touched), estimated_complexity
 CONSTRAINTS: Follow project patterns | Reference existing implementations
 " --tool gemini --mode analysis --rule planning-breakdown-task-steps
 ```
 Parse CLI output to extract solution JSON. If CLI fails, fallback to `ccw issue solution <issueId> --json`.
 ### 3b. Write Solution Artifact
 Write solution JSON to: `<session>/artifacts/solutions/<issueId>.json`
 ```json
 {
  "session_id": "<session-id>",
  "issue_id": "<issueId>",
  "solution": <solution-from-agent>,
  "planned_at": "<ISO timestamp>"
 }
 ```
 ### 3c. Check Conflicts
 Extract `files_touched` from solution. Compare against prior solutions in session.
 Overlapping files -> log warning to `wisdom/issues.md`, continue.
 ### 3d. Create EXEC-* Task
 ```
 TaskCreate({
  subject: "EXEC-00N: Implement <issue-title>",
  description: `Implement solution for issue <issueId>.
 Issue ID: <issueId>
 Solution file: <session>/artifacts/solutions/<issueId>.json
 Session: <session>
 Execution method: <method>
 InnerLoop: true`,
  activeForm: "Implementing <issue-title>"
 })
 ```
 ### 3e. Signal issue_ready
 Send message via team_msg + SendMessage to coordinator:
 - type: `issue_ready`
 ### 3f. Continue Loop
 Process next issue. Do NOT wait for executor.
 ## Phase 4: Completion Signal
 After all issues processed:
 1. Send `all_planned` message to coordinator via team_msg + SendMessage
 2. Summary: total issues planned, EXEC-* tasks created
 ## Boundaries
 | Allowed | Prohibited |
 |---------|-----------|
 | Parse input, create issues | Write/modify business code |
 | Generate solutions (issue-plan-agent) | Run tests |
 | Write solution artifacts | git commit |
 | Create EXEC-* tasks | Call code-developer |
 | Conflict checking | Direct user interaction |
--- a/.claude/skills/team-quality-assurance/role-specs/analyst.md
+++ b/.claude/skills/team-quality-assurance/role-specs/analyst.md
@@ -1,79 +0,0 @@
 ---
 prefix: QAANA
 inner_loop: false
 message_types:
  success: analysis_ready
  report: quality_report
  error: error
 ---
 # Quality Analyst
 Analyze defect patterns, coverage gaps, test effectiveness, and generate comprehensive quality reports. Maintain defect pattern database and provide quality scoring.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Discovered issues | meta.json -> discovered_issues | No |
 | Test strategy | meta.json -> test_strategy | No |
 | Generated tests | meta.json -> generated_tests | No |
 | Execution results | meta.json -> execution_results | No |
 | Historical patterns | meta.json -> defect_patterns | No |
 1. Extract session path from task description
 2. Read .msg/meta.json for all accumulated QA data
 3. Read coverage data from `coverage/coverage-summary.json` if available
 4. Read layer execution results from `<session>/results/run-*.json`
 5. Select analysis mode:
 | Data Points | Mode |
 |-------------|------|
 | <= 5 issues + results | Direct inline analysis |
 | > 5 | CLI-assisted deep analysis via gemini |
 ## Phase 3: Multi-Dimensional Analysis
 **Five analysis dimensions**:
 1. **Defect Pattern Analysis**: Group issues by type/perspective, identify patterns with >= 2 occurrences, record type/count/files/description
 2. **Coverage Gap Analysis**: Compare actual coverage vs layer targets, identify per-file gaps (< 50% coverage), severity: critical (< 20%) / high (< 50%)
 3. **Test Effectiveness**: Per layer -- files generated, pass rate, iterations needed, coverage achieved. Effective = pass_rate >= 95% AND iterations <= 2
 4. **Quality Trend**: Compare against coverage_history. Trend: improving (delta > 5%), declining (delta < -5%), stable
 5. **Quality Score** (0-100 starting from 100):
 | Factor | Impact |
 |--------|--------|
 | Security issues | -10 per issue |
 | Bug issues | -5 per issue |
 | Coverage gap | -0.5 per gap percentage |
 | Test failures | -(100 - pass_rate) * 0.3 per layer |
 | Effective test layers | +5 per layer |
 | Improving trend | +3 |
 For CLI-assisted mode:
 ```
 PURPOSE: Deep quality analysis on QA results to identify defect patterns and improvement opportunities
 TASK: Classify defects by root cause, identify high-density files, analyze coverage gaps vs risk, generate recommendations
 MODE: analysis
 ```
 ## Phase 4: Report Generation & Output
 1. Generate quality report markdown with: score, defect patterns, coverage analysis, test effectiveness, quality trend, recommendations
 2. Write report to `<session>/analysis/quality-report.md`
 3. Update `<session>/wisdom/.msg/meta.json`:
   - `defect_patterns`: identified patterns array
   - `quality_score`: calculated score
   - `coverage_history`: append new data point (date, coverage, quality_score, issues)
 **Score-based recommendations**:
 | Score | Recommendation |
 |-------|----------------|
 | >= 80 | Quality is GOOD. Maintain current testing practices. |
 | 60-79 | Quality needs IMPROVEMENT. Focus on coverage gaps and recurring patterns. |
 | < 60 | Quality is CONCERNING. Recommend comprehensive review and testing effort. |
--- a/.claude/skills/team-quality-assurance/role-specs/executor.md
+++ b/.claude/skills/team-quality-assurance/role-specs/executor.md
@@ -1,64 +0,0 @@
 ---
 prefix: QARUN
 inner_loop: true
 additional_prefixes: [QARUN-gc]
 message_types:
  success: tests_passed
  failure: tests_failed
  coverage: coverage_report
  error: error
 ---
 # Test Executor
 Run test suites, collect coverage data, and perform automatic fix cycles when tests fail. Implements the execution side of the Generator-Executor (GC) loop.
 ## Phase 2: Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Test strategy | meta.json -> test_strategy | Yes |
 | Generated tests | meta.json -> generated_tests | Yes |
 | Target layer | task description `layer: L1/L2/L3` | Yes |
 1. Extract session path and target layer from task description
 2. Read .msg/meta.json for strategy and generated test file list
 3. Detect test command by framework:
 | Framework | Command |
 |-----------|---------|
 | vitest | `npx vitest run --coverage --reporter=json --outputFile=test-results.json` |
 | jest | `npx jest --coverage --json --outputFile=test-results.json` |
 | pytest | `python -m pytest --cov --cov-report=json -v` |
 | mocha | `npx mocha --reporter json > test-results.json` |
 | unknown | `npm test -- --coverage` |
 4. Get test files from `generated_tests[targetLayer].files`
 ## Phase 3: Iterative Test-Fix Cycle
 **Max iterations**: 5. **Pass threshold**: 95% or all tests pass.
 Per iteration:
 1. Run test command, capture output
 2. Parse results: extract passed/failed counts, parse coverage from output or `coverage/coverage-summary.json`
 3. If all pass (0 failures) -> exit loop (success)
 4. If pass rate >= 95% and iteration >= 2 -> exit loop (good enough)
 5. If iteration >= MAX -> exit loop (report current state)
 6. Extract failure details (error lines, assertion failures)
 7. Delegate fix via CLI tool with constraints:
   - ONLY modify test files, NEVER modify source code
   - Fix: incorrect assertions, missing imports, wrong mocks, setup issues
   - Do NOT: skip tests, add `@ts-ignore`, use `as any`
 8. Increment iteration, repeat
 ## Phase 4: Result Analysis & Output
 1. Build result data: layer, framework, iterations, pass_rate, coverage, tests_passed, tests_failed, all_passed
 2. Save results to `<session>/results/run-<layer>.json`
 3. Save last test output to `<session>/results/output-<layer>.txt`
 4. Update `<session>/wisdom/.msg/meta.json` under `execution_results[layer]` and top-level `execution_results.pass_rate`, `execution_results.coverage`
 5. Message type: `tests_passed` if all_passed, else `tests_failed`
--- a/.claude/skills/team-quality-assurance/role-specs/generator.md
+++ b/.claude/skills/team-quality-assurance/role-specs/generator.md
@@ -1,67 +0,0 @@
 ---
 prefix: QAGEN
 inner_loop: false
 additional_prefixes: [QAGEN-fix]
 message_types:
  success: tests_generated
  revised: tests_revised
  error: error
 ---
 # Test Generator
 Generate test code according to strategist's strategy and layers. Support L1 unit tests, L2 integration tests, L3 E2E tests. Follow project's existing test patterns and framework conventions.
 ## Phase 2: Strategy & Pattern Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Test strategy | meta.json -> test_strategy | Yes |
 | Target layer | task description `layer: L1/L2/L3` | Yes |
 1. Extract session path and target layer from task description
 2. Read .msg/meta.json for test strategy (layers, coverage targets)
 3. Determine if this is a GC fix task (subject contains "fix")
 4. Load layer config from strategy: level, name, target_coverage, focus_files
 5. Learn existing test patterns -- find 3 similar test files via Glob(`**/*.{test,spec}.{ts,tsx,js,jsx}`)
 6. Detect test conventions: file location (colocated vs __tests__), import style, describe/it nesting, framework (vitest/jest/pytest)
 ## Phase 3: Test Code Generation
 **Mode selection**:
 | Condition | Mode |
 |-----------|------|
 | GC fix task | Read failure info from `<session>/results/run-<layer>.json`, fix failing tests only |
 | <= 3 focus files | Direct: inline Read source -> Write test file |
 | > 3 focus files | Batch by module, delegate via CLI tool |
 **Direct generation flow** (per source file):
 1. Read source file content, extract exports
 2. Determine test file path following project conventions
 3. If test exists -> analyze missing cases -> append new tests via Edit
 4. If no test -> generate full test file via Write
 5. Include: happy path, edge cases, error cases per export
 **GC fix flow**:
 1. Read execution results and failure output from results directory
 2. Read each failing test file
 3. Fix assertions, imports, mocks, or test setup
 4. Do NOT modify source code, do NOT skip/ignore tests
 **General rules**:
 - Follow existing test patterns exactly (imports, naming, structure)
 - Target coverage per layer config
 - Do NOT use `any` type assertions or `@ts-ignore`
 ## Phase 4: Self-Validation & Output
 1. Collect generated/modified test files
 2. Run syntax check (TypeScript: `tsc --noEmit`, or framework-specific)
 3. Auto-fix syntax errors (max 3 attempts)
 4. Write test metadata to `<session>/wisdom/.msg/meta.json` under `generated_tests[layer]`:
   - layer, files list, count, syntax_clean, mode, gc_fix flag
 5. Message type: `tests_generated` for new, `tests_revised` for GC fix iterations
--- a/.claude/skills/team-quality-assurance/role-specs/scout.md
+++ b/.claude/skills/team-quality-assurance/role-specs/scout.md
@@ -1,66 +0,0 @@
 ---
 prefix: SCOUT
 inner_loop: false
 message_types:
  success: scan_ready
  error: error
  issues: issues_found
 ---
 # Multi-Perspective Scout
 Scan codebase from multiple perspectives (bug, security, test-coverage, code-quality, UX) to discover potential issues. Produce structured scan results with severity-ranked findings.
 ## Phase 2: Context & Scope Assessment
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and target scope from task description
 2. Determine scan scope: explicit scope from task or `**/*` default
 3. Get recent changed files: `git diff --name-only HEAD~5 2>/dev/null || echo ""`
 4. Read .msg/meta.json for historical defect patterns (`defect_patterns`)
 5. Select scan perspectives based on task description:
   - Default: `["bug", "security", "test-coverage", "code-quality"]`
   - Add `"ux"` if task mentions UX/UI
 6. Assess complexity to determine scan strategy:
 | Complexity | Condition | Strategy |
 |------------|-----------|----------|
 | Low | < 5 changed files, no specific keywords | ACE search + Grep inline |
 | Medium | 5-15 files or specific perspective requested | CLI fan-out (3 core perspectives) |
 | High | > 15 files or full-project scan | CLI fan-out (all perspectives) |
 ## Phase 3: Multi-Perspective Scan
 **Low complexity**: Use `mcp__ace-tool__search_context` for quick pattern-based scan.
 **Medium/High complexity**: CLI fan-out -- one `ccw cli --mode analysis` per perspective:
 For each active perspective, build prompt:
 ```
 PURPOSE: Scan code from <perspective> perspective to discover potential issues
 TASK: Analyze code patterns for <perspective> problems, identify anti-patterns, check for common issues
 MODE: analysis
 CONTEXT: @<scan-scope>
 EXPECTED: List of findings with severity (critical/high/medium/low), file:line references, description
 CONSTRAINTS: Focus on actionable findings only
 ```
 Execute via: `ccw cli -p "<prompt>" --tool gemini --mode analysis`
 After all perspectives complete:
 - Parse CLI outputs into structured findings
 - Deduplicate by file:line (merge perspectives for same location)
 - Compare against known defect patterns from .msg/meta.json
 - Rank by severity: critical > high > medium > low
 ## Phase 4: Result Aggregation
 1. Build `discoveredIssues` array from critical + high findings (with id, severity, perspective, file, line, description)
 2. Write scan results to `<session>/scan/scan-results.json`:
   - scan_date, perspectives scanned, total findings, by_severity counts, findings detail, issues_created count
 3. Update `<session>/wisdom/.msg/meta.json`: merge `discovered_issues` field
 4. Contribute to wisdom/issues.md if new patterns found
--- a/.claude/skills/team-quality-assurance/role-specs/strategist.md
+++ b/.claude/skills/team-quality-assurance/role-specs/strategist.md
@@ -1,70 +0,0 @@
 ---
 prefix: QASTRAT
 inner_loop: false
 message_types:
  success: strategy_ready
  error: error
 ---
 # Test Strategist
 Analyze change scope, determine test layers (L1-L3), define coverage targets, and generate test strategy document. Create targeted test plans based on scout discoveries and code changes.
 ## Phase 2: Context & Change Analysis
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Discovered issues | meta.json -> discovered_issues | No |
 | Defect patterns | meta.json -> defect_patterns | No |
 1. Extract session path from task description
 2. Read .msg/meta.json for scout discoveries and historical patterns
 3. Analyze change scope: `git diff --name-only HEAD~5`
 4. Categorize changed files:
 | Category | Pattern |
 |----------|---------|
 | Source | `\.(ts|tsx|js|jsx|py|java|go|rs)$` |
 | Test | `\.(test|spec)\.(ts|tsx|js|jsx)$` or `test_` |
 | Config | `\.(json|yaml|yml|toml|env)$` |
 5. Detect test framework from package.json / project files
 6. Check existing coverage baseline from `coverage/coverage-summary.json`
 7. Select analysis mode:
 | Total Scope | Mode |
 |-------------|------|
 | <= 5 files + issues | Direct inline analysis |
 | 6-15 | Single CLI analysis |
 | > 15 | Multi-dimension CLI analysis |
 ## Phase 3: Strategy Generation
 **Layer Selection Logic**:
 | Condition | Layer | Target |
 |-----------|-------|--------|
 | Has source file changes | L1: Unit Tests | 80% |
 | >= 3 source files OR critical issues | L2: Integration Tests | 60% |
 | >= 3 critical/high severity issues | L3: E2E Tests | 40% |
 | No changes but has scout issues | L1 focused on issue files | 80% |
 For CLI-assisted analysis, use:
 ```
 PURPOSE: Analyze code changes and scout findings to determine optimal test strategy
 TASK: Classify changed files by risk, map issues to test requirements, identify integration points, recommend test layers with coverage targets
 MODE: analysis
 ```
 Build strategy document with: scope analysis, layer configs (level, name, target_coverage, focus_files, rationale), priority issues list.
 **Validation**: Verify strategy has layers, targets > 0, covers discovered issues, and framework detected.
 ## Phase 4: Output & Persistence
 1. Write strategy to `<session>/strategy/test-strategy.md`
 2. Update `<session>/wisdom/.msg/meta.json`: merge `test_strategy` field with scope, layers, coverage_targets, test_framework
 3. Contribute to wisdom/decisions.md with layer selection rationale
--- a/.claude/skills/team-quality-assurance/roles/executor/role.md
+++ b/.claude/skills/team-quality-assurance/roles/executor/role.md
@@ -26,7 +26,8 @@ Run test suites, collect coverage data, and perform automatic fix cycles when te
 | Target layer | task description `layer: L1/L2/L3` | Yes |
 1. Extract session path and target layer from task description
-2. Read .msg/meta.json for strategy and generated test file list
+2. Load validation specs: Run `ccw spec load --category validation` for verification rules and acceptance criteria
 3. Read .msg/meta.json for strategy and generated test file list
 3. Detect test command by framework:
 | Framework | Command |
--- a/.claude/skills/team-review/role-specs/fixer.md
+++ b/.claude/skills/team-review/role-specs/fixer.md
@@ -1,75 +0,0 @@
 ---
 prefix: FIX
 inner_loop: true
 message_types:
  success: fix_complete
  error: fix_failed
 ---
 # Code Fixer
 Fix code based on reviewed findings. Load manifest, plan fix groups, apply with rollback-on-failure, verify. Code-generation role -- modifies source files.
 ## Phase 2: Context & Scope Resolution
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Fix manifest | <session>/fix/fix-manifest.json | Yes |
 | Review report | <session>/review/review-report.json | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path, input path from task description
 2. Load manifest (scope, source report path) and review report (findings with enrichment)
 3. Filter fixable findings: severity in scope AND fix_strategy !== 'skip'
 4. If 0 fixable -> report complete immediately
 5. Detect quick path: findings <= 5 AND no cross-file dependencies
 6. Detect verification tools: tsc (tsconfig.json), eslint (package.json), jest (package.json), pytest (pyproject.toml), semgrep (semgrep available)
 7. Load wisdom files from `<session>/wisdom/`
 ## Phase 3: Plan + Execute
 ### 3A: Plan Fixes (deterministic, no CLI)
 1. Group findings by primary file
 2. Merge groups with cross-file dependencies (union-find)
 3. Topological sort within each group (respect fix_dependencies, append cycles at end)
 4. Sort groups by max severity (critical first)
 5. Determine execution path: quick_path (<=5 findings, <=1 group) or standard
 6. Write `<session>/fix/fix-plan.json`: `{plan_id, quick_path, groups[{id, files[], findings[], max_severity}], execution_order[], total_findings, total_groups}`
 ### 3B: Execute Fixes
 **Quick path**: Single code-developer agent for all findings.
 **Standard path**: One code-developer agent per group, in execution_order.
 Agent prompt includes: finding list (dependency-sorted), file contents (truncated 8K), critical rules:
 1. Apply each fix using Edit tool in order
 2. After each fix, run related tests
 3. Tests PASS -> finding is "fixed"
 4. Tests FAIL -> `git checkout -- {file}` -> mark "failed" -> continue
 5. No retry on failure. Rollback and move on
 6. If finding depends on previously failed finding -> mark "skipped"
 Agent returns JSON: `{results:[{id, status: fixed|failed|skipped, file, error?}]}`
 Fallback: check git diff per file if no structured output.
 Write `<session>/fix/execution-results.json`: `{fixed[], failed[], skipped[]}`
 ## Phase 4: Post-Fix Verification
 1. Run available verification tools on modified files:
 | Tool | Command | Pass Criteria |
 |------|---------|---------------|
 | tsc | `npx tsc --noEmit` | 0 errors |
 | eslint | `npx eslint <files>` | 0 errors |
 | jest | `npx jest --passWithNoTests` | Tests pass |
 | pytest | `pytest --tb=short` | Tests pass |
 | semgrep | `semgrep --config auto <files> --json` | 0 results |
 2. If verification fails critically -> rollback last batch
 3. Write `<session>/fix/verify-results.json`
 4. Generate `<session>/fix/fix-summary.json`: `{fix_id, fix_date, scope, total, fixed, failed, skipped, fix_rate, verification}`
 5. Generate `<session>/fix/fix-summary.md` (human-readable)
 6. Update `<session>/.msg/meta.json` with fix results
 7. Contribute discoveries to `<session>/wisdom/` files
--- a/.claude/skills/team-review/role-specs/reviewer.md
+++ b/.claude/skills/team-review/role-specs/reviewer.md
@@ -1,66 +0,0 @@
 ---
 prefix: REV
 inner_loop: false
 message_types:
  success: review_complete
  error: error
 ---
 # Finding Reviewer
 Deep analysis on scan findings: triage, root cause / impact / optimization enrichment via CLI fan-out, cross-correlation, and structured review report generation. Read-only -- never modifies source code.
 ## Phase 2: Context & Triage
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Scan results | <session>/scan/scan-results.json | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path, input path, dimensions from task description
 2. Load scan results. If missing or empty -> report clean, complete immediately
 3. Load wisdom files from `<session>/wisdom/`
 4. Triage findings into two buckets:
 | Bucket | Criteria | Action |
 |--------|----------|--------|
 | deep_analysis | severity in [critical, high, medium], max 15, sorted critical-first | Enrich with root cause, impact, optimization |
 | pass_through | remaining (low, info, or overflow) | Include in report without enrichment |
 If deep_analysis empty -> skip Phase 3, go to Phase 4.
 ## Phase 3: Deep Analysis (CLI Fan-out)
 Split deep_analysis into two domain groups, run parallel CLI agents:
 | Group | Dimensions | Focus |
 |-------|-----------|-------|
 | A | Security + Correctness | Root cause tracing, fix dependencies, blast radius |
 | B | Performance + Maintainability | Optimization approaches, refactor tradeoffs |
 If either group empty -> skip that agent.
 Build prompt per group requesting 6 enrichment fields per finding:
 - `root_cause`: `{description, related_findings[], is_symptom}`
 - `impact`: `{scope: low/medium/high, affected_files[], blast_radius}`
 - `optimization`: `{approach, alternative, tradeoff}`
 - `fix_strategy`: minimal / refactor / skip
 - `fix_complexity`: low / medium / high
 - `fix_dependencies`: finding IDs that must be fixed first
 Execute via `ccw cli --tool gemini --mode analysis --rule analysis-diagnose-bug-root-cause` (fallback: qwen -> codex). Parse JSON array responses, merge with originals (CLI-enriched replace originals, unenriched get defaults). Write `<session>/review/enriched-findings.json`.
 ## Phase 4: Report Generation
 1. Combine enriched + pass_through findings
 2. Cross-correlate:
   - **Critical files**: file appears in >=2 dimensions -> list with finding_count, severities
   - **Root cause groups**: cluster findings sharing related_findings -> identify primary
   - **Optimization suggestions**: from root cause groups + standalone enriched findings
 3. Compute metrics: by_dimension, by_severity, dimension_severity_matrix, fixable_count, auto_fixable_count
 4. Write `<session>/review/review-report.json`: `{review_id, review_date, findings[], critical_files[], optimization_suggestions[], root_cause_groups[], summary}`
 5. Write `<session>/review/review-report.md`: Executive summary, metrics matrix (dimension x severity), critical/high findings table, critical files list, optimization suggestions, recommended fix scope
 6. Update `<session>/.msg/meta.json` with review summary
 7. Contribute discoveries to `<session>/wisdom/` files
--- a/.claude/skills/team-review/role-specs/scanner.md
+++ b/.claude/skills/team-review/role-specs/scanner.md
@@ -1,70 +0,0 @@
 ---
 prefix: SCAN
 inner_loop: false
 message_types:
  success: scan_complete
  error: error
 ---
 # Code Scanner
 Toolchain + LLM semantic scan producing structured findings. Static analysis tools in parallel, then LLM for issues tools miss. Read-only -- never modifies source code. 4-dimension system: security (SEC), correctness (COR), performance (PRF), maintainability (MNT).
 ## Phase 2: Context & Toolchain Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path, target, dimensions, quick flag from task description
 2. Resolve target files (glob pattern or directory -> `**/*.{ts,tsx,js,jsx,py,go,java,rs}`)
 3. If no source files found -> report empty, complete task cleanly
 4. Detect toolchain availability:
 | Tool | Detection | Dimension |
 |------|-----------|-----------|
 | tsc | `tsconfig.json` exists | COR |
 | eslint | `.eslintrc*` or `eslint` in package.json | COR/MNT |
 | semgrep | `.semgrep.yml` exists | SEC |
 | ruff | `pyproject.toml` + ruff available | SEC/COR/MNT |
 | mypy | mypy available + `pyproject.toml` | COR |
 | npmAudit | `package-lock.json` exists | SEC |
 5. Load wisdom files from `<session>/wisdom/` if they exist
 ## Phase 3: Scan Execution
 **Quick mode**: Single CLI call with analysis mode, max 20 findings, skip toolchain.
 **Standard mode** (sequential):
 ### 3A: Toolchain Scan
 Run detected tools in parallel via Bash backgrounding. Each tool writes to `<session>/scan/tmp/<tool>.{json|txt}`. After `wait`, parse each output into normalized findings:
 - tsc: `file(line,col): error TSxxxx: msg` -> dimension=correctness, source=tool:tsc
 - eslint: JSON array -> severity 2=correctness/high, else=maintainability/medium
 - semgrep: `{results[]}` -> dimension=security, severity from extra.severity
 - ruff: `[{code,message,filename}]` -> S*=security, F*/B*=correctness, else=maintainability
 - mypy: `file:line: error: msg [code]` -> dimension=correctness
 - npm audit: `{vulnerabilities:{}}` -> dimension=security, category=dependency
 Write `<session>/scan/toolchain-findings.json`.
 ### 3B: Semantic Scan (LLM via CLI)
 Build prompt with target file patterns, toolchain dedup summary, and per-dimension focus areas:
 - SEC: Business logic vulnerabilities, privilege escalation, sensitive data flow, auth bypass
 - COR: Logic errors, unhandled exception paths, state management bugs, race conditions
 - PRF: Algorithm complexity, N+1 queries, unnecessary sync, memory leaks, missing caching
 - MNT: Architectural coupling, abstraction leaks, convention violations, dead code
 Execute via `ccw cli --tool gemini --mode analysis --rule analysis-review-code-quality` (fallback: qwen -> codex). Parse JSON array response, validate required fields (dimension, title, location.file), enforce per-dimension limit (max 5 each), filter minimum severity (medium+). Write `<session>/scan/semantic-findings.json`.
 ## Phase 4: Aggregate & Output
 1. Merge toolchain + semantic findings, deduplicate (same file + line + dimension = duplicate)
 2. Assign dimension-prefixed IDs: SEC-001, COR-001, PRF-001, MNT-001
 3. Write `<session>/scan/scan-results.json` with schema: `{scan_date, target, dimensions, quick_mode, total_findings, by_severity, by_dimension, findings[]}`
 4. Each finding: `{id, dimension, category, severity, title, description, location:{file,line}, source, suggested_fix, effort, confidence}`
 5. Update `<session>/.msg/meta.json` with scan summary (findings_count, by_severity, by_dimension)
 6. Contribute discoveries to `<session>/wisdom/` files
--- a/.claude/skills/team-review/roles/reviewer/role.md
+++ b/.claude/skills/team-review/roles/reviewer/role.md
@@ -21,7 +21,8 @@ Deep analysis on scan findings: triage, root cause / impact / optimization enric
 | .msg/meta.json | <session>/.msg/meta.json | No |
 1. Extract session path, input path, dimensions from task description
-2. Load scan results. If missing or empty -> report clean, complete immediately
+2. Load review specs: Run `ccw spec load --category review` for review standards, checklists, and approval gates
 3. Load scan results. If missing or empty -> report clean, complete immediately
 3. Load wisdom files from `<session>/wisdom/`
 4. Triage findings into two buckets:
--- a/.claude/skills/team-roadmap-dev/role-specs/executor.md
+++ b/.claude/skills/team-roadmap-dev/role-specs/executor.md
@@ -1,71 +0,0 @@
 ---
 prefix: EXEC
 inner_loop: true
 cli_tools:
  - gemini --mode write
 message_types:
  success: exec_complete
  progress: exec_progress
  error: error
 ---
 # Executor
 Wave-based code implementation per phase. Reads IMPL-*.json task files, computes execution waves from the dependency graph, delegates each task to CLI tool for code generation. Produces summary-{IMPL-ID}.md per task.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task JSONs | <session>/phase-{N}/.task/IMPL-*.json | Yes |
 | Prior summaries | <session>/phase-{1..N-1}/summary-*.md | No |
 | Wisdom | <session>/wisdom/ | No |
 1. Glob `<session>/phase-{N}/.task/IMPL-*.json`, error if none found
 2. Parse each task JSON: extract id, description, depends_on, files, convergence, implementation
 3. Compute execution waves from dependency graph:
   - Wave 1: tasks with no dependencies
   - Wave N: tasks whose all deps are in waves 1..N-1
   - Force-assign if circular (break at lowest-numbered task)
 4. Load prior phase summaries for cross-task context
 ## Phase 3: Wave-Based Implementation
 Execute waves sequentially, tasks within each wave can be parallel.
 **Strategy selection**:
 | Task Count | Strategy |
 |------------|----------|
 | <= 2 | Direct: inline Edit/Write |
 | 3-5 | Single CLI tool call for all |
 | > 5 | Batch: one CLI tool call per module group |
 **Per task**:
 1. Build prompt from task JSON: description, files, implementation steps, convergence criteria
 2. Include prior summaries and wisdom as context
 3. Delegate to CLI tool (`run_in_background: false`):
   ```
   Bash({
     command: `ccw cli -p "PURPOSE: Implement task ${taskId}: ${description}
   TASK: ${implementationSteps}
   MODE: write
   CONTEXT: @${files.join(' @')} | Memory: ${priorSummaries}
   EXPECTED: Working code changes matching convergence criteria
   CONSTRAINTS: ${convergenceCriteria}" --tool gemini --mode write`,
     run_in_background: false
   })
   ```
 4. Write `<session>/phase-{N}/summary-{IMPL-ID}.md` with: task ID, affected files, changes made, status
 **Between waves**: report wave progress via team_msg (type: exec_progress)
 ## Phase 4: Self-Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Affected files exist | `test -f <path>` for each file in summary | All present |
 | TypeScript syntax | `npx tsc --noEmit` (if tsconfig.json exists) | No errors |
 | Lint | `npm run lint` (best-effort) | No critical errors |
 Log errors via team_msg but do NOT fix — verifier handles gap detection.
--- a/.claude/skills/team-roadmap-dev/role-specs/planner.md
+++ b/.claude/skills/team-roadmap-dev/role-specs/planner.md
@@ -1,77 +0,0 @@
 ---
 prefix: PLAN
 inner_loop: true
 cli_tools:
  - gemini --mode analysis
 message_types:
  success: plan_ready
  progress: plan_progress
  error: error
 ---
 # Planner
 Research and plan creation per roadmap phase. Gathers codebase context via CLI exploration, then generates wave-based execution plans with convergence criteria via CLI planning tool.
 ## Phase 2: Context Loading + Research
 | Input | Source | Required |
 |-------|--------|----------|
 | roadmap.md | <session>/roadmap.md | Yes |
 | config.json | <session>/config.json | Yes |
 | Prior summaries | <session>/phase-{1..N-1}/summary-*.md | No |
 | Wisdom | <session>/wisdom/ | No |
 1. Read roadmap.md, extract phase goal, requirements (REQ-IDs), success criteria
 2. Read config.json for depth setting (quick/standard/comprehensive)
 3. Load prior phase summaries for dependency context
 4. Detect gap closure mode (task description contains "Gap closure")
 5. Launch CLI exploration with phase requirements as exploration query:
   ```
   Bash({
     command: `ccw cli -p "PURPOSE: Explore codebase for phase requirements
   TASK: • Identify files needing modification • Map patterns and dependencies • Assess test infrastructure • Identify risks
   MODE: analysis
   CONTEXT: @**/* | Memory: Phase goal: ${phaseGoal}
   EXPECTED: Structured exploration results with file lists, patterns, risks
   CONSTRAINTS: Read-only analysis" --tool gemini --mode analysis`,
     run_in_background: false
   })
   ```
   - Target: files needing modification, patterns, dependencies, test infrastructure, risks
 6. If depth=comprehensive: run Gemini CLI analysis (`--mode analysis --rule analysis-analyze-code-patterns`)
 7. Write `<session>/phase-{N}/context.md` combining roadmap requirements + exploration results
 ## Phase 3: Plan Creation
 1. Load context.md from Phase 2
 2. Create output directory: `<session>/phase-{N}/.task/`
 3. Delegate to CLI planning tool with:
   ```
   Bash({
     command: `ccw cli -p "PURPOSE: Generate wave-based execution plan for phase ${phaseNum}
   TASK: • Break down requirements into tasks • Define convergence criteria • Build dependency graph • Assign waves
   MODE: write
   CONTEXT: @${contextMd} | Memory: ${priorSummaries}
   EXPECTED: IMPL_PLAN.md + IMPL-*.json files + TODO_LIST.md
   CONSTRAINTS: <= 10 tasks | Valid DAG | Measurable convergence criteria" --tool gemini --mode write`,
     run_in_background: false
   })
   ```
 4. CLI tool produces: `IMPL_PLAN.md`, `.task/IMPL-*.json`, `TODO_LIST.md`
 5. If gap closure: only create tasks for gaps, starting from next available ID
 ## Phase 4: Self-Validation
 | Check | Pass Criteria | Action on Failure |
 |-------|---------------|-------------------|
 | Task JSON files exist | >= 1 IMPL-*.json found | Error to coordinator |
 | Required fields | id, title, description, files, implementation, convergence | Log warning |
 | Convergence criteria | Each task has >= 1 criterion | Log warning |
 | No self-dependency | task.id not in task.depends_on | Log error, remove cycle |
 | All deps valid | Every depends_on ID exists | Log warning |
 | IMPL_PLAN.md exists | File present | Generate minimal version from task JSONs |
 After validation, compute wave structure from dependency graph for reporting:
 - Wave count = topological layers of DAG
 - Report: task count, wave count, file list
--- a/.claude/skills/team-roadmap-dev/role-specs/verifier.md
+++ b/.claude/skills/team-roadmap-dev/role-specs/verifier.md
@@ -1,73 +0,0 @@
 ---
 prefix: VERIFY
 inner_loop: true
 cli_tools:
  - gemini --mode analysis
 message_types:
  success: verify_passed
  failure: gaps_found
  error: error
 ---
 # Verifier
 Goal-backward verification per phase. Reads convergence criteria from IMPL-*.json task files and checks against actual codebase state. Read-only — never modifies code. Produces verification.md with pass/fail and structured gap lists.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task JSONs | <session>/phase-{N}/.task/IMPL-*.json | Yes |
 | Summaries | <session>/phase-{N}/summary-*.md | Yes |
 | Wisdom | <session>/wisdom/ | No |
 1. Glob IMPL-*.json files, extract convergence criteria from each task
 2. Glob summary-*.md files, parse frontmatter (task, affects, provides)
 3. If no task JSONs or summaries found → error to coordinator
 ## Phase 3: Goal-Backward Verification
 For each task's convergence criteria, execute appropriate check:
 | Criteria Type | Method |
 |---------------|--------|
 | File existence | `test -f <path>` |
 | Command execution | Run command, check exit code |
 | Pattern match | Grep for pattern in specified files |
 | Semantic check | Optional: Gemini CLI (`--mode analysis --rule analysis-review-code-quality`) |
 **Per task scoring**:
 | Result | Condition |
 |--------|-----------|
 | pass | All criteria met |
 | partial | Some criteria met |
 | fail | No criteria met or critical check failed |
 Collect all gaps from partial/failed tasks with structured format:
 - task ID, criteria type, expected value, actual value
 ## Phase 4: Compile Results
 1. Aggregate per-task results: count passed, partial, failed
 2. Determine overall status:
   - `passed` if gaps.length === 0
   - `gaps_found` otherwise
 3. Write `<session>/phase-{N}/verification.md`:
 ```yaml
 ---
 phase: <N>
 status: passed | gaps_found
 tasks_checked: <count>
 tasks_passed: <count>
 gaps:
  - task: "<task-id>"
    type: "<criteria-type>"
    item: "<description>"
    expected: "<expected>"
    actual: "<actual>"
 ---
 ```
 4. Update .msg/meta.json with verification summary
--- a/.claude/skills/team-tech-debt/role-specs/assessor.md
+++ b/.claude/skills/team-tech-debt/role-specs/assessor.md
@@ -1,70 +0,0 @@
 ---
 prefix: TDEVAL
 inner_loop: false
 message_types:
  success: assessment_complete
  error: error
 ---
 # Tech Debt Assessor
 Quantitative evaluator for tech debt items. Score each debt item on business impact (1-5) and fix cost (1-5), classify into priority quadrants, produce priority-matrix.json.
 ## Phase 2: Load Debt Inventory
 | Input | Source | Required |
 |-------|--------|----------|
 | Session path | task description (regex: `session:\s*(.+)`) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Debt inventory | meta.json:debt_inventory OR <session>/scan/debt-inventory.json | Yes |
 1. Extract session path from task description
 2. Read .msg/meta.json for team context
 3. Load debt_inventory from shared memory or fallback to debt-inventory.json file
 4. If debt_inventory is empty -> report empty assessment and exit
 ## Phase 3: Evaluate Each Item
 **Strategy selection**:
 | Item Count | Strategy |
 |------------|----------|
 | <= 10 | Heuristic: severity-based impact + effort-based cost |
 | 11-50 | CLI batch: single gemini analysis call |
 | > 50 | CLI chunked: batches of 25 items |
 **Impact Score Mapping** (heuristic):
 | Severity | Impact Score |
 |----------|-------------|
 | critical | 5 |
 | high | 4 |
 | medium | 3 |
 | low | 1 |
 **Cost Score Mapping** (heuristic):
 | Estimated Effort | Cost Score |
 |------------------|------------|
 | small | 1 |
 | medium | 3 |
 | large | 5 |
 | unknown | 3 |
 **Priority Quadrant Classification**:
 | Impact | Cost | Quadrant |
 |--------|------|----------|
 | >= 4 | <= 2 | quick-win |
 | >= 4 | >= 3 | strategic |
 | <= 3 | <= 2 | backlog |
 | <= 3 | >= 3 | defer |
 For CLI mode, prompt gemini with full debt summary requesting JSON array of `{id, impact_score, cost_score, risk_if_unfixed, priority_quadrant}`. Unevaluated items fall back to heuristic scoring.
 ## Phase 4: Generate Priority Matrix
 1. Build matrix structure: evaluation_date, total_items, by_quadrant (grouped), summary (counts per quadrant)
 2. Sort within each quadrant by impact_score descending
 3. Write `<session>/assessment/priority-matrix.json`
 4. Update .msg/meta.json with `priority_matrix` summary and evaluated `debt_inventory`
--- a/.claude/skills/team-tech-debt/role-specs/executor.md
+++ b/.claude/skills/team-tech-debt/role-specs/executor.md
@@ -1,80 +0,0 @@
 ---
 prefix: TDFIX
 inner_loop: true
 cli_tools:
  - gemini --mode write
 message_types:
  success: fix_complete
  progress: fix_progress
  error: error
 ---
 # Tech Debt Executor
 Debt cleanup executor. Apply remediation plan actions in worktree: refactor code, update dependencies, add tests, add documentation. Batch-delegate to CLI tools, self-validate after each batch.
 ## Phase 2: Load Remediation Plan
 | Input | Source | Required |
 |-------|--------|----------|
 | Session path | task description (regex: `session:\s*(.+)`) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Remediation plan | <session>/plan/remediation-plan.json | Yes |
 | Worktree info | meta.json:worktree.path, worktree.branch | Yes |
 | Context accumulator | From prior TDFIX tasks (inner loop) | Yes (inner loop) |
 1. Extract session path from task description
 2. Read .msg/meta.json for worktree path and branch
 3. Read remediation-plan.json, extract all actions from plan phases
 4. Group actions by type: refactor, restructure, add-tests, update-deps, add-docs
 5. Split large groups (> 10 items) into sub-batches of 10
 6. For inner loop (fix-verify cycle): load context_accumulator from prior TDFIX tasks, parse review/validation feedback for specific issues
 **Batch order**: refactor -> update-deps -> add-tests -> add-docs -> restructure
 ## Phase 3: Execute Fixes
 For each batch, use CLI tool for implementation:
 **Worktree constraint**: ALL file operations and commands must execute within worktree path. Use `cd "<worktree-path>" && ...` prefix for all Bash commands.
 **Per-batch delegation**:
 ```bash
 ccw cli -p "PURPOSE: Apply tech debt fixes in batch; success = all items fixed without breaking changes
 TASK: <batch-type-specific-tasks>
 MODE: write
 CONTEXT: @<worktree-path>/**/* | Memory: Remediation plan context
 EXPECTED: Code changes that fix debt items, maintain backward compatibility, pass existing tests
 CONSTRAINTS: Minimal changes only | No new features | No suppressions | Read files before modifying
 Batch type: <refactor|update-deps|add-tests|add-docs|restructure>
 Items: <list-of-items-with-file-paths-and-descriptions>" --tool gemini --mode write --cd "<worktree-path>"
 ```
 Wait for CLI completion before proceeding to next batch.
 **Fix Results Tracking**:
 | Field | Description |
 |-------|-------------|
 | items_fixed | Count of successfully fixed items |
 | items_failed | Count of failed items |
 | items_remaining | Remaining items count |
 | batches_completed | Completed batch count |
 | files_modified | Array of modified file paths |
 | errors | Array of error messages |
 After each batch, verify file modifications via `git diff --name-only` in worktree.
 ## Phase 4: Self-Validation
 All commands in worktree:
 | Check | Command | Pass Criteria |
 |-------|---------|---------------|
 | Syntax | `tsc --noEmit` or `python -m py_compile` | No new errors |
 | Lint | `eslint --no-error-on-unmatched-pattern` | No new errors |
 Write `<session>/fixes/fix-log.json` with fix results. Update .msg/meta.json with `fix_results`.
 Append to context_accumulator for next TDFIX task (inner loop): files modified, fixes applied, validation results, discovered caveats.
--- a/.claude/skills/team-tech-debt/role-specs/planner.md
+++ b/.claude/skills/team-tech-debt/role-specs/planner.md
@@ -1,71 +0,0 @@
 ---
 prefix: TDPLAN
 inner_loop: false
 message_types:
  success: plan_ready
  revision: plan_revision
  error: error
 ---
 # Tech Debt Planner
 Remediation plan designer. Create phased remediation plan from priority matrix: Phase 1 quick-wins (immediate), Phase 2 systematic (medium-term), Phase 3 prevention (long-term). Produce remediation-plan.md.
 ## Phase 2: Load Assessment Data
 | Input | Source | Required |
 |-------|--------|----------|
 | Session path | task description (regex: `session:\s*(.+)`) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Priority matrix | <session>/assessment/priority-matrix.json | Yes |
 1. Extract session path from task description
 2. Read .msg/meta.json for debt_inventory
 3. Read priority-matrix.json for quadrant groupings
 4. Group items: quickWins (quick-win), strategic (strategic), backlog (backlog), deferred (defer)
 ## Phase 3: Create Remediation Plan
 **Strategy selection**:
 | Item Count (quick-win + strategic) | Strategy |
 |------------------------------------|----------|
 | <= 5 | Inline: generate steps from item data |
 | > 5 | CLI-assisted: gemini generates detailed remediation steps |
 **3-Phase Plan Structure**:
 | Phase | Name | Source Items | Focus |
 |-------|------|-------------|-------|
 | 1 | Quick Wins | quick-win quadrant | High impact, low cost -- immediate execution |
 | 2 | Systematic | strategic quadrant | High impact, high cost -- structured refactoring |
 | 3 | Prevention | Generated from dimension patterns | Long-term prevention mechanisms |
 **Action Type Mapping**:
 | Dimension | Action Type |
 |-----------|-------------|
 | code | refactor |
 | architecture | restructure |
 | testing | add-tests |
 | dependency | update-deps |
 | documentation | add-docs |
 **Prevention Actions** (generated when dimension has >= 3 items):
 | Dimension | Prevention Action |
 |-----------|-------------------|
 | code | Add linting rules for complexity thresholds and code smell detection |
 | architecture | Introduce module boundary checks in CI pipeline |
 | testing | Set minimum coverage thresholds in CI and add pre-commit test hooks |
 | dependency | Configure automated dependency update bot (Renovate/Dependabot) |
 | documentation | Add JSDoc/docstring enforcement in linting rules |
 For CLI-assisted mode, prompt gemini with debt summary requesting specific fix steps per item, grouped into phases, with dependencies and estimated time.
 ## Phase 4: Validate & Save
 1. Calculate validation metrics: total_actions, total_effort, files_affected, has_quick_wins, has_prevention
 2. Write `<session>/plan/remediation-plan.md` (markdown with per-item checklists)
 3. Write `<session>/plan/remediation-plan.json` (machine-readable)
 4. Update .msg/meta.json with `remediation_plan` summary
--- a/.claude/skills/team-tech-debt/role-specs/scanner.md
+++ b/.claude/skills/team-tech-debt/role-specs/scanner.md
@@ -1,85 +0,0 @@
 ---
 prefix: TDSCAN
 inner_loop: false
 cli_tools:
  - gemini --mode analysis
 message_types:
  success: scan_complete
  error: error
  info: debt_items_found
 ---
 # Tech Debt Scanner
 Multi-dimension tech debt scanner. Scan codebase across 5 dimensions (code, architecture, testing, dependency, documentation), produce structured debt inventory with severity rankings.
 ## Phase 2: Context & Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Scan scope | task description (regex: `scope:\s*(.+)`) | No (default: `**/*`) |
 | Session path | task description (regex: `session:\s*(.+)`) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 1. Extract session path and scan scope from task description
 2. Read .msg/meta.json for team context
 3. Detect project type and framework:
 | Signal File | Project Type |
 |-------------|-------------|
 | package.json + React/Vue/Angular | Frontend Node |
 | package.json + Express/Fastify/NestJS | Backend Node |
 | pyproject.toml / requirements.txt | Python |
 | go.mod | Go |
 | No detection | Generic |
 4. Determine scan dimensions (default: code, architecture, testing, dependency, documentation)
 5. Detect perspectives from task description:
 | Condition | Perspective |
 |-----------|-------------|
 | `security\|auth\|inject\|xss` | security |
 | `performance\|speed\|optimize` | performance |
 | `quality\|clean\|maintain\|debt` | code-quality |
 | `architect\|pattern\|structure` | architecture |
 | Default | code-quality + architecture |
 6. Assess complexity:
 | Score | Complexity | Strategy |
 |-------|------------|----------|
 | >= 4 | High | Triple Fan-out: CLI explore + CLI 5 dimensions + multi-perspective Gemini |
 | 2-3 | Medium | Dual Fan-out: CLI explore + CLI 3 dimensions |
 | 0-1 | Low | Inline: ACE search + Grep |
 ## Phase 3: Multi-Dimension Scan
 **Low Complexity** (inline):
 - Use `mcp__ace-tool__search_context` for code smells, TODO/FIXME, deprecated APIs, complex functions, dead code, missing tests
 - Classify findings into dimensions
 **Medium/High Complexity** (Fan-out):
 - Fan-out A: CLI exploration (structure, patterns, dependencies angles) via `ccw cli --tool gemini --mode analysis`
 - Fan-out B: CLI dimension analysis (parallel gemini per dimension -- code, architecture, testing, dependency, documentation)
 - Fan-out C (High only): Multi-perspective Gemini analysis (security, performance, code-quality, architecture)
 - Fan-in: Merge results, cross-deduplicate by file:line, boost severity for multi-source findings
 **Standardize each finding**:
 | Field | Description |
 |-------|-------------|
 | `id` | `TD-NNN` (sequential) |
 | `dimension` | code, architecture, testing, dependency, documentation |
 | `severity` | critical, high, medium, low |
 | `file` | File path |
 | `line` | Line number |
 | `description` | Issue description |
 | `suggestion` | Fix suggestion |
 | `estimated_effort` | small, medium, large, unknown |
 ## Phase 4: Aggregate & Save
 1. Deduplicate findings across Fan-out layers (file:line key), merge cross-references
 2. Sort by severity (cross-referenced items boosted)
 3. Write `<session>/scan/debt-inventory.json` with scan_date, dimensions, total_items, by_dimension, by_severity, items
 4. Update .msg/meta.json with `debt_inventory` array and `debt_score_before` count
--- a/.claude/skills/team-tech-debt/role-specs/validator.md
+++ b/.claude/skills/team-tech-debt/role-specs/validator.md
@@ -1,83 +0,0 @@
 ---
 prefix: TDVAL
 inner_loop: false
 message_types:
  success: validation_complete
  error: error
  fix: regression_found
 ---
 # Tech Debt Validator
 Cleanup result validator. Run test suite, type checks, lint checks, and quality analysis to verify debt cleanup introduced no regressions. Compare before/after debt scores, produce validation-report.json.
 ## Phase 2: Load Context
 | Input | Source | Required |
 |-------|--------|----------|
 | Session path | task description (regex: `session:\s*(.+)`) | Yes |
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 | Fix log | <session>/fixes/fix-log.json | No |
 1. Extract session path from task description
 2. Read .msg/meta.json for: worktree.path, debt_inventory, fix_results, debt_score_before
 3. Determine command prefix: `cd "<worktree-path>" && ` if worktree exists
 4. Read fix-log.json for modified files list
 5. Detect available validation tools in worktree:
 | Signal | Tool | Method |
 |--------|------|--------|
 | package.json + npm | npm test | Test suite |
 | pytest available | python -m pytest | Test suite |
 | npx tsc available | npx tsc --noEmit | Type check |
 | npx eslint available | npx eslint | Lint check |
 ## Phase 3: Run Validation Checks
 Execute 4-layer validation (all commands in worktree):
 **1. Test Suite**:
 - Run `npm test` or `python -m pytest` in worktree
 - PASS if no FAIL/error/failed keywords; FAIL with regression count otherwise
 - Skip with "no-tests" if no test runner available
 **2. Type Check**:
 - Run `npx tsc --noEmit` in worktree
 - Count `error TS` occurrences for error count
 **3. Lint Check**:
 - Run `npx eslint --no-error-on-unmatched-pattern <modified-files>` in worktree
 - Count error occurrences
 **4. Quality Analysis** (optional, when > 5 modified files):
 - Use gemini CLI to compare code quality before/after
 - Assess complexity, duplication, naming quality improvements
 **Debt Score Calculation**:
 - debt_score_after = debt items NOT in modified files (remaining unfixed items)
 - improvement_percentage = ((before - after) / before) * 100
 **Auto-fix attempt** (when total_regressions <= 3):
 - Use CLI tool to fix regressions in worktree:
  ```
  Bash({
    command: `cd "${worktreePath}" && ccw cli -p "PURPOSE: Fix regressions found in validation
  TASK: ${regressionDetails}
  MODE: write
  CONTEXT: @${modifiedFiles.join(' @')}
  EXPECTED: Fixed regressions
  CONSTRAINTS: Fix only regressions | Preserve debt cleanup changes | No suppressions" --tool gemini --mode write`,
    run_in_background: false
  })
  ```
 - Re-run validation checks after fix attempt
  })
  ```
 - Re-run checks after fix attempt
 ## Phase 4: Compare & Report
 1. Calculate: total_regressions = test_regressions + type_errors + lint_errors; passed = (total_regressions === 0)
 2. Write `<session>/validation/validation-report.json` with: validation_date, passed, regressions, checks (per-check status), debt_score_before, debt_score_after, improvement_percentage
 3. Update .msg/meta.json with `validation_results` and `debt_score_after`
 4. Select message type: `validation_complete` if passed, `regression_found` if not
--- a/.claude/skills/team-tech-debt/roles/scanner/role.md
+++ b/.claude/skills/team-tech-debt/roles/scanner/role.md
@@ -18,7 +18,8 @@ Multi-dimension tech debt scanner. Scan codebase across 5 dimensions (code, arch
 | .msg/meta.json | <session>/.msg/meta.json | Yes |
 1. Extract session path and scan scope from task description
-2. Read .msg/meta.json for team context
+2. Load debug specs: Run `ccw spec load --category debug` for known issues, workarounds, and root-cause notes
 3. Read .msg/meta.json for team context
 3. Detect project type and framework:
 | Signal File | Project Type |
--- a/.claude/skills/team-testing/role-specs/analyst.md
+++ b/.claude/skills/team-testing/role-specs/analyst.md
@@ -1,94 +0,0 @@
 ---
 prefix: TESTANA
 inner_loop: false
 message_types:
  success: analysis_ready
  error: error
 ---
 # Test Quality Analyst
 Analyze defect patterns, identify coverage gaps, assess GC loop effectiveness, and generate a quality report with actionable recommendations.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Execution results | <session>/results/run-*.json | Yes |
 | Test strategy | <session>/strategy/test-strategy.md | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. Read .msg/meta.json for execution context (executor, generator namespaces)
 3. Read all execution results:
 ```
 Glob("<session>/results/run-*.json")
 Read("<session>/results/run-001.json")
 ```
 4. Read test strategy:
 ```
 Read("<session>/strategy/test-strategy.md")
 ```
 5. Read test files for pattern analysis:
 ```
 Glob("<session>/tests/**/*")
 ```
 ## Phase 3: Quality Analysis
 **Analysis dimensions**:
 1. **Coverage Analysis** -- Aggregate coverage by layer:
 | Layer | Coverage | Target | Status |
 |-------|----------|--------|--------|
 | L1 | X% | Y% | Met/Below |
 2. **Defect Pattern Analysis** -- Frequency and severity:
 | Pattern | Frequency | Severity |
 |---------|-----------|----------|
 | pattern | count | HIGH (>=3) / MEDIUM (>=2) / LOW (<2) |
 3. **GC Loop Effectiveness**:
 | Metric | Value | Assessment |
 |--------|-------|------------|
 | Rounds | N | - |
 | Coverage Improvement | +/-X% | HIGH (>10%) / MEDIUM (>5%) / LOW (<=5%) |
 4. **Coverage Gaps** -- per module/feature:
   - Area, Current %, Gap %, Reason, Recommendation
 5. **Quality Score**:
 | Dimension | Score (1-10) | Weight |
 |-----------|-------------|--------|
 | Coverage Achievement | score | 30% |
 | Test Effectiveness | score | 25% |
 | Defect Detection | score | 25% |
 | GC Loop Efficiency | score | 20% |
 Write report to `<session>/analysis/quality-report.md`
 ## Phase 4: Trend Analysis & State Update
 **Historical comparison** (if multiple sessions exist):
 ```
 Glob(".workflow/.team/TST-*/.msg/meta.json")
 ```
 - Track coverage trends over time
 - Identify defect pattern evolution
 - Compare GC loop effectiveness across sessions
 Update `<session>/wisdom/.msg/meta.json` under `analyst` namespace:
 - Merge `{ "analyst": { quality_score, coverage_gaps, top_defect_patterns, gc_effectiveness, recommendations } }`
--- a/.claude/skills/team-testing/role-specs/executor.md
+++ b/.claude/skills/team-testing/role-specs/executor.md
@@ -1,97 +0,0 @@
 ---
 prefix: TESTRUN
 inner_loop: true
 message_types:
  success: tests_passed
  failure: tests_failed
  coverage: coverage_report
  error: error
 ---
 # Test Executor
 Execute tests, collect coverage, attempt auto-fix for failures. Acts as the Critic in the Generator-Critic loop. Reports pass rate and coverage for coordinator GC decisions.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Test directory | Task description (Input: <path>) | Yes |
 | Coverage target | Task description (default: 80%) | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and test directory from task description
 2. Extract coverage target (default: 80%)
 3. Read .msg/meta.json for framework info (from strategist namespace)
 4. Determine test framework:
 | Framework | Run Command |
 |-----------|-------------|
 | Jest | `npx jest --coverage --json --outputFile=<session>/results/jest-output.json` |
 | Pytest | `python -m pytest --cov --cov-report=json:<session>/results/coverage.json -v` |
 | Vitest | `npx vitest run --coverage --reporter=json` |
 5. Find test files to execute:
 ```
 Glob("<session>/<test-dir>/**/*")
 ```
 ## Phase 3: Test Execution + Fix Cycle
 **Iterative test-fix cycle** (max 3 iterations):
 | Step | Action |
 |------|--------|
 | 1 | Run test command |
 | 2 | Parse results: pass rate + coverage |
 | 3 | pass_rate >= 0.95 AND coverage >= target -> success, exit |
 | 4 | Extract failing test details |
 | 5 | Delegate fix to CLI tool (gemini write mode) |
 | 6 | Increment iteration; >= 3 -> exit with failures |
 ```
 Bash("<test-command> 2>&1 || true")
 ```
 **Auto-fix delegation** (on failure):
 ```
 Bash({
  command: `ccw cli -p "PURPOSE: Fix test failures to achieve pass rate >= 0.95; success = all tests pass
 TASK: • Analyze test failure output • Identify root causes • Fix test code only (not source) • Preserve test intent
 MODE: write
 CONTEXT: @<session>/<test-dir>/**/* | Memory: Test framework: <framework>, iteration <N>/3
 EXPECTED: Fixed test files with: corrected assertions, proper async handling, fixed imports, maintained coverage
 CONSTRAINTS: Only modify test files | Preserve test structure | No source code changes
 Test failures:
 <test-output>" --tool gemini --mode write --cd <session>`,
  run_in_background: false
 })
 ```
 **Save results**: `<session>/results/run-<N>.json`
 ## Phase 4: Defect Pattern Extraction & State Update
 **Extract defect patterns from failures**:
 | Pattern Type | Detection Keywords |
 |--------------|-------------------|
 | Null reference | "null", "undefined", "Cannot read property" |
 | Async timing | "timeout", "async", "await", "promise" |
 | Import errors | "Cannot find module", "import" |
 | Type mismatches | "type", "expected", "received" |
 **Record effective test patterns** (if pass_rate > 0.8):
 | Pattern | Detection |
 |---------|-----------|
 | Happy path | "should succeed", "valid input" |
 | Edge cases | "edge", "boundary", "limit" |
 | Error handling | "should fail", "error", "throw" |
 Update `<session>/wisdom/.msg/meta.json` under `executor` namespace:
 - Merge `{ "executor": { pass_rate, coverage, defect_patterns, effective_patterns, coverage_history_entry } }`
--- a/.claude/skills/team-testing/role-specs/generator.md
+++ b/.claude/skills/team-testing/role-specs/generator.md
@@ -1,96 +0,0 @@
 ---
 prefix: TESTGEN
 inner_loop: true
 message_types:
  success: tests_generated
  revision: tests_revised
  error: error
 ---
 # Test Generator
 Generate test code by layer (L1 unit / L2 integration / L3 E2E). Acts as the Generator in the Generator-Critic loop. Supports revision mode for GC loop iterations.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | Test strategy | <session>/strategy/test-strategy.md | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and layer from task description
 2. Read test strategy:
 ```
 Read("<session>/strategy/test-strategy.md")
 ```
 3. Read source files to test (from strategy priority_files, limit 20)
 4. Read .msg/meta.json for framework and scope context
 5. Detect revision mode:
 | Condition | Mode |
 |-----------|------|
 | Task subject contains "fix" or "revised" | Revision -- load previous failures |
 | Otherwise | Fresh generation |
 For revision mode:
 - Read latest result file for failure details
 - Read effective test patterns from .msg/meta.json
 6. Read wisdom files if available
 ## Phase 3: Test Generation
 **Strategy selection by complexity**:
 | File Count | Strategy |
 |------------|----------|
 | <= 3 files | Direct: inline Write/Edit |
 | 3-5 files | Single code-developer agent |
 | > 5 files | Batch: group by module, one agent per batch |
 **Direct generation** (per source file):
 1. Generate test path: `<session>/tests/<layer>/<test-file>`
 2. Generate test code: happy path, edge cases, error handling
 3. Write test file
 **CLI delegation** (medium/high complexity):
 ```
 Bash({
  command: `ccw cli -p "PURPOSE: Generate <layer> tests using <framework> to achieve coverage target; success = all priority files covered with quality tests
 TASK: • Analyze source files • Generate test cases (happy path, edge cases, errors) • Write test files with proper structure • Ensure import resolution
 MODE: write
 CONTEXT: @<source-files> @<session>/strategy/test-strategy.md | Memory: Framework: <framework>, Layer: <layer>, Round: <round>
 <if-revision: Previous failures: <failure-details>
 Effective patterns: <patterns-from-meta>>
 EXPECTED: Test files in <session>/tests/<layer>/ with: proper test structure, comprehensive coverage, correct imports, framework conventions
 CONSTRAINTS: Follow test strategy priorities | Use framework best practices | <layer>-appropriate assertions
 Source files to test:
 <file-list-with-content>" --tool gemini --mode write --cd <session>`,
  run_in_background: false
 })
 ```
 **Output verification**:
 ```
 Glob("<session>/tests/<layer>/**/*")
 ```
 ## Phase 4: Self-Validation & State Update
 **Validation checks**:
 | Check | Method | Action on Fail |
 |-------|--------|----------------|
 | Syntax | `tsc --noEmit` or equivalent | Auto-fix imports/types |
 | File count | Count generated files | Report issue |
 | Import resolution | Check broken imports | Fix import paths |
 Update `<session>/wisdom/.msg/meta.json` under `generator` namespace:
 - Merge `{ "generator": { test_files, layer, round, is_revision } }`
--- a/.claude/skills/team-testing/role-specs/strategist.md
+++ b/.claude/skills/team-testing/role-specs/strategist.md
@@ -1,82 +0,0 @@
 ---
 prefix: STRATEGY
 inner_loop: false
 message_types:
  success: strategy_ready
  error: error
 ---
 # Test Strategist
 Analyze git diff, determine test layers, define coverage targets, and formulate test strategy with prioritized execution order.
 ## Phase 2: Context & Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and scope from task description
 2. Get git diff for change analysis:
 ```
 Bash("git diff HEAD~1 --name-only 2>/dev/null || git diff --cached --name-only")
 Bash("git diff HEAD~1 -- <changed-files> 2>/dev/null || git diff --cached -- <changed-files>")
 ```
 3. Detect test framework from project files:
 | Signal File | Framework | Test Pattern |
 |-------------|-----------|-------------|
 | jest.config.js/ts | Jest | `**/*.test.{ts,tsx,js}` |
 | vitest.config.ts/js | Vitest | `**/*.test.{ts,tsx}` |
 | pytest.ini / pyproject.toml | Pytest | `**/test_*.py` |
 | No detection | Default | Jest patterns |
 4. Scan existing test patterns:
 ```
 Glob("**/*.test.*")
 Glob("**/*.spec.*")
 ```
 5. Read .msg/meta.json if exists for session context
 ## Phase 3: Strategy Formulation
 **Change analysis dimensions**:
 | Change Type | Analysis | Priority |
 |-------------|----------|----------|
 | New files | Need new tests | High |
 | Modified functions | Need updated tests | Medium |
 | Deleted files | Need test cleanup | Low |
 | Config changes | May need integration tests | Variable |
 **Strategy output structure**:
 1. **Change Analysis Table**: File, Change Type, Impact, Priority
 2. **Test Layer Recommendations**:
   - L1 Unit: Scope, Coverage Target, Priority Files, Patterns
   - L2 Integration: Scope, Coverage Target, Integration Points
   - L3 E2E: Scope, Coverage Target, User Scenarios
 3. **Risk Assessment**: Risk, Probability, Impact, Mitigation
 4. **Test Execution Order**: Prioritized sequence
 Write strategy to `<session>/strategy/test-strategy.md`
 **Self-validation**:
 | Check | Criteria | Fallback |
 |-------|----------|----------|
 | Has L1 scope | L1 scope not empty | Default to all changed files |
 | Has coverage targets | L1 target > 0 | Use defaults (80/60/40) |
 | Has priority files | List not empty | Use all changed files |
 ## Phase 4: Wisdom & State Update
 1. Write discoveries to `<session>/wisdom/conventions.md` (detected framework, patterns)
 2. Update `<session>/wisdom/.msg/meta.json` under `strategist` namespace:
   - Read existing -> merge `{ "strategist": { framework, layers, coverage_targets, priority_files, risks } }` -> write back
--- a/.claude/skills/team-testing/roles/executor/role.md
+++ b/.claude/skills/team-testing/roles/executor/role.md
@@ -24,7 +24,8 @@ Execute tests, collect coverage, attempt auto-fix for failures. Acts as the Crit
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and test directory from task description
-2. Extract coverage target (default: 80%)
+2. Load test specs: Run `ccw spec load --category test` for test framework conventions and coverage targets
 3. Extract coverage target (default: 80%)
 3. Read .msg/meta.json for framework info (from strategist namespace)
 4. Determine test framework:
--- a/.claude/skills/team-testing/roles/generator/role.md
+++ b/.claude/skills/team-testing/roles/generator/role.md
@@ -22,7 +22,8 @@ Generate test code by layer (L1 unit / L2 integration / L3 E2E). Acts as the Gen
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and layer from task description
-2. Read test strategy:
+2. Load test specs: Run `ccw spec load --category test` for test framework conventions and coverage targets
 3. Read test strategy:
 ```
 Read("<session>/strategy/test-strategy.md")
--- a/.claude/skills/team-uidesign/role-specs/designer.md
+++ b/.claude/skills/team-uidesign/role-specs/designer.md
@@ -1,72 +0,0 @@
 ---
 prefix: DESIGN
 inner_loop: false
 message_types:
  success: design_ready
  revision: design_revision
  progress: design_progress
  error: error
 ---
 # Design Token & Component Spec Author
 Define visual language through design tokens (W3C Design Tokens Format) and component specifications. Consume design intelligence from researcher. Act as Generator in the designer<->reviewer Generator-Critic loop.
 ## Phase 2: Context & Artifact Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Research artifacts | <session>/research/*.json | Yes |
 | Design intelligence | <session>/research/design-intelligence.json | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 | Audit feedback | <session>/audit/audit-*.md | Only for GC fix tasks |
 1. Extract session path from task description
 2. Read research findings: design-system-analysis.json, component-inventory.json, accessibility-audit.json
 3. Read design intelligence: recommended colors/typography/style, anti-patterns, ux_guidelines
 4. Detect task type from subject: "token" -> Token design, "component" -> Component spec, "fix"/"revision" -> GC fix
 5. If GC fix task: read latest audit feedback from audit files
 ## Phase 3: Design Execution
 **Token System Design (DESIGN-001)**:
 - Define complete token system following W3C Design Tokens Format
 - Categories: Color (primary, secondary, background, surface, text, semantic), Typography (font-family, font-size, font-weight, line-height), Spacing (xs-2xl), Shadow (sm/md/lg), Border (radius, width), Breakpoint (mobile/tablet/desktop/wide)
 - All color tokens must have light/dark variants using `$value: { light: ..., dark: ... }`
 - Integrate design intelligence: recommended.colors -> color tokens, recommended.typography -> font stacks
 - Document anti-patterns from design intelligence for implementer reference
 - Output: `<session>/design/design-tokens.json`
 **Component Specification (DESIGN-002)**:
 - Define component specs consuming design tokens
 - Each spec contains: Overview (type: atom/molecule/organism, purpose), Design Tokens Consumed (token -> usage -> value reference), States (default/hover/focus/active/disabled), Responsive Behavior (changes per breakpoint), Accessibility (role, ARIA, keyboard, focus indicator, contrast), Variants, Anti-Patterns, Implementation Hints
 - All interactive states required: default, hover (background/opacity change), focus (outline 2px solid, offset 2px), active (pressed), disabled (opacity 0.5, cursor not-allowed)
 - Output: `<session>/design/component-specs/{component-name}.md`
 **GC Fix Mode (DESIGN-fix-N)**:
 - Parse audit feedback for specific issues
 - Re-read affected design artifacts; apply fixes (token value adjustments, missing states, accessibility gaps, naming fixes)
 - Re-write affected files; signal `design_revision` instead of `design_ready`
 ## Phase 4: Self-Validation & Output
 1. Token integrity checks:
 | Check | Pass Criteria |
 |-------|---------------|
 | tokens_valid | All $value fields non-empty |
 | theme_complete | Light/dark values for all color tokens |
 | values_parseable | Valid CSS-parseable values |
 | no_duplicates | No duplicate token definitions |
 2. Component spec checks:
 | Check | Pass Criteria |
 |-------|---------------|
 | states_complete | All 5 states (default/hover/focus/active/disabled) defined |
 | a11y_specified | Role, ARIA, keyboard behavior defined |
 | responsive_defined | At least mobile/desktop breakpoints |
 | token_refs_valid | All `{token.path}` references resolve to defined tokens |
 3. Update `<session>/wisdom/.msg/meta.json` under `designer` namespace:
   - Read existing -> merge `{ "designer": { task_type, token_categories, component_count, style_decisions } }` -> write back
--- a/.claude/skills/team-uidesign/role-specs/implementer.md
+++ b/.claude/skills/team-uidesign/role-specs/implementer.md
@@ -1,74 +0,0 @@
 ---
 prefix: BUILD
 inner_loop: false
 message_types:
  success: build_complete
  progress: build_progress
  error: error
 ---
 # Component Code Builder
 Translate design tokens and component specifications into production code. Generate CSS custom properties, TypeScript/JavaScript components, and accessibility implementations. Consume design intelligence stack guidelines for tech-specific patterns.
 ## Phase 2: Context & Artifact Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Design tokens | <session>/design/design-tokens.json | Yes (token build) |
 | Component specs | <session>/design/component-specs/*.md | Yes (component build) |
 | Design intelligence | <session>/research/design-intelligence.json | Yes |
 | Latest audit report | <session>/audit/audit-*.md | No |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. Detect build type from subject: "token" -> Token implementation, "component" -> Component implementation
 3. Read design artifacts: design-tokens.json (token build), component-specs/*.md (component build)
 4. Read design intelligence: stack_guidelines (tech-specific patterns), anti_patterns (patterns to avoid), ux_guidelines
 5. Read latest audit report for approved changes and feedback
 6. Detect project tech stack from package.json
 ## Phase 3: Implementation Execution
 **Token Implementation (BUILD-001)**:
 - Convert design tokens to production code
 - Output files in `<session>/build/token-files/`:
  - `tokens.css`: CSS custom properties with `:root` (light) and `[data-theme="dark"]` selectors, plus `@media (prefers-color-scheme: dark)` fallback
  - `tokens.ts`: TypeScript constants and types for programmatic access with autocomplete support
  - `README.md`: Token usage guide
 - All color tokens must have both light and dark values
 - Semantic token names must match design token definitions
 **Component Implementation (BUILD-002)**:
 - Implement component code from design specifications
 - Per-component output in `<session>/build/component-files/`:
  - `{ComponentName}.tsx`: React/Vue/Svelte component (match detected stack)
  - `{ComponentName}.css`: Styles consuming tokens via `var(--token-name)` only
  - `{ComponentName}.test.tsx`: Basic render + state tests
  - `index.ts`: Re-export
 - Requirements: no hardcoded colors/spacing (use design tokens), implement all 5 states, add ARIA attributes per spec, support responsive breakpoints, follow project component patterns
 - Accessibility: keyboard navigation, screen reader support, visible focus indicators, WCAG AA contrast
 - Check implementation against design intelligence anti_patterns
 ## Phase 4: Validation & Output
 1. Token build validation:
 | Check | Pass Criteria |
 |-------|---------------|
 | File existence | tokens.css and tokens.ts exist |
 | Token coverage | All defined tokens present in CSS |
 | Theme support | Light/dark variants exist |
 2. Component build validation:
 | Check | Pass Criteria |
 |-------|---------------|
 | File existence | At least 3 files per component (component, style, index) |
 | No hardcoded values | No `#xxx` or `rgb()` in component CSS (only in tokens.css) |
 | Focus styles | `:focus` or `:focus-visible` defined |
 | Responsive | `@media` queries present |
 | Anti-pattern clean | No violations of design intelligence anti_patterns |
 3. Update `<session>/wisdom/.msg/meta.json` under `implementer` namespace:
   - Read existing -> merge `{ "implementer": { build_type, file_count, output_dir, components_built } }` -> write back
--- a/.claude/skills/team-uidesign/role-specs/researcher.md
+++ b/.claude/skills/team-uidesign/role-specs/researcher.md
@@ -1,84 +0,0 @@
 ---
 prefix: RESEARCH
 inner_loop: false
 message_types:
  success: research_ready
  progress: research_progress
  error: error
 ---
 # Design System Researcher
 Analyze existing design system, build component inventory, assess accessibility baseline, and retrieve industry-specific design intelligence via ui-ux-pro-max. Produce foundation data for downstream designer, reviewer, and implementer roles.
 ## Phase 2: Context & Environment Detection
 | Input | Source | Required |
 |-------|--------|----------|
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | No |
 1. Extract session path and target scope from task description
 2. Detect project type and tech stack from package.json or equivalent:
 | Package | Detected Stack |
 |---------|---------------|
 | next | nextjs |
 | react | react |
 | vue | vue |
 | svelte | svelte |
 | @shadcn/ui | shadcn |
 | (default) | html-tailwind |
 3. Use CLI tools (e.g., `ccw cli -p "..." --tool gemini --mode analysis`) or direct tools (Glob, Grep, mcp__ace-tool__search_context) to scan for existing design tokens, component files, styling patterns
 4. Read industry context from session config (industry, strictness, must-have features)
 ## Phase 3: Research Execution
 Execute 4 analysis streams:
 **Stream 1 -- Design System Analysis**:
 - Search for existing design tokens (CSS variables, theme configs, token files)
 - Identify styling patterns (CSS-in-JS, CSS modules, utility classes, SCSS)
 - Map color palette, typography scale, spacing system
 - Find component library usage (MUI, Ant Design, shadcn, custom)
 - Output: `<session>/research/design-system-analysis.json`
 **Stream 2 -- Component Inventory**:
 - Find all UI component files; identify props/API surface
 - Identify states supported (hover, focus, disabled, etc.)
 - Check accessibility attributes (ARIA labels, roles)
 - Map inter-component dependencies and usage counts
 - Output: `<session>/research/component-inventory.json`
 **Stream 3 -- Accessibility Baseline**:
 - Check ARIA attribute usage patterns, keyboard navigation support
 - Assess color contrast ratios (if design tokens found)
 - Find focus management and semantic HTML patterns
 - Output: `<session>/research/accessibility-audit.json`
 **Stream 4 -- Design Intelligence (ui-ux-pro-max)**:
 - Call `Skill(skill="ui-ux-pro-max", args="<industry> <keywords> --design-system")` for design system recommendations
 - Call `Skill(skill="ui-ux-pro-max", args="accessibility animation responsive --domain ux")` for UX guidelines
 - Call `Skill(skill="ui-ux-pro-max", args="<keywords> --stack <detected-stack>")` for stack guidelines
 - Degradation: when unavailable, use LLM general knowledge, mark `_source: "llm-general-knowledge"`
 - Output: `<session>/research/design-intelligence.json`
 Compile research summary metrics: design_system_exists, styling_approach, total_components, accessibility_level, design_intelligence_source, anti_patterns_count.
 ## Phase 4: Validation & Output
 1. Verify all 4 output files exist and contain valid JSON with required fields:
 | File | Required Fields |
 |------|----------------|
 | design-system-analysis.json | existing_tokens, styling_approach |
 | component-inventory.json | components array |
 | accessibility-audit.json | wcag_level |
 | design-intelligence.json | _source, design_system |
 2. If any file missing or invalid, re-run corresponding stream
 3. Update `<session>/wisdom/.msg/meta.json` under `researcher` namespace:
   - Read existing -> merge `{ "researcher": { detected_stack, component_count, wcag_level, di_source, scope } }` -> write back
--- a/.claude/skills/team-uidesign/role-specs/reviewer.md
+++ b/.claude/skills/team-uidesign/role-specs/reviewer.md
@@ -1,70 +0,0 @@
 ---
 prefix: AUDIT
 inner_loop: false
 message_types:
  success: audit_passed
  result: audit_result
  fix: fix_required
  error: error
 ---
 # Design Auditor
 Audit design tokens and component specs for consistency, accessibility compliance, completeness, quality, and industry best-practice adherence. Act as Critic in the designer<->reviewer Generator-Critic loop. Serve as sync point gatekeeper in dual-track pipelines.
 ## Phase 2: Context & Artifact Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Design artifacts | <session>/design/*.json, <session>/design/component-specs/*.md | Yes |
 | Design intelligence | <session>/research/design-intelligence.json | Yes |
 | Audit history | .msg/meta.json -> reviewer namespace | No |
 | Build artifacts | <session>/build/**/* | Only for final audit |
 | .msg/meta.json | <session>/wisdom/.msg/meta.json | Yes |
 1. Extract session path from task description
 2. Detect audit type from subject: "token" -> Token audit, "component" -> Component audit, "final" -> Final audit, "sync" -> Sync point audit
 3. Read design intelligence for anti-patterns and ux_guidelines
 4. Read design artifacts: design-tokens.json (token/component audit), component-specs/*.md (component/final audit), build/**/* (final audit only)
 5. Load audit_history from meta.json for trend analysis
 ## Phase 3: Audit Execution
 Score 5 dimensions on 1-10 scale:
 | Dimension | Weight | Focus |
 |-----------|--------|-------|
 | Consistency | 20% | Token usage, naming conventions, visual uniformity |
 | Accessibility | 25% | WCAG AA compliance, ARIA attributes, keyboard nav, contrast |
 | Completeness | 20% | All states defined, responsive specs, edge cases |
 | Quality | 15% | Token reference integrity, documentation clarity, maintainability |
 | Industry Compliance | 20% | Anti-pattern avoidance, UX best practices, design intelligence adherence |
 **Token Audit**: Naming convention (kebab-case, semantic names), value patterns (consistent units), theme completeness (light+dark for all colors), contrast ratios (text on background >= 4.5:1), minimum font sizes (>= 12px), all categories present, W3C $type metadata, no duplicates.
 **Component Audit**: Token references resolve, naming matches convention, ARIA roles defined, keyboard behavior specified, focus indicator defined, all 5 states present, responsive breakpoints specified, variants documented, clear descriptions.
 **Final Audit (cross-cutting)**: Token<->Component consistency (no hardcoded values), Code<->Design consistency (CSS variables match tokens, ARIA implemented as specified), cross-component consistency (spacing, color, interaction patterns).
 **Score calculation**: `overallScore = round(consistency*0.20 + accessibility*0.25 + completeness*0.20 + quality*0.15 + industryCompliance*0.20)`
 **Signal determination**:
 | Condition | Signal |
 |-----------|--------|
 | Score >= 8 AND critical_count === 0 | `audit_passed` (GC CONVERGED) |
 | Score >= 6 AND critical_count === 0 | `audit_result` (GC REVISION NEEDED) |
 | Score < 6 OR critical_count > 0 | `fix_required` (CRITICAL FIX NEEDED) |
 ## Phase 4: Report & Output
 1. Write audit report to `<session>/audit/audit-{NNN}.md`:
   - Summary: overall score, signal, critical/high/medium counts
   - Sync Point Status (if applicable): PASSED/BLOCKED
   - Dimension Scores table (score/weight/weighted per dimension)
   - Critical/High/Medium issues with descriptions, locations, fix suggestions
   - GC Loop Status: signal, action required
   - Trend analysis (if audit_history exists): improving/stable/declining
 2. Update `<session>/wisdom/.msg/meta.json` under `reviewer` namespace:
   - Read existing -> merge `{ "reviewer": { audit_id, score, critical_count, signal, is_sync_point, audit_type, timestamp } }` -> write back
--- a/.claude/skills/team-ultra-analyze/roles/explorer/role.md
+++ b/.claude/skills/team-ultra-analyze/roles/explorer/role.md
@@ -18,7 +18,8 @@ Explore codebase structure through cli-explore-agent, collecting structured cont
 | Task description | From task subject/description | Yes |
 | Session path | Extracted from task description | Yes |
-1. Extract session path, topic, perspective, dimensions from task description:
+1. Load debug specs: Run `ccw spec load --category debug` for known issues and root-cause notes
 2. Extract session path, topic, perspective, dimensions from task description:
 | Field | Pattern | Default |
 |-------|---------|---------|
--- a/.claude/skills/team-ux-improve/role-specs/designer.md
+++ b/.claude/skills/team-ux-improve/role-specs/designer.md
@@ -1,191 +0,0 @@
 ---
 prefix: DESIGN
 inner_loop: false
 message_types:
  success: design_complete
  error: error
 ---
 # UX Designer
 Design feedback mechanisms (loading/error/success states) and state management patterns (React/Vue reactive updates).
 ## Phase 2: Context & Pattern Loading
 1. Load diagnosis report from `<session>/artifacts/diagnosis.md`
 2. Load diagnoser state via `team_msg(operation="get_state", session_id=<session-id>, role="diagnoser")`
 3. Detect framework from project structure
 4. Load framework-specific patterns:
 | Framework | State Pattern | Event Pattern |
 |-----------|---------------|---------------|
 | React | useState, useRef | onClick, onChange |
 | Vue | ref, reactive | @click, @change |
 ### Wisdom Input
 1. Read `<session>/wisdom/patterns/ui-feedback.md` for established feedback design patterns
 2. Read `<session>/wisdom/patterns/state-management.md` for state handling patterns
 3. Read `<session>/wisdom/principles/general-ux.md` for UX design principles
 4. Apply patterns when designing solutions for identified issues
 ### Complex Design (use CLI)
 For complex multi-component solutions:
 ```
 Bash(`ccw cli -p "PURPOSE: Design comprehensive feedback mechanism for multi-step form
 CONTEXT: @<component-files>
 EXPECTED: Complete design with state flow diagram and code patterns
 CONSTRAINTS: Must support React hooks" --tool gemini --mode analysis`)
 ```
 ## Phase 3: Solution Design
 For each diagnosed issue, design solution:
 ### Feedback Mechanism Design
 | Issue Type | Solution Design |
 |------------|-----------------|
 | Missing loading | Add loading state + UI indicator (spinner, disabled button) |
 | Missing error | Add error state + error message display |
 | Missing success | Add success state + confirmation toast/message |
 | No empty state | Add conditional rendering for empty data |
 ### State Management Design
 **React Pattern**:
 ```typescript
 // Add state variables
 const [isLoading, setIsLoading] = useState(false);
 const [error, setError] = useState<string | null>(null);
 // Wrap async operation
 const handleSubmit = async (event: React.FormEvent) => {
  event.preventDefault();
  setIsLoading(true);
  setError(null);
  try {
    const response = await fetch('/api/upload', { method: 'POST', body: formData });
    if (!response.ok) throw new Error('Upload failed');
    // Success handling
  } catch (err: any) {
    setError(err.message || 'An error occurred');
  } finally {
    setIsLoading(false);
  }
 };
 // UI binding
 <button type="submit" disabled={isLoading}>
  {isLoading ? 'Uploading...' : 'Upload File'}
 </button>
 {error && <p style={{ color: 'red' }}>{error}</p>}
 ```
 **Vue Pattern**:
 ```typescript
 // Add reactive state
 const isLoading = ref(false);
 const error = ref<string | null>(null);
 // Wrap async operation
 const handleSubmit = async () => {
  isLoading.value = true;
  error.value = null;
  try {
    const response = await fetch('/api/upload', { method: 'POST', body: formData });
    if (!response.ok) throw new Error('Upload failed');
    // Success handling
  } catch (err: any) {
    error.value = err.message || 'An error occurred';
  } finally {
    isLoading.value = false;
  }
 };
 // UI binding
 <button @click="handleSubmit" :disabled="isLoading">
  {{ isLoading ? 'Uploading...' : 'Upload File' }}
 </button>
 <p v-if="error" style="color: red">{{ error }}</p>
 ```
 ### Input Control Design
 | Issue | Solution |
 |-------|----------|
 | Text input for file path | Add file picker: `<input type="file" />` |
 | Text input for folder path | Add directory picker: `<input type="file" webkitdirectory />` |
 | No validation | Add validation rules and error messages |
 ## Phase 4: Design Document Generation
 1. Generate implementation guide for each issue:
 ```markdown
 # Design Guide
 ## Issue #1: Upload form no loading state
 ### Solution Design
 Add loading state with UI feedback and error handling.
 ### State Variables (React)
 ```typescript
 const [isLoading, setIsLoading] = useState(false);
 const [error, setError] = useState<string | null>(null);
 ```
 ### Event Handler
 ```typescript
 const handleUpload = async (event: React.FormEvent) => {
  event.preventDefault();
  setIsLoading(true);
  setError(null);
  try {
    // API call
  } catch (err: any) {
    setError(err.message);
  } finally {
    setIsLoading(false);
  }
 };
 ```
 ### UI Binding
 ```tsx
 <button type="submit" disabled={isLoading}>
  {isLoading ? 'Uploading...' : 'Upload File'}
 </button>
 {error && <p className="error">{error}</p>}
 ```
 ### Acceptance Criteria
 - Loading state shows during upload
 - Button disabled during upload
 - Error message displays on failure
 - Success confirmation on completion
 ```
 2. Write guide to `<session>/artifacts/design-guide.md`
 ### Wisdom Contribution
 If novel design patterns created:
 1. Write new patterns to `<session>/wisdom/contributions/designer-pattern-<timestamp>.md`
 2. Format: Problem context, solution design, implementation hints, trade-offs
 3. Share state via team_msg:
   ```
   team_msg(operation="log", session_id=<session-id>, from="designer",
            type="state_update", data={
              designed_solutions: <count>,
              framework: <framework>,
              patterns_used: [<pattern-list>]
            })
   ```
--- a/.claude/skills/team-ux-improve/role-specs/diagnoser.md
+++ b/.claude/skills/team-ux-improve/role-specs/diagnoser.md
@@ -1,110 +0,0 @@
 ---
 prefix: DIAG
 inner_loop: false
 message_types:
  success: diag_complete
  error: error
 ---
 # State Diagnoser
 Diagnose root causes of UI issues: state management problems, event binding failures, async handling errors.
 ## Phase 2: Context & Complexity Assessment
 1. Load scan report from `<session>/artifacts/scan-report.md`
 2. Load scanner state via `team_msg(operation="get_state", session_id=<session-id>, role="scanner")`
 ### Wisdom Input
 1. Read `<session>/wisdom/patterns/ui-feedback.md` and `<session>/wisdom/patterns/state-management.md` if available
 2. Use patterns to identify root causes of UI interaction issues
 3. Reference `<session>/wisdom/anti-patterns/common-ux-pitfalls.md` for common causes
 3. Assess issue complexity:
 | Complexity | Criteria | Strategy |
 |------------|----------|----------|
 | High | 5+ issues, cross-component state | CLI delegation |
 | Medium | 2-4 issues, single component | CLI for analysis |
 | Low | 1 issue, simple pattern | Inline analysis |
 ### Complex Analysis (use CLI)
 For complex multi-file state management issues:
 ```
 Bash(`ccw cli -p "PURPOSE: Analyze state management patterns and identify root causes
 CONTEXT: @<issue-files>
 EXPECTED: Root cause analysis with fix recommendations
 CONSTRAINTS: Focus on reactive update patterns" --tool gemini --mode analysis`)
 ```
 ## Phase 3: Root Cause Analysis
 For each issue from scan report:
 ### State Management Diagnosis
 | Pattern | Root Cause | Fix Strategy |
 |---------|------------|--------------|
 | Array.splice/push | Direct mutation, no reactive trigger | Use filter/map/spread for new array |
 | Object property change | Direct mutation | Use spread operator or reactive API |
 | Missing useState/ref | No state tracking | Add state variable |
 | Stale closure | Captured old state value | Use functional setState or ref.current |
 ### Event Binding Diagnosis
 | Pattern | Root Cause | Fix Strategy |
 |---------|------------|--------------|
 | onClick without handler | Missing event binding | Add event handler function |
 | Async without await | Unhandled promise | Add async/await or .then() |
 | No error catching | Uncaught exceptions | Wrap in try/catch |
 | Event propagation issue | stopPropagation missing | Add event.stopPropagation() |
 ### Async Handling Diagnosis
 | Pattern | Root Cause | Fix Strategy |
 |---------|------------|--------------|
 | No loading state | Missing async state tracking | Add isLoading state |
 | No error handling | Missing catch block | Add try/catch with error state |
 | Race condition | Multiple concurrent requests | Add request cancellation or debounce |
 ## Phase 4: Diagnosis Report
 1. Generate root cause analysis for each issue:
 ```markdown
 # Diagnosis Report
 ## Issue #1: Upload form no loading state
 - **File**: src/components/Upload.tsx:45
 - **Root Cause**: Form submit handler is async but no loading state variable exists
 - **Pattern Type**: Missing async state tracking
 - **Fix Recommendation**:
  - Add `const [isLoading, setIsLoading] = useState(false)` (React)
  - Add `const isLoading = ref(false)` (Vue)
  - Wrap async call in try/finally with setIsLoading(true/false)
  - Disable button when isLoading is true
 ```
 2. Write report to `<session>/artifacts/diagnosis.md`
 ### Wisdom Contribution
 If new root cause patterns discovered:
 1. Write diagnosis patterns to `<session>/wisdom/contributions/diagnoser-patterns-<timestamp>.md`
 2. Format: Symptom, root cause, detection method, fix approach
 3. Share state via team_msg:
   ```
   team_msg(operation="log", session_id=<session-id>, from="diagnoser",
            type="state_update", data={
              diagnosed_issues: <count>,
              pattern_types: {
                state_management: <count>,
                event_binding: <count>,
                async_handling: <count>
              }
            })
   ```
--- a/.claude/skills/team-ux-improve/role-specs/explorer.md
+++ b/.claude/skills/team-ux-improve/role-specs/explorer.md
@@ -1,109 +0,0 @@
 ---
 prefix: EXPLORE
 inner_loop: false
 message_types:
  success: explore_complete
  error: error
 ---
 # Codebase Explorer
 Explore codebase for UI component patterns, state management conventions, and framework-specific patterns. Callable by coordinator only.
 ## Phase 2: Exploration Scope
 1. Parse exploration request from task description
 2. Determine file patterns based on framework:
 ### Wisdom Input
 1. Read `<session>/wisdom/patterns/ui-feedback.md` and `<session>/wisdom/patterns/state-management.md` if available
 2. Use known patterns as reference when exploring codebase for component structures
 3. Check `<session>/wisdom/anti-patterns/common-ux-pitfalls.md` to identify problematic patterns during exploration
 | Framework | Patterns |
 |-----------|----------|
 | React | `**/*.tsx`, `**/*.jsx`, `**/use*.ts`, `**/store*.ts` |
 | Vue | `**/*.vue`, `**/composables/*.ts`, `**/stores/*.ts` |
 3. Check exploration cache: `<session>/explorations/cache-index.json`
   - If cache hit and fresh -> return cached results
   - If cache miss or stale -> proceed to Phase 3
 ## Phase 3: Codebase Exploration
 Use ACE search for semantic queries:
 ```
 mcp__ace-tool__search_context(
  project_root_path="<project-path>",
  query="<exploration-query>"
 )
 ```
 Exploration dimensions:
 | Dimension | Query | Purpose |
 |-----------|-------|---------|
 | Component patterns | "UI components with user interactions" | Find interactive components |
 | State management | "State management patterns useState ref reactive" | Identify state conventions |
 | Event handling | "Event handlers onClick onChange onSubmit" | Map event patterns |
 | Error handling | "Error handling try catch error state" | Find error patterns |
 | Feedback mechanisms | "Loading state spinner progress indicator" | Find existing feedback |
 For each dimension, collect:
 - File paths
 - Pattern examples
 - Convention notes
 ## Phase 4: Exploration Summary
 1. Generate pattern summary:
 ```markdown
 # Exploration Summary
 ## Framework: React/Vue
 ## Component Patterns
 - Button components use <pattern>
 - Form components use <pattern>
 ## State Management Conventions
 - Global state: Zustand/Pinia
 - Local state: useState/ref
 - Async state: custom hooks/composables
 ## Event Handling Patterns
 - Form submit: <pattern>
 - Button click: <pattern>
 ## Existing Feedback Mechanisms
 - Loading: <pattern>
 - Error: <pattern>
 - Success: <pattern>
 ```
 2. Cache results to `<session>/explorations/cache-index.json`
 3. Write summary to `<session>/explorations/exploration-summary.md`
 ### Wisdom Contribution
 If new component patterns or framework conventions discovered:
 1. Write pattern summaries to `<session>/wisdom/contributions/explorer-patterns-<timestamp>.md`
 2. Format:
   - Pattern Name: Descriptive name
   - Framework: React/Vue/etc.
   - Use Case: When to apply this pattern
   - Code Example: Representative snippet
   - Adoption: How widely used in codebase
 4. Share state via team_msg:
   ```
   team_msg(operation="log", session_id=<session-id>, from="explorer",
            type="state_update", data={
              framework: <framework>,
              components_found: <count>,
              patterns_identified: [<pattern-list>]
            })
   ```
--- a/.claude/skills/team-ux-improve/role-specs/implementer.md
+++ b/.claude/skills/team-ux-improve/role-specs/implementer.md
@@ -1,164 +0,0 @@
 ---
 prefix: IMPL
 inner_loop: true
 message_types:
  success: impl_complete
  error: error
 ---
 # Code Implementer
 Generate executable fix code with proper state management, event handling, and UI feedback bindings.
 ## Phase 2: Task & Design Loading
 1. Extract session path from task description
 2. Read design guide: `<session>/artifacts/design-guide.md`
 3. Extract implementation tasks from design guide
 4. **Wisdom Input**:
   - Read `<session>/wisdom/patterns/state-management.md` for state handling patterns
   - Read `<session>/wisdom/patterns/ui-feedback.md` for UI feedback implementation patterns
   - Read `<session>/wisdom/principles/general-ux.md` for implementation principles
   - Load framework-specific conventions if available
   - Apply these patterns and principles when generating code to ensure consistency and quality
 5. **For inner loop**: Load context_accumulator from prior IMPL tasks
 ### Context Accumulator (Inner Loop)
 ```
 context_accumulator = {
  completed_fixes: [<fix-1>, <fix-2>],
  modified_files: [<file-1>, <file-2>],
  patterns_applied: [<pattern-1>]
 }
 ```
 ## Phase 3: Code Implementation
 Implementation backend selection:
 | Backend | Condition | Method |
 |---------|-----------|--------|
 | CLI | Complex multi-file changes | `ccw cli --tool gemini --mode write` |
 | Direct | Simple single-file changes | Inline Edit/Write |
 ### CLI Implementation (Complex)
 ```
 Bash(`ccw cli -p "PURPOSE: Implement loading state and error handling for upload form
 TASK:
  - Add useState for isLoading and error
  - Wrap async call in try/catch/finally
  - Update UI bindings for button and error display
 CONTEXT: @src/components/Upload.tsx
 EXPECTED: Modified Upload.tsx with complete implementation
 CONSTRAINTS: Maintain existing code style" --tool gemini --mode write`)
 ```
 ### Direct Implementation (Simple)
 For simple state variable additions or UI binding changes:
 ```
 Edit({
  file_path: "src/components/Upload.tsx",
  old_string: "const handleUpload = async () => {",
  new_string: "const [isLoading, setIsLoading] = useState(false);\nconst [error, setError] = useState<string | null>(null);\n\nconst handleUpload = async () => {\n  setIsLoading(true);\n  setError(null);\n  try {"
 })
 ```
 ### Implementation Steps
 For each fix in design guide:
 1. Read target file
 2. Determine complexity (simple vs complex)
 3. Apply fix using appropriate backend
 4. Verify syntax (no compilation errors)
 5. Append to context_accumulator
 ## Phase 4: Self-Validation
 | Check | Method | Pass Criteria |
 |-------|--------|---------------|
 | Syntax | IDE diagnostics or tsc --noEmit | No errors |
 | File existence | Verify planned files exist | All present |
 | Acceptance criteria | Match against design guide | All met |
 Validation steps:
 1. Run syntax check on modified files
 2. Verify all files from design guide exist
 3. Check acceptance criteria from design guide
 4. If validation fails -> attempt auto-fix (max 2 attempts)
 ### Context Accumulator Update
 Append to context_accumulator:
 ```
 {
  completed_fixes: [...prev, <current-fix>],
  modified_files: [...prev, <current-files>],
  patterns_applied: [...prev, <current-patterns>]
 }
 ```
 Write summary to `<session>/artifacts/fixes/README.md`:
 ```markdown
 # Implementation Summary
 ## Completed Fixes
 - Issue #1: Upload form loading state - DONE
 - Issue #2: Error handling - DONE
 ## Modified Files
 - src/components/Upload.tsx
 - src/components/Form.tsx
 ## Patterns Applied
 - React useState for loading/error states
 - try/catch/finally for async handling
 - Conditional rendering for error messages
 ```
 Share state via team_msg:
 ```
 team_msg(operation="log", session_id=<session-id>, from="implementer",
         type="state_update", data={
           completed_fixes: <count>,
           modified_files: [<file-list>],
           validation_passed: true
         })
 ```
 ### Wisdom Contribution
 If reusable code patterns or snippets created:
 1. Write code snippets to `<session>/wisdom/contributions/implementer-snippets-<timestamp>.md`
 2. Format: Use case, code snippet with comments, framework compatibility notes
 Example contribution format:
 ```markdown
 # Implementer Snippets - <timestamp>
 ## Loading State Pattern (React)
 ### Use Case
 Async operations requiring loading indicator
 ### Code Snippet
 ```tsx
 const [isLoading, setIsLoading] = useState(false);
 const handleAsyncAction = async () => {
  setIsLoading(true);
  try {
    await performAction();
  } finally {
    setIsLoading(false);
  }
 };
 ```
 ### Framework Compatibility
 - React 16.8+ (hooks)
 - Next.js compatible
 ```
--- a/.claude/skills/team-ux-improve/role-specs/scanner.md
+++ b/.claude/skills/team-ux-improve/role-specs/scanner.md
@@ -1,117 +0,0 @@
 ---
 prefix: SCAN
 inner_loop: false
 message_types:
  success: scan_complete
  error: error
 ---
 # UI Scanner
 Scan UI components to identify interaction issues: unresponsive buttons, missing feedback mechanisms, state not refreshing.
 ## Phase 2: Context Loading
 | Input | Source | Required |
 |-------|--------|----------|
 | Project path | Task description CONTEXT | Yes |
 | Framework | Task description CONTEXT | Yes |
 | Scan scope | Task description CONSTRAINTS | Yes |
 1. Extract session path and project path from task description
 2. Detect framework from project structure:
 | Signal | Framework |
 |--------|-----------|
 | package.json has "react" | React |
 | package.json has "vue" | Vue |
 | *.tsx files present | React |
 | *.vue files present | Vue |
 3. Build file pattern list for scanning:
   - React: `**/*.tsx`, `**/*.jsx`, `**/use*.ts`
   - Vue: `**/*.vue`, `**/composables/*.ts`
 ### Wisdom Input
 1. Read `<session>/wisdom/anti-patterns/common-ux-pitfalls.md` if available
 2. Use anti-patterns to identify known UX issues during scanning
 3. Check `<session>/wisdom/patterns/ui-feedback.md` for expected feedback patterns
 ### Complex Analysis (use CLI)
 For large projects with many components:
 ```
 Bash(`ccw cli -p "PURPOSE: Discover all UI components with user interactions
 CONTEXT: @<project-path>/**/*.tsx @<project-path>/**/*.vue
 EXPECTED: Component list with interaction types (click, submit, input, select)
 CONSTRAINTS: Focus on interactive components only" --tool gemini --mode analysis`)
 ```
 ## Phase 3: Component Scanning
 Scan strategy:
 | Category | Detection Pattern | Severity |
 |----------|-------------------|----------|
 | Unresponsive actions | onClick/\@click without async handling or error catching | High |
 | Missing loading state | Form submit without isLoading/loading ref | High |
 | State not refreshing | Array.splice/push without reactive reassignment | High |
 | Missing error feedback | try/catch without error state or user notification | Medium |
 | Missing success feedback | API call without success confirmation | Medium |
 | No empty state | Data list without empty state placeholder | Low |
 | Input without validation | Form input without validation rules | Low |
 | Missing file selector | Text input for file/folder path without picker | Medium |
 For each component file:
 1. Read file content
 2. Scan for interaction patterns using Grep
 3. Check for feedback mechanisms (loading, error, success states)
 4. Check state update patterns (mutation vs reactive)
 5. Record issues with file:line references
 ## Phase 4: Issue Report Generation
 1. Classify issues by severity (High/Medium/Low)
 2. Group by category (unresponsive, missing feedback, state issues, input UX)
 3. Generate structured report:
 ```markdown
 # UI Scan Report
 ## Summary
 - Total issues: <count>
 - High: <count> | Medium: <count> | Low: <count>
 ## Issues
 ### High Severity
 | # | File:Line | Component | Issue | Category |
 |---|-----------|-----------|-------|----------|
 | 1 | src/components/Upload.tsx:45 | UploadForm | No loading state on submit | Missing feedback |
 ### Medium Severity
 ...
 ### Low Severity
 ...
 ```
 4. Write report to `<session>/artifacts/scan-report.md`
 5. Share state via team_msg:
   ```
   team_msg(operation="log", session_id=<session-id>, from="scanner",
            type="state_update", data={
              total_issues: <count>,
              high: <count>, medium: <count>, low: <count>,
              categories: [<category-list>],
              scanned_files: <count>
            })
   ```
 ### Wisdom Contribution
 If novel UX issues discovered that aren't in anti-patterns:
 1. Write findings to `<session>/wisdom/contributions/scanner-issues-<timestamp>.md`
 2. Format: Issue description, detection criteria, affected components
--- a/.claude/skills/team-ux-improve/role-specs/tester.md
+++ b/.claude/skills/team-ux-improve/role-specs/tester.md
@@ -1,163 +0,0 @@
 ---
 prefix: TEST
 inner_loop: false
 message_types:
  success: test_complete
  error: error
  fix: fix_required
 ---
 # Test Engineer
 Generate and run tests to verify fixes (loading states, error handling, state updates).
 ## Phase 2: Environment Detection
 1. Detect test framework from project files:
 | Signal | Framework |
 |--------|-----------|
 | package.json has "jest" | Jest |
 | package.json has "vitest" | Vitest |
 | package.json has "@testing-library/react" | React Testing Library |
 | package.json has "@vue/test-utils" | Vue Test Utils |
 2. Get changed files from implementer state:
   ```
   team_msg(operation="get_state", session_id=<session-id>, role="implementer")
   ```
 3. Load test strategy from design guide
 ### Wisdom Input
 1. Read `<session>/wisdom/anti-patterns/common-ux-pitfalls.md` for common issues to test
 2. Read `<session>/wisdom/patterns/ui-feedback.md` for expected feedback behaviors to verify
 3. Use wisdom to design comprehensive test cases covering known edge cases
 ## Phase 3: Test Generation & Execution
 ### Test Generation
 For each modified file, generate test cases:
 **React Example**:
 ```typescript
 import { render, screen, fireEvent, waitFor } from '@testing-library/react';
 import Upload from '../Upload';
 describe('Upload Component', () => {
  it('shows loading state during upload', async () => {
    global.fetch = vi.fn(() => Promise.resolve({ ok: true }));
    render(<Upload />);
    const uploadButton = screen.getByRole('button', { name: /upload/i });
    fireEvent.click(uploadButton);
    // Check loading state
    await waitFor(() => {
      expect(screen.getByText(/uploading.../i)).toBeInTheDocument();
      expect(uploadButton).toBeDisabled();
    });
    // Check normal state restored
    await waitFor(() => {
      expect(uploadButton).not.toBeDisabled();
    });
  });
  it('displays error message on failure', async () => {
    global.fetch = vi.fn(() => Promise.reject(new Error('Upload failed')));
    render(<Upload />);
    fireEvent.click(screen.getByRole('button', { name: /upload/i }));
    await waitFor(() => {
      expect(screen.getByText(/upload failed/i)).toBeInTheDocument();
    });
  });
 });
 ```
 ### Test Execution
 Iterative test-fix cycle (max 5 iterations):
 1. Run tests: `npm test` or `npm run test:unit`
 2. Parse results -> calculate pass rate
 3. If pass rate >= 95% -> exit (success)
 4. If pass rate < 95% and iterations < 5:
   - Analyze failures
   - Use CLI to generate fixes:
     ```
     Bash(`ccw cli -p "PURPOSE: Fix test failures
     CONTEXT: @<test-file> @<source-file>
     EXPECTED: Fixed code that passes tests
     CONSTRAINTS: Maintain existing functionality" --tool gemini --mode write`)
     ```
   - Increment iteration counter
   - Loop to step 1
 5. If iterations >= 5 -> send fix_required message
 ## Phase 4: Test Report
 ### Wisdom Contribution
 If new edge cases or test patterns discovered:
 1. Write test findings to `<session>/wisdom/contributions/tester-edge-cases-<timestamp>.md`
 2. Format: Edge case description, test scenario, expected behavior, actual behavior
 Generate test report:
 ```markdown
 # Test Report
 ## Summary
 - Total tests: <count>
 - Passed: <count>
 - Failed: <count>
 - Pass rate: <percentage>%
 - Fix iterations: <count>
 ## Test Results
 ### Passed Tests
 - ✅ Upload Component > shows loading state during upload
 - ✅ Upload Component > displays error message on failure
 ### Failed Tests
 - ❌ Form Component > validates input before submit
  - Error: Expected validation message not found
 ## Coverage
 - Statements: 85%
 - Branches: 78%
 - Functions: 90%
 - Lines: 84%
 ## Remaining Issues
 - Form validation test failing (needs manual review)
 ```
 Write report to `<session>/artifacts/test-report.md`
 Share state via team_msg:
 ```
 team_msg(operation="log", session_id=<session-id>, from="tester",
         type="state_update", data={
           total_tests: <count>,
           passed: <count>,
           failed: <count>,
           pass_rate: <percentage>,
           fix_iterations: <count>
         })
 ```
 If pass rate < 95%, send fix_required message:
 ```
 SendMessage({
  to: "coordinator",
  message: "[tester] Test validation incomplete. Pass rate: <percentage>%. Manual review needed."
 })
 ```
--- a/.claude/skills/team-ux-improve/specs/team-config.json
+++ b/.claude/skills/team-ux-improve/specs/team-config.json
@@ -27,7 +27,7 @@
      "display_name": "UI Scanner",
      "type": "worker",
      "responsibility_type": "read_only_analysis",
-      "role_spec": "role-specs/scanner.md",
+      "role_spec": "roles/scanner/role.md",
      "task_prefix": "SCAN",
      "inner_loop": false,
      "allowed_tools": ["Read", "Grep", "Glob", "Bash", "mcp__ace-tool__search_context", "mcp__ccw-tools__read_file", "mcp__ccw-tools__team_msg", "TaskList", "TaskGet", "TaskUpdate", "SendMessage"],
@@ -46,7 +46,7 @@
      "display_name": "State Diagnoser",
      "type": "worker",
      "responsibility_type": "orchestration",
-      "role_spec": "role-specs/diagnoser.md",
+      "role_spec": "roles/diagnoser/role.md",
      "task_prefix": "DIAG",
      "inner_loop": false,
      "allowed_tools": ["Read", "Grep", "Bash", "mcp__ace-tool__search_context", "mcp__ccw-tools__read_file", "mcp__ccw-tools__team_msg", "TaskList", "TaskGet", "TaskUpdate", "SendMessage"],
@@ -65,7 +65,7 @@
      "display_name": "UX Designer",
      "type": "worker",
      "responsibility_type": "orchestration",
-      "role_spec": "role-specs/designer.md",
+      "role_spec": "roles/designer/role.md",
      "task_prefix": "DESIGN",
      "inner_loop": false,
      "allowed_tools": ["Read", "Write", "Bash", "mcp__ccw-tools__read_file", "mcp__ccw-tools__write_file", "mcp__ccw-tools__team_msg", "TaskList", "TaskGet", "TaskUpdate", "SendMessage"],
@@ -84,7 +84,7 @@
      "display_name": "Code Implementer",
      "type": "worker",
      "responsibility_type": "code_generation",
-      "role_spec": "role-specs/implementer.md",
+      "role_spec": "roles/implementer/role.md",
      "task_prefix": "IMPL",
      "inner_loop": true,
      "allowed_tools": ["Read", "Write", "Edit", "Bash", "mcp__ccw-tools__read_file", "mcp__ccw-tools__write_file", "mcp__ccw-tools__edit_file", "mcp__ccw-tools__team_msg", "TaskList", "TaskGet", "TaskUpdate", "SendMessage"],
@@ -103,7 +103,7 @@
      "display_name": "Test Engineer",
      "type": "worker",
      "responsibility_type": "validation",
-      "role_spec": "role-specs/tester.md",
+      "role_spec": "roles/tester/role.md",
      "task_prefix": "TEST",
      "inner_loop": false,
      "allowed_tools": ["Read", "Write", "Bash", "mcp__ccw-tools__read_file", "mcp__ccw-tools__write_file", "mcp__ccw-tools__team_msg", "TaskList", "TaskGet", "TaskUpdate", "SendMessage"],
@@ -123,7 +123,7 @@
    {
      "name": "explorer",
      "display_name": "Codebase Explorer",
-      "role_spec": "role-specs/explorer.md",
+      "role_spec": "roles/explorer/role.md",
      "callable_by": "coordinator",
      "purpose": "Explore codebase for UI component patterns, state management conventions, and framework-specific patterns",
      "allowed_tools": ["Read", "Grep", "Glob", "Bash", "mcp__ace-tool__search_context", "mcp__ccw-tools__read_file", "mcp__ccw-tools__team_msg"],
--- a/.claude/skills/workflow-execute/phases/06-review.md
+++ b/.claude/skills/workflow-execute/phases/06-review.md
@@ -93,7 +93,7 @@ rg "password|token|secret|auth" -g "*.{ts,js,py}"
 rg "eval|exec|innerHTML|dangerouslySetInnerHTML" -g "*.{ts,js,tsx}"
 # Gemini security analysis
-ccw spec load --category execution
+ccw spec load --category review
 ccw cli -p "
 PURPOSE: Security audit of completed implementation
 TASK: Review code for security vulnerabilities, insecure patterns, auth/authz issues
@@ -105,7 +105,7 @@ RULES: Focus on OWASP Top 10, authentication, authorization, data validation, in
 **Architecture Review** (`architecture`):
 ```bash
-ccw spec load --category execution
+ccw spec load --category review
 ccw cli -p "
 PURPOSE: Architecture compliance review
 TASK: Evaluate adherence to architectural patterns, identify technical debt, review design decisions
@@ -117,7 +117,7 @@ RULES: Check for patterns, separation of concerns, modularity, scalability
 **Quality Review** (`quality`):
 ```bash
-ccw spec load --category execution
+ccw spec load --category review
 ccw cli -p "
 PURPOSE: Code quality and best practices review
 TASK: Assess code readability, maintainability, adherence to best practices
@@ -139,7 +139,7 @@ for task_file in ${sessionPath}/.task/*.json; do
 done
 # Cross-check implementation against requirements
-ccw spec load --category execution
+ccw spec load --category review
 ccw cli -p "
 PURPOSE: Verify all requirements and acceptance criteria are met
 TASK: Cross-check implementation summaries against original requirements
--- a/.claude/skills/workflow-lite-test-review/SKILL.md
+++ b/.claude/skills/workflow-lite-test-review/SKILL.md
@@ -8,6 +8,8 @@ allowed-tools: Skill, Agent, AskUserQuestion, TodoWrite, Read, Write, Edit, Bash
 Test review and fix engine for lite-execute chain or standalone invocation.
 **Project Context**: Run `ccw spec load --category test` for test framework conventions, coverage targets, and fixtures.
 ---
 ## Usage
--- a/.claude/skills/workflow-plan/phases/05-plan-verify.md
+++ b/.claude/skills/workflow-plan/phases/05-plan-verify.md
@@ -34,6 +34,10 @@ You **MUST** consider the user input before proceeding (if not empty).
 ## Execution
 ### Step 5.0: Load Validation Context
 Run `ccw spec load --category validation` for verification rules and acceptance criteria.
 ### Step 5.1: Initialize Analysis Context
 ```bash
--- a/.claude/skills/workflow-tdd-plan/phases/02-context-gathering.md
+++ b/.claude/skills/workflow-tdd-plan/phases/02-context-gathering.md
@@ -222,6 +222,7 @@ Execute complete context-search-agent workflow for TDD implementation planning:
 ### Phase 1: Initialization & Pre-Analysis
 1. **Project State Loading**:
   - Run: \`ccw spec load --category execution\` to load project context, tech stack, and guidelines.
   - Run: \`ccw spec load --category test\` to load test framework conventions, coverage targets, and fixtures.
   - If files don't exist, proceed with fresh analysis.
 2. **Detection**: Check for existing context-package (early exit if valid)
 3. **Foundation**: Initialize CodexLens, get project structure, load docs
--- a/.claude/skills/workflow-tdd-plan/phases/05-tdd-task-generation.md
+++ b/.claude/skills/workflow-tdd-plan/phases/05-tdd-task-generation.md
@@ -237,13 +237,14 @@ MCP Capabilities: {exa_code, exa_web, code_index}
 These files provide project-level constraints that apply to ALL tasks:
 1. **ccw spec load --category execution** (project specs and tech analysis)
 2. **ccw spec load --category test** (test framework, coverage targets, conventions)
   - Contains: tech_stack, architecture_type, key_components, build_system, test_framework, coding_conventions, naming_rules, forbidden_patterns, quality_gates, custom_constraints
   - Usage: Populate plan.json shared_context, align task tech choices, set correct test commands
   - Apply as HARD CONSTRAINTS on all generated tasks — task implementation steps,
     acceptance criteria, and convergence.verification MUST respect these guidelines
   - If empty/missing: No additional constraints (proceed normally)
-Loading order: \`ccw spec load --category execution\` → planning-notes.md → context-package.json
+Loading order: \`ccw spec load --category execution\` → \`ccw spec load --category test\` → planning-notes.md → context-package.json
 ## USER CONFIGURATION (from Phase 0)
 Execution Method: ${userConfig.executionMethod}  // agent|hybrid|cli
--- a/.claude/skills/workflow-test-fix/phases/02-test-context-gather.md
+++ b/.claude/skills/workflow-test-fix/phases/02-test-context-gather.md
@@ -346,6 +346,7 @@ Execute complete context-search-agent workflow for implementation planning:
 ### Phase 1: Initialization & Pre-Analysis
 1. **Project State Loading**:
   - Run: \`ccw spec load --category execution\` to load project context, tech stack, and guidelines.
   - Run: \`ccw spec load --category test\` to load test framework conventions, coverage targets, and fixtures.
   - If files don't exist, proceed with fresh analysis.
 2. **Detection**: Check for existing context-package (early exit if valid)
 3. **Foundation**: Initialize CodexLens, get project structure, load docs
--- a/.claude/skills/workflow-test-fix/phases/05-test-cycle-execute.md
+++ b/.claude/skills/workflow-test-fix/phases/05-test-cycle-execute.md
@@ -249,7 +249,8 @@ Task(
    ${selectedStrategy} - ${strategyDescription}
    ## PROJECT CONTEXT (MANDATORY)
-    1. Run: \`ccw spec load --category execution\` (tech stack, test framework, build system, constraints)
+    1. Run: \`ccw spec load --category execution\` (tech stack, build system, constraints)
    2. Run: \`ccw spec load --category test\` (test framework, coverage targets, conventions)
    ## MANDATORY FIRST STEPS
    1. Read test results: ${session.test_results_path}
--- a/.claude/skills/workflow-tune/SKILL.md
+++ b/.claude/skills/workflow-tune/SKILL.md
@@ -0,0 +1,495 @@
 ---
 name: workflow-tune
 description: Workflow tuning skill for multi-command/skill pipelines. Executes each step sequentially, inspects artifacts after each command, analyzes quality via ccw cli resume, builds process documentation, and generates optimization suggestions. Triggers on "workflow tune", "tune workflow", "workflow optimization".
 allowed-tools: Skill, Agent, AskUserQuestion, TaskCreate, TaskUpdate, TaskList, Read, Write, Edit, Bash, Glob, Grep
 ---
 # Workflow Tune
 Tune multi-step workflows composed of commands or skills. Execute each step, inspect artifacts, analyze via ccw cli resume, build process documentation, and produce actionable optimization suggestions.
 ## Architecture Overview
 ```
 ┌──────────────────────────────────────────────────────────────────────┐
 │  Workflow Tune Orchestrator (SKILL.md)                                │
 │  → Parse → Decompose → Confirm → Setup → Step Loop → Synthesize     │
 └──────────────────────┬───────────────────────────────────────────────┘
                       │
   ┌───────────────────┼───────────────────────────────────┐
   ↓                   ↓                                   ↓
 ┌──────────┐   ┌─────────────────────────────┐     ┌──────────────┐
 │ Phase 1  │   │  Step Loop (2→3 per step)   │     │ Phase 4 + 5  │
 │ Setup    │   │  ┌─────┐  ┌─────┐           │     │ Synthesize + │
 │          │──→│  │ P2  │→ │ P3  │           │────→│ Report       │
 │ Parse +  │   │  │Exec │  │Anal │           │     │              │
 │ Decomp + │   │  └─────┘  └─────┘           │     └──────────────┘
 │ Confirm  │   │       ↑       │  next step  │
 └──────────┘   │       └───────┘             │
               └─────────────────────────────┘
 Phase 1 Detail:
  Input → [Format 1-3: direct parse] ──→ Command Doc → Confirm → Init
       → [Format 4: natural lang]   ──→ Semantic Decompose → Command Doc → Confirm → Init
 ```
 ## Key Design Principles
 1. **Test-First Evaluation**: Auto-generate per-step acceptance criteria before execution — judge results against concrete requirements, not vague quality
 2. **Step-by-Step Execution**: Each workflow step executes independently, artifacts inspected before proceeding
 3. **Resume-Based Analysis**: Uses ccw cli `--resume` to maintain analysis context across steps
 4. **Process Documentation**: Running `process-log.md` accumulates observations per step
 5. **Two-Tool Pipeline**: Claude/target tool (execute) + Gemini (analyze) = complementary perspectives
 6. **Pure Orchestrator**: SKILL.md coordinates only — execution detail in phase files
 7. **Progressive Phase Loading**: Phase docs read only when that phase executes
 ## Interactive Preference Collection
 ```javascript
 // ★ Auto mode detection
 const autoYes = /\b(-y|--yes)\b/.test($ARGUMENTS)
 if (autoYes) {
  workflowPreferences = {
    autoYes: true,
    analysisDepth: 'standard',
    autoFix: false
  }
 } else {
  const prefResponse = AskUserQuestion({
    questions: [
      {
        question: "选择 Workflow 调优配置：",
        header: "Tune Config",
        multiSelect: false,
        options: [
          { label: "Quick (轻量分析)", description: "每步简要检查，快速产出建议" },
          { label: "Standard (标准分析) (Recommended)", description: "每步详细分析，完整过程文档" },
          { label: "Deep (深度分析)", description: "每步深度审查，含性能和架构建议" }
        ]
      },
      {
        question: "是否自动应用优化建议？",
        header: "Auto Fix",
        multiSelect: false,
        options: [
          { label: "No (仅生成报告) (Recommended)", description: "只分析，不修改" },
          { label: "Yes (自动应用)", description: "分析后自动应用高优先级建议" }
        ]
      }
    ]
  })
  const depthMap = {
    "Quick": "quick",
    "Standard": "standard",
    "Deep": "deep"
  }
  const selectedDepth = Object.keys(depthMap).find(k =>
    prefResponse["Tune Config"].startsWith(k)
  ) || "Standard"
  workflowPreferences = {
    autoYes: false,
    analysisDepth: depthMap[selectedDepth],
    autoFix: prefResponse["Auto Fix"].startsWith("Yes")
  }
 }
 ```
 ## Input Processing
 ```
 $ARGUMENTS → Parse:
  ├─ Workflow definition: one of:
  │   ├─ Format 1: Inline steps — "step1 | step2 | step3" (pipe-separated commands)
  │   ├─ Format 2: Skill names — "skill-a,skill-b,skill-c" (comma-separated)
  │   ├─ Format 3: File path — "--file workflow.json" (JSON definition)
  │   └─ Format 4: Natural language — free-text description, auto-decomposed into steps
  ├─ Test context: --context "description of what the workflow should achieve"
  └─ Flags: --depth quick|standard|deep, -y/--yes, --auto-fix
 ```
 ### Format Detection Priority
 ```
 1. --file flag present         → Format 3 (JSON)
 2. Contains pipe "|"           → Format 1 (inline commands)
 3. Matches skill-name pattern  → Format 2 (comma-separated skills)
 4. Everything else             → Format 4 (natural language → semantic decomposition)
   4a. Contains file path?     → Read file as reference doc, extract workflow steps via LLM
   4b. Pure intent text        → Direct intent-verb matching (intentMap)
 ```
 ### Format 4: Semantic Decomposition (Natural Language)
 When input is free-text (e.g., "分析 src 目录代码质量，做代码评审，然后修复高优先级问题"), the orchestrator:
 #### 4a. Reference Document Mode (input contains file path)
 When the input contains a file path (e.g., `d:\maestro2\guide\command-usage-guide.md 提取核心工作流`), the orchestrator:
 1. **Detect File Path**: Extract file path from input via path pattern matching
 2. **Read Reference Doc**: Read the file content as workflow reference material
 3. **Extract Workflow via LLM**: Use Gemini to analyze the document + user intent, extract executable workflow steps
 4. **Step Chain Generation**: LLM returns structured step chain with commands, tools, and execution order
 5. **Command Doc + Confirmation**: Same as 4b below
 #### 4b. Pure Intent Mode (no file path)
 1. **Semantic Parse**: Identify intent verbs and targets → map to available skills/commands
 2. **Step Chain Generation**: Produce ordered step chain with tool/mode selection
 3. **Command Doc**: Generate formatted execution plan document
 4. **User Confirmation**: Display plan, ask user to confirm/edit before execution
 ```javascript
 // ★ 4a: If input contains file path → read file, extract workflow via LLM
 //   Detect paths: Windows (D:\path\file.md), Unix (/path/file.md), relative (./file.md)
 //   Read file content → send to Gemini with user intent → get executable step chain
 //   See phases/01-setup.md Step 1.1b Mode 4a
 // ★ 4b: Pure intent text → regex-based intent-to-tool mapping
 const intentMap = {
  '分析|analyze|审查|inspect|scan': { tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
  '评审|review|code review': { tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
  // ... (full map in phases/01-setup.md Step 1.1b Mode 4b)
 };
 // Match input segments to intents, produce step chain
 // See phases/01-setup.md Step 1.1b for full algorithm
 ```
 ### Workflow JSON Format (when using --file)
 ```json
 {
  "name": "my-workflow",
  "description": "What this workflow achieves",
  "steps": [
    {
      "name": "step-1-name",
      "type": "skill|command|ccw-cli",
      "command": "/skill-name args" | "ccw cli -p '...' --tool gemini --mode analysis",
      "expected_artifacts": ["output.md", "report.json"],
      "success_criteria": "description of what success looks like"
    }
  ]
 }
 ```
 ### Inline Step Parsing
 ```javascript
 // Pipe-separated: each segment is a command
 // "ccw cli -p 'analyze' --tool gemini --mode analysis | /review-code src/ | ccw cli -p 'fix' --tool claude --mode write"
 const steps = input.split('|').map((cmd, i) => ({
  name: `step-${i + 1}`,
  type: cmd.trim().startsWith('/') ? 'skill' : 'command',
  command: cmd.trim(),
  expected_artifacts: [],
  success_criteria: ''
 }));
 ```
 ## Pre-Execution Confirmation
 After parsing (all formats) or decomposition (Format 4), generate a **Command Document** and ask for user confirmation before executing.
 ### Command Document Format
 ```markdown
 # Workflow Tune — Execution Plan
 **Workflow**: {name}
 **Goal**: {context}
 **Steps**: {count}
 **Analysis Depth**: {depth}
 ## Step Chain
 | # | Name | Type | Command | Tool | Mode |
 |---|------|------|---------|------|------|
 | 1 | {name} | {type} | {command} | {tool} | {mode} |
 | 2 | ... | ... | ... | ... | ... |
 ## Execution Flow
 ```
 Step 1: {name}
  → Command: {command}
  → Expected: {artifacts}
  → Feeds into: Step 2
       ↓
 Step 2: {name}
  → Command: {command}
  → Expected: {artifacts}
  → Feeds into: Step 3
       ↓
  ...
 ```
 ## Estimated Scope
 - Total CLI calls: {N} (execute) + {N} (analyze) + 1 (synthesize)
 - Analysis tool: gemini (--resume chain)
 - Process documentation: process-log.md (accumulated)
 ```
 ### Confirmation Flow
 ```javascript
 // ★ Skip confirmation only if -y/--yes flag
 if (!workflowPreferences.autoYes) {
  // Display command document to user
  // Output the formatted plan (direct text output, NOT a file)
  const confirmation = AskUserQuestion({
    questions: [{
      question: "确认执行以上 Workflow 调优计划？",
      header: "Confirm Execution",
      multiSelect: false,
      options: [
        { label: "Execute (确认执行)", description: "按计划开始执行" },
        { label: "Edit steps (修改步骤)", description: "我想调整某些步骤" },
        { label: "Cancel (取消)", description: "取消本次调优" }
      ]
    }]
  });
  if (confirmation["Confirm Execution"].startsWith("Cancel")) {
    // Abort workflow
    return;
  }
  if (confirmation["Confirm Execution"].startsWith("Edit")) {
    // Ask user for modifications
    const editResponse = AskUserQuestion({
      questions: [{
        question: "请描述要修改的内容（如：删除步骤2、步骤3改用codex、在步骤1后加入安全扫描）：",
        header: "Edit Steps"
      }]
    });
    // Apply user edits to steps[] → re-display command doc → re-confirm
    // (recursive confirmation loop until Execute or Cancel)
  }
 }
 ```
 ## Execution Flow
 > **COMPACT DIRECTIVE**: Context compression MUST check TaskUpdate phase status.
 > The phase currently marked `in_progress` is the active execution phase — preserve its FULL content.
 > Only compress phases marked `completed` or `pending`.
 ### Phase 1: Setup (one-time)
 Read and execute: `Ref: phases/01-setup.md`
 - Parse workflow steps from input (Format 1-3: direct parse, Format 4: semantic decomposition)
 - Generate Command Document (formatted execution plan)
 - **User Confirmation**: Display plan, wait for confirm/edit/cancel
 - **Generate Test Requirements**: Auto-create per-step acceptance criteria via Gemini (expected outputs, content signals, quality thresholds, pass/fail criteria, handoff contracts)
 - Create workspace at `.workflow/.scratchpad/workflow-tune-{ts}/`
 - Initialize workflow-state.json (with test_requirements per step)
 - Create process-log.md template
 Output: `workDir`, `steps[]` (with test_requirements), `workflowContext`, `commandDoc`, initialized state
 ### Step Loop (Phase 2 + Phase 3, per step)
 ```javascript
 // Track analysis session ID for resume chain
 let analysisSessionId = null;
 for (let stepIdx = 0; stepIdx < state.steps.length; stepIdx++) {
  const step = state.steps[stepIdx];
  TaskUpdate(stepLoopTask, {
    subject: `Step ${stepIdx + 1}/${state.steps.length}: ${step.name}`,
    status: 'in_progress'
  });
  // === Phase 2: Execute Step ===
  // Read: phases/02-step-execute.md
  // Execute command/skill → collect artifacts
  // Write step-{N}-artifacts-manifest.json
  // === Phase 3: Analyze Step ===
  // Read: phases/03-step-analyze.md
  // Inspect artifacts → ccw cli gemini --resume analysisSessionId
  // Write step-{N}-analysis.md → append to process-log.md
  // Update analysisSessionId for next step's resume
  // Update state
  state.steps[stepIdx].status = 'completed';
  Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
 }
 ```
 ### Phase 2: Execute Step (per step)
 Read and execute: `Ref: phases/02-step-execute.md`
 - Create step working directory
 - Execute command/skill via ccw cli or Skill tool
 - Collect output artifacts
 - Write artifacts manifest
 ### Phase 3: Analyze Step (per step)
 Read and execute: `Ref: phases/03-step-analyze.md`
 - Inspect step artifacts (file list, content summary, quality signals)
 - **Compare actual output against test requirements** (pass_criteria, content_signals, fail_signals, handoff_contract)
 - Build analysis prompt with step context + test requirements + previous process log
 - Execute: `ccw cli --tool gemini --mode analysis [--resume sessionId]`
 - Parse analysis → write step-{N}-analysis.md (with requirement match: PASS/FAIL)
 - Append findings to process-log.md
 - Return analysis session ID for resume chain
 ### Phase 4: Synthesize (one-time)
 Read and execute: `Ref: phases/04-synthesize.md`
 - Read complete process-log.md + all step analyses
 - Build synthesis prompt with full workflow context
 - Execute: `ccw cli --tool gemini --mode analysis --resume analysisSessionId`
 - Generate cross-step optimization insights
 - Write synthesis.md
 ### Phase 5: Optimization Report (one-time)
 Read and execute: `Ref: phases/05-optimize-report.md`
 - Aggregate all analyses and synthesis
 - Generate structured optimization report
 - Optionally apply high-priority fixes (if autoFix enabled)
 - Write final-report.md
 - Display summary to user
 **Phase Reference Documents**:
 | Phase | Document | Purpose | Compact |
 |-------|----------|---------|---------|
 | 1 | [phases/01-setup.md](phases/01-setup.md) | Initialize workspace and state | TaskUpdate driven |
 | 2 | [phases/02-step-execute.md](phases/02-step-execute.md) | Execute workflow step | TaskUpdate driven |
 | 3 | [phases/03-step-analyze.md](phases/03-step-analyze.md) | Analyze step artifacts | TaskUpdate driven + resume |
 | 4 | [phases/04-synthesize.md](phases/04-synthesize.md) | Cross-step synthesis | TaskUpdate driven + resume |
 | 5 | [phases/05-optimize-report.md](phases/05-optimize-report.md) | Generate final report | TaskUpdate driven |
 ## Data Flow
 ```
 User Input (workflow steps / natural language + context)
    ↓
 Phase 1: Setup
    ├─ [Format 1-3] Direct parse → steps[]
    ├─ [Format 4a]  File path detected → Read doc → LLM extract → steps[]
    ├─ [Format 4b]  Pure intent text → Regex intent matching → steps[]
    ↓
    Command Document (formatted plan)
    ↓
    User Confirmation (Execute / Edit / Cancel)
    ↓ (Execute confirmed)
    ↓
    Generate Test Requirements (Gemini) → per-step acceptance criteria
    ↓ workDir, steps[] (with test_requirements), workflow-state.json, process-log.md
    ↓
 ┌─→ Phase 2: Execute Step N (ccw cli / Skill)
 │   ↓ step-N/ artifacts
 │   ↓
 │   Phase 3: Analyze Step N (ccw cli gemini --resume)
 │   ↓ step-N-analysis.md, process-log.md updated
 │   ↓ analysisSessionId carried forward
 │   ↓
 │   [More steps?]─── YES ──→ next step (Phase 2)
 │   ↓ NO
 │   ↓
 └───┘
    ↓
 Phase 4: Synthesize (ccw cli gemini --resume)
    ↓ synthesis.md
    ↓
 Phase 5: Report
    ↓ final-report.md + optional auto-fix
    ↓
 Done
 ```
 ## TaskUpdate Pattern
 ```javascript
 // Initial state
 TaskCreate({ subject: "Phase 1: Setup workspace", activeForm: "Parsing workflow" })
 TaskCreate({ subject: "Step Loop", activeForm: "Executing steps" })
 TaskCreate({ subject: "Phase 4-5: Synthesize & Report", activeForm: "Pending" })
 // Per-step tracking
 for (const step of state.steps) {
  TaskCreate({
    subject: `Step: ${step.name}`,
    activeForm: `Pending`,
    description: `${step.type}: ${step.command}`
  })
 }
 // During step execution
 TaskUpdate(stepTask, {
  subject: `Step: ${step.name} — Executing`,
  activeForm: `Running ${step.command}`
 })
 // After step analysis
 TaskUpdate(stepTask, {
  subject: `Step: ${step.name} — Analyzed`,
  activeForm: `Quality: ${stepQuality} | Issues: ${issueCount}`,
  status: 'completed'
 })
 // Final
 TaskUpdate(synthesisTask, {
  subject: `Synthesis & Report (${state.steps.length} steps, ${totalIssues} issues)`,
  status: 'completed'
 })
 ```
 ## Resume Chain Strategy
 ```
 Step 1 Execute → artifacts
 Step 1 Analyze → ccw cli gemini --mode analysis → sessionId_1
 Step 2 Execute → artifacts
 Step 2 Analyze → ccw cli gemini --mode analysis --resume sessionId_1 → sessionId_2
  ...
 Step N Analyze → sessionId_N
 Synthesize   → ccw cli gemini --mode analysis --resume sessionId_N → final
 ```
 Each analysis step resumes the previous session, maintaining full context of:
 - All prior step observations
 - Accumulated quality patterns
 - Cross-step dependency insights
 ## Error Handling
 | Phase | Error | Recovery |
 |-------|-------|----------|
 | 2: Execute | CLI timeout/crash | Retry once, then record failure and continue to next step |
 | 2: Execute | Skill not found | Skip step, note in process-log |
 | 3: Analyze | CLI fails | Retry without --resume, start fresh session |
 | 3: Analyze | Resume session not found | Start fresh analysis session |
 | 4: Synthesize | CLI fails | Generate report from individual step analyses only |
 | Any | 3+ consecutive errors | Terminate with partial report |
 **Error Budget**: Each step gets 1 retry. 3 consecutive failures triggers early termination.
 ## Core Rules
 1. **Start Immediately**: First action is preference collection → Phase 1 setup
 2. **Progressive Loading**: Read phase doc ONLY when that phase is about to execute
 3. **Inspect Before Proceed**: Always check step artifacts before moving to next step
 4. **Background CLI**: ccw cli runs in background, wait for hook callback before proceeding
 5. **Resume Chain**: Maintain analysis session continuity via --resume
 6. **Process Documentation**: Every step observation goes into process-log.md
 7. **Single State Source**: `workflow-state.json` is the only source of truth
 8. **DO NOT STOP**: Continuous execution until all steps processed
--- a/.claude/skills/workflow-tune/phases/01-setup.md
+++ b/.claude/skills/workflow-tune/phases/01-setup.md
@@ -0,0 +1,670 @@
 # Phase 1: Setup
 Initialize workspace, parse workflow definition, semantic decomposition for natural language, generate command document, user confirmation, create state and process log.
 ## Objective
 - Parse workflow steps from user input (Format 1-3: direct parse, Format 4: semantic decomposition)
 - Generate Command Document (formatted execution plan)
 - User confirmation: Execute / Edit steps / Cancel
 - Validate step commands/skill paths
 - Create isolated workspace directory
 - Initialize workflow-state.json and process-log.md
 ## Execution
 ### Step 1.1: Parse Input
 ```javascript
 const args = $ARGUMENTS.trim();
 // Detect input format
 let steps = [];
 let workflowName = 'unnamed-workflow';
 let workflowContext = '';
 // Format 1: JSON file (--file path)
 const fileMatch = args.match(/--file\s+"?([^\s"]+)"?/);
 if (fileMatch) {
  const wfDef = JSON.parse(Read(fileMatch[1]));
  workflowName = wfDef.name || 'unnamed-workflow';
  workflowContext = wfDef.description || '';
  steps = wfDef.steps;
 }
 // Format 2: Pipe-separated commands ("cmd1 | cmd2 | cmd3")
 else if (args.includes('|')) {
  const rawSteps = args.split(/(?:--context|--depth|-y|--yes|--auto-fix)\s+("[^"]*"|\S+)/)[0];
  steps = rawSteps.split('|').map((cmd, i) => ({
    name: `step-${i + 1}`,
    type: cmd.trim().startsWith('/') ? 'skill'
        : cmd.trim().startsWith('ccw cli') ? 'ccw-cli'
        : 'command',
    command: cmd.trim(),
    expected_artifacts: [],
    success_criteria: ''
  }));
 }
 // Format 3: Comma-separated skill names (matches pattern: word,word or word-word,word-word)
 else if (/^[\w-]+(,[\w-]+)+/.test(args.split(/\s/)[0])) {
  const skillPart = args.match(/^([^\s]+)/);
  const skillNames = skillPart ? skillPart[1].split(',') : [];
  steps = skillNames.map((name, i) => {
    const skillPath = name.startsWith('.claude/') ? name : `.claude/skills/${name}`;
    return {
      name: name.replace('.claude/skills/', ''),
      type: 'skill',
      command: `/${name.replace('.claude/skills/', '')}`,
      skill_path: skillPath,
      expected_artifacts: [],
      success_criteria: ''
    };
  });
 }
 // Format 4: Natural language → semantic decomposition
 else {
  inputFormat = 'natural-language';
  naturalLanguageInput = args.replace(/--\w+\s+"[^"]*"/g, '').replace(/--\w+\s+\S+/g, '').replace(/-y|--yes/g, '').trim();
  // ★ 4a: Detect file paths in input (Windows absolute, Unix absolute, or relative paths)
  const filePathPattern = /(?:[A-Za-z]:[\\\/][^\s,;，；、]+|\/[^\s,;，；、]+\.(?:md|txt|json|yaml|yml|toml)|\.\/?[^\s,;，；、]+\.(?:md|txt|json|yaml|yml|toml))/g;
  const detectedPaths = naturalLanguageInput.match(filePathPattern) || [];
  referenceDocContent = null;
  referenceDocPath = null;
  if (detectedPaths.length > 0) {
    // Read first detected file as reference document
    referenceDocPath = detectedPaths[0];
    try {
      referenceDocContent = Read(referenceDocPath);
      // Remove file path from natural language input to get pure user intent
      naturalLanguageInput = naturalLanguageInput.replace(referenceDocPath, '').trim();
    } catch (e) {
      // File not readable — fall through to 4b (pure intent mode)
      referenceDocContent = null;
    }
  }
  // Steps will be populated in Step 1.1b
  steps = [];
 }
 // Parse --context
 const contextMatch = args.match(/--context\s+"([^"]+)"/);
 workflowContext = contextMatch ? contextMatch[1] : workflowContext;
 // Parse --depth
 const depthMatch = args.match(/--depth\s+(quick|standard|deep)/);
 if (depthMatch) {
  workflowPreferences.analysisDepth = depthMatch[1];
 }
 // If no context provided, ask user
 if (!workflowContext) {
  const response = AskUserQuestion({
    questions: [{
      question: "请描述这个 workflow 的目标和预期效果：",
      header: "Workflow Context",
      multiSelect: false,
      options: [
        { label: "General quality check", description: "通用质量检查，评估步骤间衔接" },
        { label: "Custom description", description: "自定义描述 workflow 目标" }
      ]
    }]
  });
  workflowContext = response["Workflow Context"];
 }
 ```
 ### Step 1.1b: Semantic Decomposition (Format 4 only)
 > Skip this step if `inputFormat !== 'natural-language'`.
 Two sub-modes based on whether a reference document was detected in Step 1.1:
 #### Mode 4a: Reference Document → LLM Extraction
 When `referenceDocContent` is available, use Gemini to extract executable workflow steps from the document, guided by the user's intent text.
 ```javascript
 if (inputFormat === 'natural-language' && referenceDocContent) {
  // ★ 4a: Extract workflow steps from reference document via LLM
  const extractPrompt = `PURPOSE: Extract an executable workflow step chain from the reference document below, guided by the user's intent.
 USER INTENT: ${naturalLanguageInput}
 REFERENCE DOCUMENT PATH: ${referenceDocPath}
 REFERENCE DOCUMENT CONTENT:
 ${referenceDocContent}
 TASK:
 1. Read the document and identify the workflow/process it describes (commands, steps, phases, procedures)
 2. Filter by user intent — only extract the steps the user wants to test/tune
 3. For each step, determine:
   - The actual command to execute (shell command, CLI invocation, or skill name)
   - The execution order and dependencies
   - What tool to use (gemini/claude/codex/qwen) and mode (default: write)
 4. Generate a step chain that can be directly executed
 IMPORTANT:
 - Extract REAL executable commands from the document, not analysis tasks about the document
 - The user wants to RUN these workflow steps, not analyze the document itself
 - If the document describes CLI commands like "maestro init", "maestro plan", etc., those are the steps to extract
 - Preserve the original command syntax from the document
 - Map each command to appropriate tool/mode for ccw cli execution, OR mark as 'command' type for direct shell execution
 - Default mode to "write" — almost all steps produce output artifacts (files, reports, configs), even analysis steps need write permission to save results
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "workflow_name": "<descriptive name>",
  "workflow_context": "<what this workflow achieves>",
  "steps": [
    {
      "name": "<step name>",
      "type": "command|ccw-cli|skill",
      "command": "<the actual command to execute>",
      "tool": "<gemini|claude|codex|qwen or null for shell commands>",
      "mode": "<write (default) | analysis (read-only, rare) | null for shell commands>",
      "rule": "<rule template or null>",
      "original_text": "<source text from document>",
      "expected_artifacts": ["<expected output files>"],
      "success_criteria": "<what success looks like>"
    }
  ]
 }
 CONSTRAINTS: Output ONLY valid JSON. Extract executable steps, NOT document analysis tasks.`;
  Bash({
    command: `ccw cli -p "${escapeForShell(extractPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
    run_in_background: true,
    timeout: 300000
  });
  // STOP — wait for hook callback
  // After callback: parse JSON response into steps[]
  const extractOutput = /* CLI output from callback */;
  const extractJsonMatch = extractOutput.match(/\{[\s\S]*\}/);
  if (extractJsonMatch) {
    try {
      const extracted = JSON.parse(extractJsonMatch[0]);
      workflowName = extracted.workflow_name || 'doc-workflow';
      workflowContext = extracted.workflow_context || naturalLanguageInput;
      steps = (extracted.steps || []).map((s, i) => ({
        name: s.name || `step-${i + 1}`,
        type: s.type || 'command',
        command: s.command,
        tool: s.tool || null,
        mode: s.mode || null,
        rule: s.rule || null,
        original_text: s.original_text || '',
        expected_artifacts: s.expected_artifacts || [],
        success_criteria: s.success_criteria || ''
      }));
    } catch (e) {
      // JSON parse failed — fall through to 4b intent matching
      referenceDocContent = null;
    }
  }
  if (steps.length === 0) {
    // Extraction produced no steps — fall through to 4b
    referenceDocContent = null;
  }
 }
 ```
 #### Mode 4b: Pure Intent → Regex Matching
 When no reference document, or when 4a extraction failed, decompose by intent-verb matching.
 ```javascript
 if (inputFormat === 'natural-language' && !referenceDocContent) {
  // Intent-to-tool mapping (regex patterns → tool config)
  const intentMap = [
    { pattern: /分析|analyze|审查|inspect|scan/i, name: 'analyze', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
    { pattern: /评审|review|code.?review/i, name: 'review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
    { pattern: /诊断|debug|排查|diagnose/i, name: 'diagnose', tool: 'gemini', mode: 'analysis', rule: 'analysis-diagnose-bug-root-cause' },
    { pattern: /安全|security|漏洞|vulnerability/i, name: 'security-audit', tool: 'gemini', mode: 'analysis', rule: 'analysis-assess-security-risks' },
    { pattern: /性能|performance|perf/i, name: 'perf-analysis', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-performance' },
    { pattern: /架构|architecture/i, name: 'arch-review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-architecture' },
    { pattern: /修复|fix|repair|解决/i, name: 'fix', tool: 'claude', mode: 'write', rule: 'development-debug-runtime-issues' },
    { pattern: /实现|implement|开发|create|新增/i, name: 'implement', tool: 'claude', mode: 'write', rule: 'development-implement-feature' },
    { pattern: /重构|refactor/i, name: 'refactor', tool: 'claude', mode: 'write', rule: 'development-refactor-codebase' },
    { pattern: /测试|test|generate.?test/i, name: 'test', tool: 'claude', mode: 'write', rule: 'development-generate-tests' },
    { pattern: /规划|plan|设计|design/i, name: 'plan', tool: 'gemini', mode: 'analysis', rule: 'planning-plan-architecture-design' },
    { pattern: /拆解|breakdown|分解/i, name: 'breakdown', tool: 'gemini', mode: 'analysis', rule: 'planning-breakdown-task-steps' },
  ];
  // Segment input by Chinese/English delimiters: 、，,；然后/接着/最后/之后 etc.
  const segments = naturalLanguageInput
    .split(/[，,；;、]|(?:然后|接着|之后|最后|再|并|and then|then|finally|next)\s*/i)
    .map(s => s.trim())
    .filter(Boolean);
  // Match each segment to an intent (with ambiguity resolution)
  steps = segments.map((segment, i) => {
    const allMatches = intentMap.filter(m => m.pattern.test(segment));
    let matched = allMatches[0] || null;
    // ★ Ambiguity resolution: if multiple intents match, ask user
    if (allMatches.length > 1) {
      const disambig = AskUserQuestion({
        questions: [{
          question: `"${segment}" 匹配到多个意图，请选择最符合的：`,
          header: `Disambiguate Step ${i + 1}`,
          multiSelect: false,
          options: allMatches.map(m => ({
            label: m.name,
            description: `Tool: ${m.tool}, Mode: ${m.mode}, Rule: ${m.rule}`
          }))
        }]
      });
      const chosen = disambig[`Disambiguate Step ${i + 1}`];
      matched = allMatches.find(m => m.name === chosen) || allMatches[0];
    }
    if (matched) {
      // Extract target scope from segment (e.g., "分析 src 目录" → scope = "src")
      const scopeMatch = segment.match(/(?:目录|文件|模块|directory|file|module)?\s*[：:]?\s*(\S+)/);
      const scope = scopeMatch ? scopeMatch[1].replace(/[的地得]$/, '') : '**/*';
      return {
        name: `${matched.name}`,
        type: 'ccw-cli',
        command: `ccw cli -p "${segment}" --tool ${matched.tool} --mode ${matched.mode} --rule ${matched.rule}`,
        tool: matched.tool,
        mode: matched.mode,
        rule: matched.rule,
        original_text: segment,
        expected_artifacts: [],
        success_criteria: ''
      };
    } else {
      // Unmatched segment → generic analysis step
      return {
        name: `step-${i + 1}`,
        type: 'ccw-cli',
        command: `ccw cli -p "${segment}" --tool gemini --mode analysis`,
        tool: 'gemini',
        mode: 'analysis',
        rule: 'universal-rigorous-style',
        original_text: segment,
        expected_artifacts: [],
        success_criteria: ''
      };
    }
  });
  // Deduplicate: if same intent name appears twice, suffix with index
  const nameCount = {};
  steps.forEach(s => {
    nameCount[s.name] = (nameCount[s.name] || 0) + 1;
    if (nameCount[s.name] > 1) {
      s.name = `${s.name}-${nameCount[s.name]}`;
    }
  });
 }
 // Common: set workflow context and name for Format 4
 if (inputFormat === 'natural-language') {
  if (!workflowContext) {
    workflowContext = naturalLanguageInput;
  }
  if (!workflowName || workflowName === 'unnamed-workflow') {
    workflowName = referenceDocPath ? 'doc-workflow' : 'nl-workflow';
  }
 }
 ```
 ### Step 1.1c: Generate Command Document
 Generate a formatted execution plan for user review. This runs for ALL input formats, not just Format 4.
 ```javascript
 function generateCommandDoc(steps, workflowName, workflowContext, analysisDepth) {
  const stepTable = steps.map((s, i) => {
    const tool = s.tool || (s.type === 'skill' ? '-' : 'claude');
    const mode = s.mode || (s.type === 'skill' ? '-' : 'write');
    const cmdPreview = s.command.length > 60 ? s.command.substring(0, 57) + '...' : s.command;
    return `| ${i + 1} | ${s.name} | ${s.type} | \`${cmdPreview}\` | ${tool} | ${mode} |`;
  }).join('\n');
  const flowDiagram = steps.map((s, i) => {
    const arrow = i < steps.length - 1 ? '\n     ↓' : '';
    const feedsInto = i < steps.length - 1 ? `Feeds into: Step ${i + 2} (${steps[i + 1].name})` : 'Final step';
    const originalText = s.original_text ? `\n  Source: "${s.original_text}"` : '';
    return `Step ${i + 1}: ${s.name}
  Command: ${s.command}
  Type: ${s.type} | Tool: ${s.tool || '-'} | Mode: ${s.mode || '-'}${originalText}
  ${feedsInto}${arrow}`;
  }).join('\n');
  const totalCli = steps.filter(s => s.type === 'ccw-cli').length;
  const totalSkill = steps.filter(s => s.type === 'skill').length;
  const totalCmd = steps.filter(s => s.type === 'command').length;
  return `# Workflow Tune — Execution Plan
 **Workflow**: ${workflowName}
 **Goal**: ${workflowContext}
 **Steps**: ${steps.length}
 **Analysis Depth**: ${analysisDepth}
 ## Step Chain
 | # | Name | Type | Command | Tool | Mode |
 |---|------|------|---------|------|------|
 ${stepTable}
 ## Execution Flow
 \`\`\`
 ${flowDiagram}
 \`\`\`
 ## Estimated Scope
 - CLI execute calls: ${totalCli}
 - Skill invocations: ${totalSkill}
 - Shell commands: ${totalCmd}
 - Analysis calls (gemini --resume chain): ${steps.length} (per-step) + 1 (synthesis)
 - Process documentation: process-log.md (accumulated)
 - Final output: final-report.md with optimization recommendations
 `;
 }
 const commandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
 // Output command document to user (direct text output)
 // The orchestrator displays this as formatted text before confirmation
 ```
 ### Step 1.1d: Pre-Execution Confirmation
 ```javascript
 // ★ Skip confirmation if -y/--yes auto mode
 if (!workflowPreferences.autoYes) {
  // Display commandDoc to user as formatted text output
  // Then ask for confirmation
  const confirmation = AskUserQuestion({
    questions: [{
      question: "确认执行以上 Workflow 调优计划？",
      header: "Confirm Execution",
      multiSelect: false,
      options: [
        { label: "Execute (确认执行)", description: "按计划开始执行所有步骤" },
        { label: "Edit steps (修改步骤)", description: "调整步骤顺序、增删步骤、更换工具" },
        { label: "Cancel (取消)", description: "取消本次调优" }
      ]
    }]
  });
  const choice = confirmation["Confirm Execution"];
  if (choice.startsWith("Cancel")) {
    // Abort: no workspace created, no state written
    // Output: "Workflow tune cancelled."
    return;
  }
  if (choice.startsWith("Edit")) {
    // Enter edit loop: ask user what to change, apply, re-display, re-confirm
    let editing = true;
    while (editing) {
      const editResponse = AskUserQuestion({
        questions: [{
          question: "请描述要修改的内容：\n" +
            "  - 删除步骤: '删除步骤2' 或 'remove step 2'\n" +
            "  - 添加步骤: '在步骤1后加入安全扫描' 或 'add security scan after step 1'\n" +
            "  - 修改工具: '步骤3改用codex' 或 'step 3 use codex'\n" +
            "  - 调换顺序: '步骤2和步骤3互换' 或 'swap step 2 and 3'\n" +
            "  - 修改命令: '步骤1命令改为 ccw cli -p \"...\" --tool gemini'",
          header: "Edit Steps"
        }]
      });
      const editText = editResponse["Edit Steps"];
      // Apply edits to steps[] based on user instruction
      // The orchestrator interprets the edit instruction and modifies steps:
      //
      // Delete: filter out the specified step, re-index
      // Add: insert new step at specified position
      // Modify tool: update the step's tool/mode/command
      // Swap: exchange positions of two steps
      // Modify command: replace command string
      //
      // After applying edits, re-generate command doc and re-display
      const updatedCommandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
      // Display updatedCommandDoc to user
      const reconfirm = AskUserQuestion({
        questions: [{
          question: "修改后的计划如上，是否确认？",
          header: "Confirm Execution",
          multiSelect: false,
          options: [
            { label: "Execute (确认执行)", description: "按修改后的计划执行" },
            { label: "Edit more (继续修改)", description: "还需要调整" },
            { label: "Cancel (取消)", description: "取消本次调优" }
          ]
        }]
      });
      const reChoice = reconfirm["Confirm Execution"];
      if (reChoice.startsWith("Execute")) {
        editing = false;
      } else if (reChoice.startsWith("Cancel")) {
        return;  // Abort
      }
      // else: continue editing loop
    }
  }
  // choice === "Execute" → proceed to workspace creation
 }
 // Save command doc for reference
 // Will be written to workspace after Step 1.3
 ```
 ### Step 1.2: Validate Steps
 ```javascript
 for (const step of steps) {
  if (step.type === 'skill' && step.skill_path) {
    const skillFiles = Glob(`${step.skill_path}/SKILL.md`);
    if (skillFiles.length === 0) {
      step.validation = 'warning';
      step.validation_msg = `Skill not found: ${step.skill_path}`;
    } else {
      step.validation = 'ok';
    }
  } else {
    // Command-type steps: basic validation (non-empty)
    step.validation = step.command && step.command.trim() ? 'ok' : 'invalid';
  }
 }
 const invalidSteps = steps.filter(s => s.validation === 'invalid');
 if (invalidSteps.length > 0) {
  throw new Error(`Invalid steps: ${invalidSteps.map(s => s.name).join(', ')}`);
 }
 ```
 ### Step 1.2b: Generate Test Requirements (Acceptance Criteria)
 > 调优的前提：为每一步生成跟任务匹配的验收标准。没有预期基准，就无法判断命令执行是否达标。
 用 Gemini 根据 step command + workflow context + 上下游关系，自动推断每步的验收标准。
 ```javascript
 // Build step chain description for context
 const stepChainDesc = steps.map((s, i) =>
  `Step ${i + 1}: ${s.name} (${s.type}) — ${s.command}`
 ).join('\n');
 const reqGenPrompt = `PURPOSE: Generate concrete acceptance criteria (test requirements) for each step in a workflow pipeline. These criteria will be used to objectively judge whether each step's execution succeeded or failed.
 WORKFLOW:
 Name: ${workflowName}
 Goal: ${workflowContext}
 STEP CHAIN:
 ${stepChainDesc}
 TASK:
 For each step, generate:
 1. **expected_outputs** — what files/artifacts should be produced (specific filenames or patterns)
 2. **content_signals** — what content patterns indicate success (keywords, structures, data shapes)
 3. **quality_thresholds** — minimum quality bar (e.g., "no empty files", "JSON must be parseable", "must contain at least N items")
 4. **pass_criteria** — 1-2 sentence description of what "pass" looks like for this step
 5. **fail_signals** — what patterns indicate failure (error messages, empty output, wrong format)
 6. **handoff_contract** — what this step must provide for the next step to work (data format, required fields)
 CONTEXT RULES:
 - Infer from the command what the step is supposed to do
 - Consider workflow goal when judging what "good enough" means
 - Each step's handoff_contract should match what the next step needs as input
 - Be specific: "report.md with ## Summary section" not "a report file"
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "step_requirements": [
    {
      "step_index": 0,
      "step_name": "<name>",
      "expected_outputs": ["<file or pattern>"],
      "content_signals": ["<keyword or pattern that indicates success>"],
      "quality_thresholds": ["<minimum bar>"],
      "pass_criteria": "<what pass looks like>",
      "fail_signals": ["<pattern that indicates failure>"],
      "handoff_contract": "<what next step needs from this step>"
    }
  ]
 }
 CONSTRAINTS: Be specific to each command, output ONLY JSON`;
 Bash({
  command: `ccw cli -p "${escapeForShell(reqGenPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
  run_in_background: true,
  timeout: 300000
 });
 // STOP — wait for hook callback
 // After callback: parse JSON, attach requirements to each step
 const reqOutput = /* CLI output from callback */;
 const reqJsonMatch = reqOutput.match(/\{[\s\S]*\}/);
 if (reqJsonMatch) {
  try {
    const reqData = JSON.parse(reqJsonMatch[0]);
    (reqData.step_requirements || []).forEach(req => {
      const idx = req.step_index;
      if (idx >= 0 && idx < steps.length) {
        steps[idx].test_requirements = {
          expected_outputs: req.expected_outputs || [],
          content_signals: req.content_signals || [],
          quality_thresholds: req.quality_thresholds || [],
          pass_criteria: req.pass_criteria || '',
          fail_signals: req.fail_signals || [],
          handoff_contract: req.handoff_contract || ''
        };
      }
    });
  } catch (e) {
    // Fallback: proceed without generated requirements
    // Steps will use any manually provided success_criteria
  }
 }
 // Capture session ID for resume chain start
 const reqSessionMatch = reqOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
 const reqSessionId = reqSessionMatch ? reqSessionMatch[1] : null;
 ```
 ### Step 1.3: Create Workspace
 ```javascript
 const ts = Date.now();
 const workDir = `.workflow/.scratchpad/workflow-tune-${ts}`;
 Bash(`mkdir -p "${workDir}/steps"`);
 // Create per-step directories
 for (let i = 0; i < steps.length; i++) {
  Bash(`mkdir -p "${workDir}/steps/step-${i + 1}/artifacts"`);
 }
 ```
 ### Step 1.3b: Save Command Document
 ```javascript
 // Save confirmed command doc to workspace for reference
 Write(`${workDir}/command-doc.md`, commandDoc);
 ```
 ### Step 1.4: Initialize State
 ```javascript
 const initialState = {
  status: 'running',
  started_at: new Date().toISOString(),
  updated_at: new Date().toISOString(),
  workflow_name: workflowName,
  workflow_context: workflowContext,
  analysis_depth: workflowPreferences.analysisDepth,
  auto_fix: workflowPreferences.autoFix,
  steps: steps.map((s, i) => ({
    ...s,
    index: i,
    status: 'pending',
    execution: null,
    analysis: null,
    test_requirements: s.test_requirements || null  // from Step 1.2b
  })),
  analysis_session_id: reqSessionId || null,  // resume chain starts from requirements generation
  process_log_entries: [],
  synthesis: null,
  errors: [],
  error_count: 0,
  max_errors: 3,
  work_dir: workDir
 };
 Write(`${workDir}/workflow-state.json`, JSON.stringify(initialState, null, 2));
 ```
 ### Step 1.5: Initialize Process Log
 ```javascript
 const processLog = `# Workflow Tune Process Log
 **Workflow**: ${workflowName}
 **Context**: ${workflowContext}
 **Steps**: ${steps.length}
 **Analysis Depth**: ${workflowPreferences.analysisDepth}
 **Started**: ${new Date().toISOString()}
 ---
 `;
 Write(`${workDir}/process-log.md`, processLog);
 ```
 ## Output
 - **Variables**: `workDir`, `steps[]`, `workflowContext`, `commandDoc`, initialized state
 - **Files**: `workflow-state.json`, `process-log.md`, `command-doc.md`, per-step directories
 - **User Confirmation**: Execution plan confirmed (or cancelled → abort)
 - **TaskUpdate**: Mark Phase 1 completed, start Step Loop
--- a/.claude/skills/workflow-tune/phases/02-step-execute.md
+++ b/.claude/skills/workflow-tune/phases/02-step-execute.md
@@ -0,0 +1,197 @@
 # Phase 2: Execute Step
 > **COMPACT SENTINEL [Phase 2: Execute Step]**
 > This phase contains 4 execution steps (Step 2.1 -- 2.4).
 > If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
 > Recovery: `Read("phases/02-step-execute.md")`
 Execute a single workflow step and collect its output artifacts.
 ## Objective
 - Determine step execution method (skill invoke / ccw cli / shell command)
 - Execute step with appropriate tool
 - Collect output artifacts into step directory
 - Write artifacts manifest
 ## Execution
 ### Step 2.1: Prepare Step Directory
 ```javascript
 const stepIdx = currentStepIndex;  // from orchestrator loop
 const step = state.steps[stepIdx];
 const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
 const artifactsDir = `${stepDir}/artifacts`;
 // Capture pre-execution state (git status, file timestamps)
 const preGitStatus = Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout;
 // ★ Warn if dirty git working directory (first step only)
 if (stepIdx === 0 && preGitStatus.trim() && preGitStatus.trim() !== 'not a git repo') {
  const dirtyLines = preGitStatus.trim().split('\n').length;
  // Log warning — artifact collection via git diff may be unreliable
  // This is informational; does not block execution
  console.warn(`⚠ Dirty git working directory detected (${dirtyLines} changed files). Artifact collection via git diff may include pre-existing changes.`);
 }
 const preExecSnapshot = {
  timestamp: new Date().toISOString(),
  git_status: preGitStatus,
  working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50)  // sample
 };
 Write(`${stepDir}/pre-exec-snapshot.json`, JSON.stringify(preExecSnapshot, null, 2));
 ```
 ### Step 2.2: Execute by Step Type
 ```javascript
 let executionResult = { success: false, method: '', output: '', duration: 0 };
 const startTime = Date.now();
 switch (step.type) {
  case 'skill': {
    // Skill invocation — use Skill tool
    // Extract skill name and arguments from command
    const skillCmd = step.command.replace(/^\//, '');
    const [skillName, ...skillArgs] = skillCmd.split(/\s+/);
    // Execute skill (this runs synchronously within current context)
    // Note: Skill execution produces artifacts in the working directory
    // We capture changes by comparing pre/post state
    Skill({
      name: skillName,
      arguments: skillArgs.join(' ')
    });
    executionResult.method = 'skill';
    executionResult.success = true;
    break;
  }
  case 'ccw-cli': {
    // Direct ccw cli command
    const cliCommand = step.command;
    Bash({
      command: cliCommand,
      run_in_background: true,
      timeout: 600000  // 10 minutes
    });
    // STOP — wait for hook callback
    // After callback:
    executionResult.method = 'ccw-cli';
    executionResult.success = true;
    break;
  }
  case 'command': {
    // Generic shell command
    const result = Bash({
      command: step.command,
      timeout: 300000  // 5 minutes
    });
    executionResult.method = 'command';
    executionResult.output = result.stdout || '';
    executionResult.success = result.exitCode === 0;
    // Save command output
    if (executionResult.output) {
      Write(`${artifactsDir}/command-output.txt`, executionResult.output);
    }
    if (result.stderr) {
      Write(`${artifactsDir}/command-stderr.txt`, result.stderr);
    }
    break;
  }
 }
 executionResult.duration = Date.now() - startTime;
 ```
 ### Step 2.3: Collect Artifacts
 ```javascript
 // Capture post-execution state
 const postExecSnapshot = {
  timestamp: new Date().toISOString(),
  git_status: Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout,
  working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50)
 };
 // Detect changed/new files by comparing snapshots
 const preFiles = new Set(preExecSnapshot.working_files);
 const newOrChanged = postExecSnapshot.working_files.filter(f => !preFiles.has(f));
 // Also check git diff for modified files
 const gitDiff = Bash('git diff --name-only 2>/dev/null || true').stdout.trim().split('\n').filter(Boolean);
 // Collect all artifacts (new files + git-changed files + declared expected_artifacts)
 const declaredArtifacts = (step.expected_artifacts || []).filter(f => {
  // Verify declared artifacts actually exist
  const exists = Glob(f);
  return exists.length > 0;
 }).flatMap(f => Glob(f));
 const allArtifacts = [...new Set([...newOrChanged, ...gitDiff, ...declaredArtifacts])];
 // Copy detected artifacts to step artifacts dir (or record references)
 const artifactManifest = {
  step: step.name,
  step_index: stepIdx,
  execution_method: executionResult.method,
  success: executionResult.success,
  duration_ms: executionResult.duration,
  artifacts: allArtifacts.map(f => ({
    path: f,
    type: f.endsWith('.md') ? 'markdown' : f.endsWith('.json') ? 'json' : 'other',
    size: 'unknown'  // Can be filled by stat if needed
  })),
  // For skill type: also check .workflow/.scratchpad for generated files
  scratchpad_files: step.type === 'skill'
    ? Glob('.workflow/.scratchpad/**/*').filter(f => {
        // Only include files created after step started
        return true;  // Heuristic: include recent scratchpad files
      }).slice(0, 20)
    : [],
  collected_at: new Date().toISOString()
 };
 Write(`${stepDir}/artifacts-manifest.json`, JSON.stringify(artifactManifest, null, 2));
 ```
 ### Step 2.4: Update State
 ```javascript
 state.steps[stepIdx].status = 'executed';
 state.steps[stepIdx].execution = {
  method: executionResult.method,
  success: executionResult.success,
  duration_ms: executionResult.duration,
  artifacts_dir: artifactsDir,
  manifest_path: `${stepDir}/artifacts-manifest.json`,
  artifact_count: artifactManifest.artifacts.length,
  started_at: preExecSnapshot.timestamp,
  completed_at: new Date().toISOString()
 };
 state.updated_at = new Date().toISOString();
 Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
 ```
 ## Error Handling
 | Error | Recovery |
 |-------|----------|
 | Skill not found | Record failure, set success=false, continue to Phase 3 |
 | CLI timeout (10min) | Retry once with shorter timeout, then record failure |
 | Command exit non-zero | Record stderr, set success=false, continue to Phase 3 |
 | No artifacts detected | Continue to Phase 3 — analysis evaluates step definition quality |
 ## Output
 - **Files**: `pre-exec-snapshot.json`, `artifacts-manifest.json`, `artifacts/` (if command type)
 - **State**: `steps[stepIdx].execution` updated
 - **Next**: Phase 3 (Analyze Step)
--- a/.claude/skills/workflow-tune/phases/03-step-analyze.md
+++ b/.claude/skills/workflow-tune/phases/03-step-analyze.md
@@ -0,0 +1,386 @@
 # Phase 3: Analyze Step
 > **COMPACT SENTINEL [Phase 3: Analyze Step]**
 > This phase contains 5 execution steps (Step 3.1 -- 3.5).
 > If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
 > Recovery: `Read("phases/03-step-analyze.md")`
 Analyze a completed step's artifacts and quality using `ccw cli --tool gemini --mode analysis`. Uses `--resume` to maintain context across step analyses, building a continuous analysis chain.
 ## Objective
 - Inspect step artifacts (file list, content, quality signals)
 - Build analysis prompt with step context + prior process log
 - Execute via ccw cli Gemini with resume chain
 - Parse analysis results → write step-{N}-analysis.md
 - Append findings to process-log.md
 - Return updated session ID for resume chain
 ## Execution
 ### Step 3.1: Inspect Artifacts
 ```javascript
 const stepIdx = currentStepIndex;
 const step = state.steps[stepIdx];
 const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
 // Read artifacts manifest
 const manifest = JSON.parse(Read(`${stepDir}/artifacts-manifest.json`));
 // Build artifact summary based on analysis depth
 let artifactSummary = '';
 if (state.analysis_depth === 'quick') {
  // Quick: just file list and sizes
  artifactSummary = `Artifacts (${manifest.artifacts.length} files):\n` +
    manifest.artifacts.map(a => `- ${a.path} (${a.type})`).join('\n');
 } else {
  // Standard/Deep: include file content summaries
  artifactSummary = manifest.artifacts.map(a => {
    const maxLines = state.analysis_depth === 'deep' ? 300 : 150;
    try {
      const content = Read(a.path, { limit: maxLines });
      return `--- ${a.path} (${a.type}) ---\n${content}`;
    } catch {
      return `--- ${a.path} --- [unreadable]`;
    }
  }).join('\n\n');
  // Deep: also include scratchpad files
  if (state.analysis_depth === 'deep' && manifest.scratchpad_files?.length > 0) {
    artifactSummary += '\n\n--- Scratchpad Files ---\n' +
      manifest.scratchpad_files.slice(0, 5).map(f => {
        const content = Read(f, { limit: 100 });
        return `--- ${f} ---\n${content}`;
      }).join('\n\n');
  }
 }
 // Execution result summary
 const execSummary = `Execution: ${step.execution.method} | ` +
  `Success: ${step.execution.success} | ` +
  `Duration: ${step.execution.duration_ms}ms | ` +
  `Artifacts: ${manifest.artifacts.length} files`;
 ```
 ### Step 3.2: Build Prior Context
 ```javascript
 // Build accumulated process log context for this analysis
 const priorProcessLog = Read(`${state.work_dir}/process-log.md`);
 // Build step chain context (what came before, what comes after)
 const stepChainContext = state.steps.map((s, i) => {
  const status = i < stepIdx ? 'completed' : i === stepIdx ? 'CURRENT' : 'pending';
  const score = s.analysis?.quality_score || '-';
  return `${i + 1}. [${status}] ${s.name} (${s.type}) — Quality: ${score}`;
 }).join('\n');
 // Previous step handoff context (if not first step)
 let handoffContext = '';
 if (stepIdx > 0) {
  const prevStep = state.steps[stepIdx - 1];
  const prevAnalysis = prevStep.analysis;
  if (prevAnalysis) {
    handoffContext = `PREVIOUS STEP OUTPUT SUMMARY:
 Step "${prevStep.name}" produced ${prevStep.execution?.artifact_count || 0} artifacts.
 Quality: ${prevAnalysis.quality_score}/100
 Key outputs: ${prevAnalysis.key_outputs?.join(', ') || 'unknown'}
 Handoff notes: ${prevAnalysis.handoff_notes || 'none'}`;
  }
 }
 ```
 ### Step 3.3: Construct Analysis Prompt
 ```javascript
 // Ref: templates/step-analysis-prompt.md
 const depthInstructions = {
  quick: 'Provide brief assessment (3-5 bullet points). Focus on: execution success, output completeness, obvious issues.',
  standard: 'Provide detailed assessment. Cover: execution quality, output completeness, artifact quality, step-to-step handoff readiness, potential issues.',
  deep: 'Provide exhaustive assessment. Cover: execution quality, output completeness and correctness, artifact quality and structure, step-to-step handoff integrity, error handling, performance signals, architecture implications, edge cases.'
 };
 // ★ Build test requirements section (the evaluation baseline)
 const testReqs = step.test_requirements;
 let testReqSection = '';
 if (testReqs) {
  testReqSection = `
 TEST REQUIREMENTS (Acceptance Criteria — use these as the PRIMARY evaluation baseline):
  Pass Criteria: ${testReqs.pass_criteria}
  Expected Outputs: ${(testReqs.expected_outputs || []).join(', ') || 'not specified'}
  Content Signals (patterns that indicate success): ${(testReqs.content_signals || []).join(', ') || 'not specified'}
  Quality Thresholds: ${(testReqs.quality_thresholds || []).join(', ') || 'not specified'}
  Fail Signals (patterns that indicate failure): ${(testReqs.fail_signals || []).join(', ') || 'not specified'}
  Handoff Contract (what next step needs): ${testReqs.handoff_contract || 'not specified'}
 IMPORTANT: Score quality_score based on how well the actual output matches these test requirements.
  - 90-100: All pass_criteria met, all expected_outputs present, content_signals found, no fail_signals
  - 70-89: Most criteria met, minor gaps
  - 50-69: Partial match, significant gaps
  - 0-49: Fail — fail_signals present or pass_criteria not met`;
 } else {
  testReqSection = `
 NOTE: No pre-generated test requirements for this step. Evaluate based on general quality signals and workflow context.`;
 }
 const analysisPrompt = `PURPOSE: Evaluate workflow step "${step.name}" (step ${stepIdx + 1}/${state.steps.length}) against its acceptance criteria. Judge whether the command execution met the pre-defined test requirements.
 WORKFLOW CONTEXT:
 Name: ${state.workflow_name}
 Goal: ${state.workflow_context}
 Step Chain:
 ${stepChainContext}
 CURRENT STEP:
 Name: ${step.name}
 Type: ${step.type}
 Command: ${step.command}
 ${step.success_criteria ? `Success Criteria: ${step.success_criteria}` : ''}
 ${testReqSection}
 EXECUTION RESULT:
 ${execSummary}
 ${handoffContext}
 STEP ARTIFACTS:
 ${artifactSummary}
 ANALYSIS DEPTH: ${state.analysis_depth}
 ${depthInstructions[state.analysis_depth]}
 TASK:
 1. **Requirement Matching**: Compare actual output against test requirements (pass_criteria, expected_outputs, content_signals)
 2. **Fail Signal Detection**: Check for any fail_signals in the output
 3. **Handoff Contract Verification**: Does the output satisfy handoff_contract for the next step?
 4. **Gap Analysis**: What's missing between actual output and requirements?
 5. **Quality Score**: Rate 0-100 based on requirement fulfillment (NOT general quality)
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "quality_score": <0-100>,
  "requirement_match": {
    "pass": <true|false>,
    "criteria_met": ["<which pass_criteria were satisfied>"],
    "criteria_missed": ["<which pass_criteria were NOT satisfied>"],
    "expected_outputs_found": ["<expected files that exist>"],
    "expected_outputs_missing": ["<expected files that are absent>"],
    "content_signals_found": ["<success patterns detected in output>"],
    "content_signals_missing": ["<success patterns NOT found>"],
    "fail_signals_detected": ["<failure patterns found, if any>"]
  },
  "execution_assessment": {
    "success": <true|false>,
    "completeness": "<complete|partial|failed>",
    "notes": "<brief assessment>"
  },
  "artifact_assessment": {
    "count": <number>,
    "quality": "<high|medium|low>",
    "key_outputs": ["<main output 1>", "<main output 2>"],
    "missing_outputs": ["<expected but missing>"]
  },
  "handoff_assessment": {
    "ready": <true|false>,
    "contract_satisfied": <true|false|null>,
    "next_step_compatible": <true|false|null>,
    "handoff_notes": "<what next step should know>"
  },
  "issues": [
    { "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
  ],
  "optimization_opportunities": [
    { "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
  ],
  "step_summary": "<1-2 sentence summary for process log>"
 }
 CONSTRAINTS: Be specific, reference artifact content where possible, score against requirements not general quality, output ONLY JSON`;
 ```
 ### Step 3.4: Execute via ccw cli Gemini with Resume
 ```javascript
 function escapeForShell(str) {
  return str.replace(/"/g, '\\"').replace(/\$/g, '\\$').replace(/`/g, '\\`');
 }
 // Build CLI command with optional resume
 let cliCommand = `ccw cli -p "${escapeForShell(analysisPrompt)}" --tool gemini --mode analysis`;
 // Resume from previous step's analysis session (maintains context chain)
 if (state.analysis_session_id) {
  cliCommand += ` --resume ${state.analysis_session_id}`;
 }
 Bash({
  command: cliCommand,
  run_in_background: true,
  timeout: 300000  // 5 minutes
 });
 // STOP — wait for hook callback
 ```
 ### Step 3.5: Parse Results and Update Process Log
 After CLI completes:
 ```javascript
 const rawOutput = /* CLI output from callback */;
 // Extract session ID from CLI output for resume chain
 const sessionIdMatch = rawOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
 if (sessionIdMatch) {
  state.analysis_session_id = sessionIdMatch[1];
 }
 // Parse JSON
 const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
 let analysis;
 if (jsonMatch) {
  try {
    analysis = JSON.parse(jsonMatch[0]);
  } catch (e) {
    // Fallback: extract score heuristically
    const scoreMatch = rawOutput.match(/"quality_score"\s*:\s*(\d+)/);
    analysis = {
      quality_score: scoreMatch ? parseInt(scoreMatch[1]) : 50,
      execution_assessment: { success: step.execution.success, completeness: 'unknown', notes: 'Parse failed' },
      artifact_assessment: { count: manifest.artifacts.length, quality: 'unknown', key_outputs: [], missing_outputs: [] },
      handoff_assessment: { ready: true, next_step_compatible: null, handoff_notes: '' },
      issues: [{ severity: 'low', description: 'Analysis output parsing failed', suggestion: 'Review raw output' }],
      optimization_opportunities: [],
      step_summary: 'Analysis parsing failed — raw output saved for manual review'
    };
  }
 } else {
  analysis = {
    quality_score: 50,
    step_summary: 'No structured analysis output received'
  };
 }
 // Write step analysis file
 const reqMatch = analysis.requirement_match;
 const reqMatchSection = reqMatch ? `
 ## Requirement Match — ${reqMatch.pass ? 'PASS ✓' : 'FAIL ✗'}
 ### Criteria Met
 ${(reqMatch.criteria_met || []).map(c => `- ✓ ${c}`).join('\n') || '- None'}
 ### Criteria Missed
 ${(reqMatch.criteria_missed || []).map(c => `- ✗ ${c}`).join('\n') || '- None'}
 ### Expected Outputs
 - Found: ${(reqMatch.expected_outputs_found || []).join(', ') || 'None'}
 - Missing: ${(reqMatch.expected_outputs_missing || []).join(', ') || 'None'}
 ### Content Signals
 - Detected: ${(reqMatch.content_signals_found || []).join(', ') || 'None'}
 - Missing: ${(reqMatch.content_signals_missing || []).join(', ') || 'None'}
 ### Fail Signals
 ${(reqMatch.fail_signals_detected || []).length > 0
  ? (reqMatch.fail_signals_detected || []).map(f => `- ⚠ ${f}`).join('\n')
  : '- None detected'}
 ` : '';
 const stepAnalysisReport = `# Step ${stepIdx + 1} Analysis: ${step.name}
 **Quality Score**: ${analysis.quality_score}/100
 **Requirement Match**: ${reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A (no test requirements)'}
 **Date**: ${new Date().toISOString()}
 ${reqMatchSection}
 ## Execution
 - Success: ${analysis.execution_assessment?.success}
 - Completeness: ${analysis.execution_assessment?.completeness}
 - Notes: ${analysis.execution_assessment?.notes}
 ## Artifacts
 - Count: ${analysis.artifact_assessment?.count}
 - Quality: ${analysis.artifact_assessment?.quality}
 - Key Outputs: ${analysis.artifact_assessment?.key_outputs?.join(', ') || 'N/A'}
 - Missing: ${analysis.artifact_assessment?.missing_outputs?.join(', ') || 'None'}
 ## Handoff Readiness
 - Ready: ${analysis.handoff_assessment?.ready}
 - Contract Satisfied: ${analysis.handoff_assessment?.contract_satisfied}
 - Next Step Compatible: ${analysis.handoff_assessment?.next_step_compatible}
 - Notes: ${analysis.handoff_assessment?.handoff_notes}
 ## Issues
 ${(analysis.issues || []).map(i => `- [${i.severity}] ${i.description} → ${i.suggestion}`).join('\n') || 'None'}
 ## Optimization Opportunities
 ${(analysis.optimization_opportunities || []).map(o => `- [${o.impact}] ${o.area}: ${o.description}`).join('\n') || 'None'}
 `;
 Write(`${stepDir}/step-${stepIdx + 1}-analysis.md`, stepAnalysisReport);
 // Append to process log
 const reqPassStr = reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A';
 const processLogEntry = `
 ## Step ${stepIdx + 1}: ${step.name} — Score: ${analysis.quality_score}/100 | Req: ${reqPassStr}
 **Command**: \`${step.command}\`
 **Result**: ${analysis.execution_assessment?.completeness || 'unknown'} | ${analysis.artifact_assessment?.count || 0} artifacts
 **Requirement Match**: ${reqPassStr}${reqMatch ? ` — Met: ${(reqMatch.criteria_met || []).length}, Missed: ${(reqMatch.criteria_missed || []).length}, Fail Signals: ${(reqMatch.fail_signals_detected || []).length}` : ''}
 **Summary**: ${analysis.step_summary || 'No summary'}
 **Issues**: ${(analysis.issues || []).filter(i => i.severity === 'high').map(i => i.description).join('; ') || 'None critical'}
 **Handoff**: ${analysis.handoff_assessment?.contract_satisfied ? 'Contract satisfied' : analysis.handoff_assessment?.handoff_notes || 'Ready'}
 ---
 `;
 // Append to process-log.md
 const currentLog = Read(`${state.work_dir}/process-log.md`);
 Write(`${state.work_dir}/process-log.md`, currentLog + processLogEntry);
 // Update state
 state.steps[stepIdx].analysis = {
  quality_score: analysis.quality_score,
  requirement_pass: reqMatch?.pass ?? null,
  criteria_met_count: (reqMatch?.criteria_met || []).length,
  criteria_missed_count: (reqMatch?.criteria_missed || []).length,
  fail_signals_count: (reqMatch?.fail_signals_detected || []).length,
  key_outputs: analysis.artifact_assessment?.key_outputs || [],
  handoff_notes: analysis.handoff_assessment?.handoff_notes || '',
  contract_satisfied: analysis.handoff_assessment?.contract_satisfied ?? null,
  issue_count: (analysis.issues || []).length,
  high_issues: (analysis.issues || []).filter(i => i.severity === 'high').length,
  optimization_count: (analysis.optimization_opportunities || []).length,
  analysis_file: `${stepDir}/step-${stepIdx + 1}-analysis.md`
 };
 state.steps[stepIdx].status = 'analyzed';
 state.process_log_entries.push({
  step_index: stepIdx,
  step_name: step.name,
  quality_score: analysis.quality_score,
  summary: analysis.step_summary,
  timestamp: new Date().toISOString()
 });
 state.updated_at = new Date().toISOString();
 Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
 ```
 ## Error Handling
 | Error | Recovery |
 |-------|----------|
 | CLI timeout | Retry once without --resume (fresh session) |
 | Resume session not found | Start fresh analysis session, continue |
 | JSON parse fails | Extract score heuristically, save raw output |
 | No output | Default score 50, minimal process log entry |
 ## Output
 - **Files**: `step-{N}-analysis.md`, updated `process-log.md`
 - **State**: `steps[stepIdx].analysis` updated, `analysis_session_id` updated
 - **Next**: Phase 2 for next step, or Phase 4 (Synthesize) if all steps done
--- a/.claude/skills/workflow-tune/phases/04-synthesize.md
+++ b/.claude/skills/workflow-tune/phases/04-synthesize.md
@@ -0,0 +1,257 @@
 # Phase 4: Synthesize
 > **COMPACT SENTINEL [Phase 4: Synthesize]**
 > This phase contains 4 execution steps (Step 4.1 -- 4.4).
 > If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
 > Recovery: `Read("phases/04-synthesize.md")`
 Synthesize all step analyses into cross-step insights. Evaluates the workflow as a whole: step ordering, handoff quality, redundancy, bottlenecks, and overall coherence.
 ## Objective
 - Read complete process-log.md and all step analyses
 - Build synthesis prompt with full workflow context
 - Execute via ccw cli Gemini with resume chain
 - Generate cross-step optimization insights
 - Write synthesis.md
 ## Execution
 ### Step 4.1: Gather All Analyses
 ```javascript
 // Read process log
 const processLog = Read(`${state.work_dir}/process-log.md`);
 // Read all step analysis files
 const stepAnalyses = state.steps.map((step, i) => {
  const analysisFile = `${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md`;
  try {
    return { step: step.name, index: i, content: Read(analysisFile) };
  } catch {
    return { step: step.name, index: i, content: '[Analysis not available]' };
  }
 });
 // Build score summary
 const scoreSummary = state.steps.map((s, i) =>
  `Step ${i + 1} (${s.name}): ${s.analysis?.quality_score || '-'}/100 | Issues: ${s.analysis?.issue_count || 0} (${s.analysis?.high_issues || 0} high)`
 ).join('\n');
 // Compute aggregate stats
 const scores = state.steps.map(s => s.analysis?.quality_score).filter(Boolean);
 const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
 const minScore = scores.length > 0 ? Math.min(...scores) : 0;
 const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
 const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues || 0), 0);
 ```
 ### Step 4.2: Construct Synthesis Prompt
 ```javascript
 // Ref: templates/synthesis-prompt.md
 const synthesisPrompt = `PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
 WORKFLOW OVERVIEW:
 Name: ${state.workflow_name}
 Goal: ${state.workflow_context}
 Steps: ${state.steps.length}
 Average Quality: ${avgScore}/100
 Weakest Step: ${minScore}/100
 Total Issues: ${totalIssues} (${totalHighIssues} high severity)
 SCORE SUMMARY:
 ${scoreSummary}
 COMPLETE PROCESS LOG:
 ${processLog}
 DETAILED STEP ANALYSES:
 ${stepAnalyses.map(a => `### ${a.step} (Step ${a.index + 1})\n${a.content}`).join('\n\n---\n\n')}
 TASK:
 1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
 2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
 3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
 4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
 5. **Step Ordering**: Would reordering steps improve outcomes?
 6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
 7. **Per-Step Optimization**: Top 3 improvements per underperforming step
 8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "workflow_score": <0-100>,
  "coherence": {
    "score": <0-100>,
    "assessment": "<logical flow evaluation>",
    "gaps": ["<missing step or transition>"]
  },
  "handoff_quality": {
    "score": <0-100>,
    "issues": [
      { "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
    ]
  },
  "redundancy": {
    "found": <true|false>,
    "items": [
      { "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
    ]
  },
  "bottlenecks": [
    { "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
  ],
  "ordering_suggestions": [
    { "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
  ],
  "per_step_improvements": [
    { "step": "<step name>", "improvements": [
      { "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
    ]}
  ],
  "workflow_improvements": [
    { "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
  ],
  "summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
 }
 CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON`;
 ```
 ### Step 4.3: Execute via ccw cli Gemini with Resume
 ```javascript
 function escapeForShell(str) {
  return str.replace(/"/g, '\\"').replace(/\$/g, '\\$').replace(/`/g, '\\`');
 }
 let cliCommand = `ccw cli -p "${escapeForShell(synthesisPrompt)}" --tool gemini --mode analysis`;
 // Resume from the last step's analysis session
 if (state.analysis_session_id) {
  cliCommand += ` --resume ${state.analysis_session_id}`;
 }
 Bash({
  command: cliCommand,
  run_in_background: true,
  timeout: 300000
 });
 // STOP — wait for hook callback
 ```
 ### Step 4.4: Parse Results and Write Synthesis
 After CLI completes:
 ```javascript
 const rawOutput = /* CLI output from callback */;
 const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
 let synthesis;
 if (jsonMatch) {
  try {
    synthesis = JSON.parse(jsonMatch[0]);
  } catch {
    synthesis = {
      workflow_score: avgScore,
      summary: 'Synthesis parsing failed — individual step analyses available',
      workflow_improvements: [],
      per_step_improvements: [],
      bottlenecks: [],
      handoff_quality: { score: 0, issues: [] },
      coherence: { score: 0, assessment: 'Parse error' },
      redundancy: { found: false, items: [] },
      ordering_suggestions: []
    };
  }
 } else {
  synthesis = {
    workflow_score: avgScore,
    summary: 'No synthesis output received',
    workflow_improvements: [],
    per_step_improvements: []
  };
 }
 // Write synthesis report
 const synthesisReport = `# Workflow Synthesis
 **Workflow Score**: ${synthesis.workflow_score}/100
 **Date**: ${new Date().toISOString()}
 ## Executive Summary
 ${synthesis.summary}
 ## Coherence (${synthesis.coherence?.score || '-'}/100)
 ${synthesis.coherence?.assessment || 'N/A'}
 ${(synthesis.coherence?.gaps || []).length > 0 ? '\n### Gaps\n' + synthesis.coherence.gaps.map(g => `- ${g}`).join('\n') : ''}
 ## Handoff Quality (${synthesis.handoff_quality?.score || '-'}/100)
 ${(synthesis.handoff_quality?.issues || []).map(i =>
  `- **${i.from_step} → ${i.to_step}**: ${i.issue}\n  Fix: ${i.fix}`
 ).join('\n') || 'No handoff issues'}
 ## Redundancy
 ${synthesis.redundancy?.found ? (synthesis.redundancy.items || []).map(r =>
  `- Steps ${r.steps.join(', ')}: ${r.description} → ${r.recommendation}`
 ).join('\n') : 'No redundancy detected'}
 ## Bottlenecks
 ${(synthesis.bottlenecks || []).map(b =>
  `- **${b.step}** [${b.impact}]: ${b.reason}\n  Fix: ${b.fix}`
 ).join('\n') || 'No bottlenecks'}
 ## Ordering Suggestions
 ${(synthesis.ordering_suggestions || []).map(o =>
  `- Current: ${o.current}\n  Proposed: ${o.proposed}\n  Rationale: ${o.rationale}`
 ).join('\n') || 'Current ordering is optimal'}
 ## Per-Step Improvements
 ${(synthesis.per_step_improvements || []).map(s =>
  `### ${s.step}\n` + (s.improvements || []).map(i =>
    `- [${i.priority}] ${i.description} — ${i.rationale}`
  ).join('\n')
 ).join('\n\n') || 'No per-step improvements'}
 ## Workflow-Level Improvements
 ${(synthesis.workflow_improvements || []).map((w, i) =>
  `### ${i + 1}. [${w.priority}] ${w.description}\n- Category: ${w.category}\n- Rationale: ${w.rationale}\n- Affected: ${(w.affected_steps || []).join(', ')}`
 ).join('\n\n') || 'No workflow-level improvements'}
 `;
 Write(`${state.work_dir}/synthesis.md`, synthesisReport);
 // Update state
 state.synthesis = {
  workflow_score: synthesis.workflow_score,
  summary: synthesis.summary,
  improvement_count: (synthesis.workflow_improvements || []).length +
    (synthesis.per_step_improvements || []).reduce((sum, s) => sum + (s.improvements || []).length, 0),
  high_priority_count: (synthesis.workflow_improvements || []).filter(w => w.priority === 'high').length,
  bottleneck_count: (synthesis.bottlenecks || []).length,
  handoff_issue_count: (synthesis.handoff_quality?.issues || []).length,
  synthesis_file: `${state.work_dir}/synthesis.md`,
  raw_data: synthesis
 };
 state.updated_at = new Date().toISOString();
 Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
 ```
 ## Error Handling
 | Error | Recovery |
 |-------|----------|
 | CLI timeout | Generate synthesis from individual step analyses only (no cross-step) |
 | Resume fails | Start fresh analysis session |
 | JSON parse fails | Use step-level data to construct minimal synthesis |
 ## Output
 - **Files**: `synthesis.md`
 - **State**: `state.synthesis` updated
 - **Next**: Phase 5 (Optimization Report)
--- a/.claude/skills/workflow-tune/phases/05-optimize-report.md
+++ b/.claude/skills/workflow-tune/phases/05-optimize-report.md
@@ -0,0 +1,246 @@
 # Phase 5: Optimization Report
 > **COMPACT SENTINEL [Phase 5: Report]**
 > This phase contains 4 execution steps (Step 5.1 -- 5.4).
 > If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
 > Recovery: `Read("phases/05-optimize-report.md")`
 Generate the final optimization report and optionally apply high-priority fixes.
 ## Objective
 - Read complete state, process log, synthesis
 - Generate structured final report
 - Optionally apply auto-fix (if enabled)
 - Write final-report.md
 - Display summary to user
 ## Execution
 ### Step 5.1: Read Complete State
 ```javascript
 const state = JSON.parse(Read(`${state.work_dir}/workflow-state.json`));
 const processLog = Read(`${state.work_dir}/process-log.md`);
 const synthesis = state.synthesis;
 state.status = 'completed';
 state.updated_at = new Date().toISOString();
 ```
 ### Step 5.2: Generate Report
 ```javascript
 // Compute stats
 const scores = state.steps.map(s => s.analysis?.quality_score).filter(Boolean);
 const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
 const minStep = state.steps.reduce((min, s) =>
  (s.analysis?.quality_score || 100) < (min.analysis?.quality_score || 100) ? s : min
 , state.steps[0]);
 const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
 const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues || 0), 0);
 // Step quality table (with requirement match)
 const stepTable = state.steps.map((s, i) => {
  const reqPass = s.analysis?.requirement_pass;
  const reqStr = reqPass === true ? 'PASS' : reqPass === false ? 'FAIL' : '-';
  return `| ${i + 1} | ${s.name} | ${s.type} | ${s.execution?.success ? 'OK' : 'FAIL'} | ${reqStr} | ${s.analysis?.quality_score || '-'} | ${s.analysis?.issue_count || 0} | ${s.analysis?.high_issues || 0} |`;
 }).join('\n');
 // Collect all improvements (workflow-level + per-step)
 const allImprovements = [];
 if (synthesis?.raw_data?.workflow_improvements) {
  synthesis.raw_data.workflow_improvements.forEach(w => {
    allImprovements.push({
      scope: 'workflow',
      priority: w.priority,
      description: w.description,
      rationale: w.rationale,
      category: w.category,
      affected: w.affected_steps || []
    });
  });
 }
 if (synthesis?.raw_data?.per_step_improvements) {
  synthesis.raw_data.per_step_improvements.forEach(s => {
    (s.improvements || []).forEach(imp => {
      allImprovements.push({
        scope: s.step,
        priority: imp.priority,
        description: imp.description,
        rationale: imp.rationale,
        category: 'step',
        affected: [s.step]
      });
    });
  });
 }
 // Sort by priority
 const priorityOrder = { high: 0, medium: 1, low: 2 };
 allImprovements.sort((a, b) => (priorityOrder[a.priority] || 2) - (priorityOrder[b.priority] || 2));
 const report = `# Workflow Tune — Final Report
 ## Summary
 | Field | Value |
 |-------|-------|
 | **Workflow** | ${state.workflow_name} |
 | **Goal** | ${state.workflow_context} |
 | **Steps** | ${state.steps.length} |
 | **Workflow Score** | ${synthesis?.workflow_score || avgScore}/100 |
 | **Average Step Quality** | ${avgScore}/100 |
 | **Weakest Step** | ${minStep.name} (${minStep.analysis?.quality_score || '-'}/100) |
 | **Total Issues** | ${totalIssues} (${totalHighIssues} high severity) |
 | **Analysis Depth** | ${state.analysis_depth} |
 | **Started** | ${state.started_at} |
 | **Completed** | ${state.updated_at} |
 ## Step Quality Matrix
 | # | Step | Type | Exec | Req Match | Quality | Issues | High |
 |---|------|------|------|-----------|---------|--------|------|
 ${stepTable}
 ## Workflow Flow Assessment
 ### Coherence: ${synthesis?.raw_data?.coherence?.score || '-'}/100
 ${synthesis?.raw_data?.coherence?.assessment || 'Not evaluated'}
 ### Handoff Quality: ${synthesis?.raw_data?.handoff_quality?.score || '-'}/100
 ${(synthesis?.raw_data?.handoff_quality?.issues || []).map(i =>
  `- **${i.from_step} → ${i.to_step}**: ${i.issue}`
 ).join('\n') || 'No handoff issues'}
 ### Bottlenecks
 ${(synthesis?.raw_data?.bottlenecks || []).map(b =>
  `- **${b.step}** [${b.impact}]: ${b.reason}`
 ).join('\n') || 'No bottlenecks identified'}
 ## Optimization Recommendations
 ### Priority: HIGH
 ${allImprovements.filter(i => i.priority === 'high').map((i, idx) =>
  `${idx + 1}. **[${i.scope}]** ${i.description}\n   - Rationale: ${i.rationale}\n   - Affected: ${i.affected.join(', ')}`
 ).join('\n') || 'None'}
 ### Priority: MEDIUM
 ${allImprovements.filter(i => i.priority === 'medium').map((i, idx) =>
  `${idx + 1}. **[${i.scope}]** ${i.description}\n   - Rationale: ${i.rationale}`
 ).join('\n') || 'None'}
 ### Priority: LOW
 ${allImprovements.filter(i => i.priority === 'low').map((i, idx) =>
  `${idx + 1}. **[${i.scope}]** ${i.description}`
 ).join('\n') || 'None'}
 ## Process Documentation
 Full process log: \`${state.work_dir}/process-log.md\`
 Synthesis: \`${state.work_dir}/synthesis.md\`
 ### Per-Step Analysis Files
 | Step | Analysis File |
 |------|---------------|
 ${state.steps.map((s, i) =>
  `| ${s.name} | \`${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md\` |`
 ).join('\n')}
 ## Artifact Locations
 | Path | Description |
 |------|-------------|
 | \`${state.work_dir}/workflow-state.json\` | Complete state |
 | \`${state.work_dir}/process-log.md\` | Accumulated process log |
 | \`${state.work_dir}/synthesis.md\` | Cross-step synthesis |
 | \`${state.work_dir}/final-report.md\` | This report |
 `;
 Write(`${state.work_dir}/final-report.md`, report);
 ```
 ### Step 5.3: Optional Auto-Fix
 ```javascript
 if (state.auto_fix && allImprovements.filter(i => i.priority === 'high').length > 0) {
  const highPriorityFixes = allImprovements.filter(i => i.priority === 'high');
  // ★ Safety: confirm with user before applying auto-fixes
  const fixList = highPriorityFixes.map((f, i) =>
    `${i + 1}. [${f.scope}] ${f.description}\n   Affected: ${f.affected.join(', ')}`
  ).join('\n');
  const autoFixConfirm = AskUserQuestion({
    questions: [{
      question: `以下 ${highPriorityFixes.length} 项高优先级优化将被自动应用：\n\n${fixList}\n\n确认应用？`,
      header: "Auto-Fix Confirmation",
      multiSelect: false,
      options: [
        { label: "Apply (应用)", description: "自动应用以上高优先级修复" },
        { label: "Skip (跳过)", description: "跳过自动修复，仅保留报告" }
      ]
    }]
  });
  if (autoFixConfirm["Auto-Fix Confirmation"].startsWith("Skip")) {
    // Skip auto-fix, just log it
    state.auto_fix_skipped = true;
  } else {
  Agent({
    subagent_type: 'general-purpose',
    run_in_background: false,
    description: 'Apply high-priority workflow optimizations',
    prompt: `## Task: Apply High-Priority Workflow Optimizations
 You are applying the top optimization suggestions from a workflow analysis.
 ## Improvements to Apply (HIGH priority only)
 ${highPriorityFixes.map((f, i) =>
  `${i + 1}. [${f.scope}] ${f.description}\n   Rationale: ${f.rationale}\n   Affected: ${f.affected.join(', ')}`
 ).join('\n')}
 ## Workflow Steps
 ${state.steps.map((s, i) => `${i + 1}. ${s.name} (${s.type}): ${s.command}`).join('\n')}
 ## Rules
 1. Read each affected file BEFORE modifying
 2. Apply ONLY the high-priority suggestions
 3. Preserve existing code style
 4. Write a changes summary to: ${state.work_dir}/auto-fix-changes.md
 `
  });
  } // end Apply branch
 }
 ```
 ### Step 5.4: Display Summary
 Output to user:
 ```
 Workflow Tune Complete!
 Workflow: {name}
 Steps: {count}
 Workflow Score: {score}/100
 Average Step Quality: {avgScore}/100
 Weakest Step: {name} ({score}/100)
 Step Scores: {step1}={score1} → {step2}={score2} → ... → {stepN}={scoreN}
 Issues: {total} ({high} high priority)
 Improvements: {count} ({highCount} high priority)
 Full report: {workDir}/final-report.md
 Process log: {workDir}/process-log.md
 ```
 ## Output
 - **Files**: `final-report.md`, optionally `auto-fix-changes.md`
 - **State**: `status = completed`
 - **Next**: Workflow complete. Return control to user.
--- a/.claude/skills/workflow-tune/specs/workflow-eval-criteria.md
+++ b/.claude/skills/workflow-tune/specs/workflow-eval-criteria.md
@@ -0,0 +1,57 @@
 # Workflow Evaluation Criteria
 Workflow 调优评估标准，由 Phase 03 (Analyze Step) 和 Phase 04 (Synthesize) 引用。
 ## Per-Step Dimensions
 | Dimension | Description |
 |-----------|-------------|
 | Execution Success | 命令是否成功执行，退出码是否正确 |
 | Output Completeness | 产物是否齐全，预期文件是否生成 |
 | Artifact Quality | 产物内容质量 — 非空、格式正确、内容有意义 |
 | Handoff Readiness | 产物是否满足下一步的输入要求，格式兼容性 |
 ## Per-Step Scoring Guide
 | Range | Level | Description |
 |-------|-------|-------------|
 | 90-100 | Excellent | 执行完美，产物高质量，下游可直接消费 |
 | 80-89 | Good | 执行成功，产物基本完整，微调即可衔接 |
 | 70-79 | Adequate | 执行成功但产物有缺失或质量一般 |
 | 60-69 | Needs Work | 部分失败或产物质量差，衔接困难 |
 | 0-59 | Poor | 执行失败或产物无法使用 |
 ## Workflow-Level Dimensions
 | Dimension | Description |
 |-----------|-------------|
 | Coherence | 步骤间的逻辑顺序是否合理，是否形成完整流程 |
 | Handoff Quality | 步骤间的数据传递是否顺畅，格式是否匹配 |
 | Redundancy | 是否存在步骤间的工作重叠或重复 |
 | Efficiency | 整体流程是否高效，有无不必要的步骤 |
 | Completeness | 是否覆盖所有必要环节，有无遗漏 |
 ## Analysis Depth Profiles
 ### Quick
 - 每步 3-5 要点
 - 关注: 执行成功、产出完整、明显问题
 - 跨步骤: 基本衔接检查
 ### Standard
 - 每步详细评估
 - 关注: 执行质量、产出完整性、产物质量、衔接就绪度、潜在问题
 - 跨步骤: 衔接质量、冗余检测、瓶颈识别
 ### Deep
 - 每步深度审查
 - 关注: 执行质量、产出正确性、结构质量、衔接完整性、错误处理、性能信号、架构影响、边界情况
 - 跨步骤: 全面流程优化、重排建议、缺失步骤检测、架构改进
 ## Issue Severity Guide
 | Severity | Description | Example |
 |----------|-------------|---------|
 | High | 阻断流程或导致错误结果 | 步骤执行失败、产物格式不兼容、关键数据丢失 |
 | Medium | 影响质量但不阻断 | 产物不完整、衔接需手动调整、冗余步骤 |
 | Low | 可改进但不影响功能 | 输出格式不一致、可优化的步骤顺序 |
--- a/.claude/skills/workflow-tune/templates/step-analysis-prompt.md
+++ b/.claude/skills/workflow-tune/templates/step-analysis-prompt.md
@@ -0,0 +1,88 @@
 # Step Analysis Prompt Template
 Phase 03 使用此模板构造 ccw cli 提示词，让 Gemini 分析单个步骤的执行结果和产物质量。
 ## Template
 ```
 PURPOSE: Analyze the output of workflow step "${stepName}" (step ${stepIndex}/${totalSteps}) to assess quality, identify issues, and evaluate handoff readiness for the next step.
 WORKFLOW CONTEXT:
 Name: ${workflowName}
 Goal: ${workflowContext}
 Step Chain:
 ${stepChainContext}
 CURRENT STEP:
 Name: ${stepName}
 Type: ${stepType}
 Command: ${stepCommand}
 ${successCriteria}
 EXECUTION RESULT:
 ${execSummary}
 ${handoffContext}
 STEP ARTIFACTS:
 ${artifactSummary}
 ANALYSIS DEPTH: ${analysisDepth}
 ${depthInstructions}
 TASK:
 1. Assess step execution quality (did it succeed? complete output?)
 2. Evaluate artifact quality (content correctness, completeness, format)
 3. Check handoff readiness (can the next step consume this output?)
 4. Identify issues, risks, or optimization opportunities
 5. Rate overall step quality 0-100
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "quality_score": <0-100>,
  "execution_assessment": {
    "success": <true|false>,
    "completeness": "<complete|partial|failed>",
    "notes": "<brief assessment>"
  },
  "artifact_assessment": {
    "count": <number>,
    "quality": "<high|medium|low>",
    "key_outputs": ["<main output 1>", "<main output 2>"],
    "missing_outputs": ["<expected but missing>"]
  },
  "handoff_assessment": {
    "ready": <true|false>,
    "next_step_compatible": <true|false|null>,
    "handoff_notes": "<what next step should know>"
  },
  "issues": [
    { "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
  ],
  "optimization_opportunities": [
    { "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
  ],
  "step_summary": "<1-2 sentence summary for process log>"
 }
 CONSTRAINTS: Be specific, reference artifact content where possible, output ONLY JSON
 ```
 ## Variable Substitution
 | Variable | Source | Description |
 |----------|--------|-------------|
 | `${stepName}` | workflow-state.json | 当前步骤名称 |
 | `${stepIndex}` | orchestrator loop | 当前步骤序号 (1-based) |
 | `${totalSteps}` | workflow-state.json | 总步骤数 |
 | `${workflowName}` | workflow-state.json | Workflow 名称 |
 | `${workflowContext}` | workflow-state.json | Workflow 目标描述 |
 | `${stepChainContext}` | Phase 03 builds | 所有步骤状态概览 |
 | `${stepType}` | workflow-state.json | 步骤类型 (skill/ccw-cli/command) |
 | `${stepCommand}` | workflow-state.json | 步骤命令 |
 | `${successCriteria}` | workflow-state.json | 成功标准 (如有) |
 | `${execSummary}` | Phase 03 builds | 执行结果摘要 |
 | `${handoffContext}` | Phase 03 builds | 上一步的产出摘要 (非首步) |
 | `${artifactSummary}` | Phase 03 builds | 产物内容摘要 |
 | `${analysisDepth}` | workflow-state.json | 分析深度 (quick/standard/deep) |
 | `${depthInstructions}` | Phase 03 maps | 对应深度的分析指令 |
--- a/.claude/skills/workflow-tune/templates/synthesis-prompt.md
+++ b/.claude/skills/workflow-tune/templates/synthesis-prompt.md
@@ -0,0 +1,90 @@
 # Synthesis Prompt Template
 Phase 04 使用此模板构造 ccw cli 提示词，让 Gemini 综合所有步骤分析，产出跨步骤优化建议。
 ## Template
 ```
 PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
 WORKFLOW OVERVIEW:
 Name: ${workflowName}
 Goal: ${workflowContext}
 Steps: ${stepCount}
 Average Quality: ${avgScore}/100
 Weakest Step: ${minScore}/100
 Total Issues: ${totalIssues} (${totalHighIssues} high severity)
 SCORE SUMMARY:
 ${scoreSummary}
 COMPLETE PROCESS LOG:
 ${processLog}
 DETAILED STEP ANALYSES:
 ${stepAnalyses}
 TASK:
 1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
 2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
 3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
 4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
 5. **Step Ordering**: Would reordering steps improve outcomes?
 6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
 7. **Per-Step Optimization**: Top 3 improvements per underperforming step
 8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
 EXPECTED OUTPUT (strict JSON, no markdown):
 {
  "workflow_score": <0-100>,
  "coherence": {
    "score": <0-100>,
    "assessment": "<logical flow evaluation>",
    "gaps": ["<missing step or transition>"]
  },
  "handoff_quality": {
    "score": <0-100>,
    "issues": [
      { "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
    ]
  },
  "redundancy": {
    "found": <true|false>,
    "items": [
      { "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
    ]
  },
  "bottlenecks": [
    { "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
  ],
  "ordering_suggestions": [
    { "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
  ],
  "per_step_improvements": [
    { "step": "<step name>", "improvements": [
      { "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
    ]}
  ],
  "workflow_improvements": [
    { "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
  ],
  "summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
 }
 CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON
 ```
 ## Variable Substitution
 | Variable | Source | Description |
 |----------|--------|-------------|
 | `${workflowName}` | workflow-state.json | Workflow 名称 |
 | `${workflowContext}` | workflow-state.json | Workflow 目标 |
 | `${stepCount}` | workflow-state.json | 总步骤数 |
 | `${avgScore}` | Phase 04 computes | 所有步骤平均分 |
 | `${minScore}` | Phase 04 computes | 最低步骤分 |
 | `${totalIssues}` | Phase 04 computes | 总问题数 |
 | `${totalHighIssues}` | Phase 04 computes | 高优先级问题数 |
 | `${scoreSummary}` | Phase 04 builds | 每步分数一行 |
 | `${processLog}` | process-log.md | 完整过程日志 |
 | `${stepAnalyses}` | Phase 04 reads | 所有 step-N-analysis.md 内容 |
--- a/.codex/skills/analyze-with-file/EXECUTE.md
+++ b/.codex/skills/analyze-with-file/EXECUTE.md
@@ -1,716 +0,0 @@
 # Analyze Task Generation & Execution Spec
 > **Purpose**: Quality standards for task generation + execution specification for Phase 5 of `analyze-with-file`.
 > **Consumer**: Phase 5 of `analyze-with-file` workflow.
 > **Scope**: Task generation quality + direct inline execution.
 ---
 ## Task Generation Flow
 > **Entry point**: Routed here from SKILL.md Phase 5 when complexity is `complex` (≥3 recommendations or high-priority with dependencies).
 ```
 Step 1: Load context → Step 2: Generate .task/*.json → Step 3: Pre-execution analysis
    → Step 4: User confirmation → Step 5: Serial execution → Step 6: Finalize
 ```
 **Input artifacts** (all from session folder):
 | Artifact | Required | Provides |
 |----------|----------|----------|
 | `conclusions.json` | Yes | `recommendations[]` with action, rationale, priority, evidence_refs |
 | `exploration-codebase.json` | No | `relevant_files[]`, `patterns[]`, `constraints[]`, `integration_points[]` — primary source for file resolution |
 | `explorations.json` | No | `sources[]`, `key_findings[]` — fallback for file resolution |
 | `perspectives.json` | No | Multi-perspective findings — alternative to explorations.json |
 ---
 ## File Resolution Algorithm
 Target files are resolved with a 3-priority fallback chain. Recommendations carry only `evidence_refs` — file resolution is EXECUTE.md's responsibility:
 ```javascript
 function resolveTargetFiles(rec, codebaseContext, explorations) {
  // Priority 1: Extract file paths from evidence_refs (e.g., "src/auth/token.ts:89")
  if (rec.evidence_refs?.length) {
    const filePaths = [...new Set(
      rec.evidence_refs
        .filter(ref => ref.includes('/') || ref.includes('.'))
        .map(ref => ref.split(':')[0])
    )]
    if (filePaths.length) {
      return filePaths.map(path => ({
        path,
        action: 'modify',
        target: null,
        changes: []
      }))
    }
  }
  // Priority 2: Match from exploration-codebase.json relevant_files
  if (codebaseContext?.relevant_files?.length) {
    const keywords = extractKeywords(rec.action + ' ' + rec.rationale)
    const matched = codebaseContext.relevant_files.filter(f =>
      keywords.some(kw =>
        f.path.toLowerCase().includes(kw) ||
        f.summary?.toLowerCase().includes(kw) ||
        f.relevance?.toLowerCase().includes(kw)
      )
    )
    if (matched.length) {
      return matched.map(f => ({
        path: f.path,
        action: 'modify',
        target: null,
        changes: rec.changes || []
      }))
    }
  }
  // Priority 3: Match from explorations.json sources
  if (explorations?.sources?.length) {
    const actionVerb = rec.action.split(' ')[0].toLowerCase()
    const matched = explorations.sources.filter(s =>
      s.summary?.toLowerCase().includes(actionVerb) ||
      s.file?.includes(actionVerb)
    )
    if (matched.length) {
      return matched.map(s => ({
        path: s.file,
        action: 'modify',
        target: null,
        changes: []
      }))
    }
  }
  // Fallback: empty array — task relies on description + implementation for guidance
  return []
 }
 function extractKeywords(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\u4e00-\u9fa5\s]/g, ' ')
    .split(/\s+/)
    .filter(w => w.length > 2)
    .filter(w => !['the', 'and', 'for', 'with', 'from', 'that', 'this'].includes(w))
 }
 ```
 ---
 ## Task Type Inference
 | Recommendation Pattern | Inferred Type |
 |------------------------|---------------|
 | fix, resolve, repair, patch, correct | `fix` |
 | refactor, restructure, extract, reorganize, decouple | `refactor` |
 | add, implement, create, build, introduce | `feature` |
 | improve, optimize, enhance, upgrade, streamline | `enhancement` |
 | test, coverage, validate, verify, assert | `testing` |
 ```javascript
 function inferTaskType(rec) {
  const text = (rec.action + ' ' + rec.rationale).toLowerCase()
  const patterns = [
    { type: 'fix',         keywords: ['fix', 'resolve', 'repair', 'patch', 'correct', 'bug'] },
    { type: 'refactor',    keywords: ['refactor', 'restructure', 'extract', 'reorganize', 'decouple'] },
    { type: 'feature',     keywords: ['add', 'implement', 'create', 'build', 'introduce'] },
    { type: 'enhancement', keywords: ['improve', 'optimize', 'enhance', 'upgrade', 'streamline'] },
    { type: 'testing',     keywords: ['test', 'coverage', 'validate', 'verify', 'assert'] }
  ]
  for (const p of patterns) {
    if (p.keywords.some(kw => text.includes(kw))) return p.type
  }
  return 'enhancement'  // safe default
 }
 ```
 ## Effort Inference
 | Signal | Effort |
 |--------|--------|
 | priority=high AND files >= 3 | `large` |
 | priority=high OR files=2 | `medium` |
 | priority=medium AND files <= 1 | `medium` |
 | priority=low OR single file | `small` |
 ```javascript
 function inferEffort(rec, targetFiles) {
  const fileCount = targetFiles?.length || 0
  if (rec.priority === 'high' && fileCount >= 3) return 'large'
  if (rec.priority === 'high' || fileCount >= 2) return 'medium'
  if (rec.priority === 'low' || fileCount <= 1) return 'small'
  return 'medium'
 }
 ```
 ---
 ## Convergence Quality Validation
 Every task's `convergence` MUST pass quality gates before writing to disk.
 ### Quality Rules
 | Field | Requirement | Validation |
 |-------|-------------|------------|
 | `criteria[]` | **Testable** — assertions or concrete manual steps | Reject vague patterns; each criterion must reference observable behavior |
 | `verification` | **Executable** — shell command or explicit step sequence | Must contain a runnable command or step-by-step verification procedure |
 | `definition_of_done` | **Business language** — non-technical stakeholder can judge | Must NOT contain technical commands (jest, tsc, npm, build) |
 ### Vague Pattern Detection
 ```javascript
 const VAGUE_PATTERNS = /正常|正确|好|可以|没问题|works|fine|good|correct|properly|as expected/i
 const TECHNICAL_IN_DOD = /compile|build|lint|npm|npx|jest|tsc|eslint|cargo|pytest|go test/i
 function validateConvergenceQuality(tasks) {
  const issues = []
  tasks.forEach(task => {
    // Rule 1: No vague criteria
    task.convergence.criteria.forEach((c, i) => {
      if (VAGUE_PATTERNS.test(c) && c.length < 20) {
        issues.push({
          task: task.id, field: `criteria[${i}]`,
          problem: 'Vague criterion', value: c,
          fix: 'Replace with specific observable condition from evidence'
        })
      }
    })
    // Rule 2: Verification should be executable
    if (task.convergence.verification && task.convergence.verification.length < 5) {
      issues.push({
        task: task.id, field: 'verification',
        problem: 'Too short to be executable', value: task.convergence.verification,
        fix: 'Provide shell command or numbered step sequence'
      })
    }
    // Rule 3: DoD should be business language
    if (TECHNICAL_IN_DOD.test(task.convergence.definition_of_done)) {
      issues.push({
        task: task.id, field: 'definition_of_done',
        problem: 'Contains technical commands', value: task.convergence.definition_of_done,
        fix: 'Rewrite in business language describing user/system outcome'
      })
    }
    // Rule 4: files[].changes should not be empty when files exist
    task.files?.forEach((f, i) => {
      if (f.action === 'modify' && (!f.changes || f.changes.length === 0) && !f.change) {
        issues.push({
          task: task.id, field: `files[${i}].changes`,
          problem: 'No change description for modify action', value: f.path,
          fix: 'Describe what specifically changes in this file'
        })
      }
    })
    // Rule 5: implementation steps should exist
    if (!task.implementation || task.implementation.length === 0) {
      issues.push({
        task: task.id, field: 'implementation',
        problem: 'No implementation steps',
        fix: 'Add at least one step describing how to realize this task'
      })
    }
  })
  // Auto-fix where possible, log remaining issues
  issues.forEach(issue => {
    // Attempt auto-fix based on available evidence
    // If unfixable, log warning — task still generated but flagged
  })
  return issues
 }
 ```
 ### Good vs Bad Examples
 **Criteria**:
 | Bad | Good |
 |-----|------|
 | `"Code works correctly"` | `"refreshToken() returns a new JWT with >0 expiry when called with expired token"` |
 | `"No errors"` | `"Error handler at auth.ts:45 returns 401 status with { error: 'token_expired' } body"` |
 | `"Performance is good"` | `"API response time < 200ms at p95 for /api/users endpoint under 100 concurrent requests"` |
 **Verification**:
 | Bad | Good |
 |-----|------|
 | `"Check it"` | `"jest --testPathPattern=auth.test.ts && npx tsc --noEmit"` |
 | `"Run tests"` | `"1. Run npm test -- --grep 'token refresh' 2. Verify no TypeScript errors with npx tsc --noEmit"` |
 **Definition of Done**:
 | Bad | Good |
 |-----|------|
 | `"jest passes"` | `"Users remain logged in across token expiration without manual re-login"` |
 | `"No TypeScript errors"` | `"Authentication flow handles all user-facing error scenarios with clear error messages"` |
 ---
 ## Required Task Fields (analyze-with-file producer)
 SKILL.md produces minimal recommendations `{action, rationale, priority, evidence_refs}`. EXECUTE.md enriches these into full task JSON. The final `.task/*.json` MUST populate:
 | Block | Fields | Required |
 |-------|--------|----------|
 | IDENTITY | `id`, `title`, `description` | Yes |
 | CLASSIFICATION | `type`, `priority`, `effort` | Yes |
 | DEPENDENCIES | `depends_on` | Yes (empty array if none) |
 | CONVERGENCE | `convergence.criteria[]`, `convergence.verification`, `convergence.definition_of_done` | Yes |
 | FILES | `files[].path`, `files[].action`, `files[].changes`/`files[].change` | Yes (if files identified) |
 | IMPLEMENTATION | `implementation[]` with step + description | Yes |
 | CONTEXT | `evidence`, `source.tool`, `source.session_id`, `source.original_id` | Yes |
 ### Task JSON Example
 ```json
 {
  "id": "TASK-001",
  "title": "Fix authentication token refresh",
  "description": "Token refresh fails silently when JWT expires, causing users to be logged out unexpectedly",
  "type": "fix",
  "priority": "high",
  "effort": "medium",
  "files": [
    {
      "path": "src/auth/token.ts",
      "action": "modify",
      "target": "refreshToken",
      "changes": [
        "Add await to refreshToken() call at line 89",
        "Add error propagation for refresh failure"
      ],
      "change": "Add await to refreshToken() call and propagate errors"
    },
    {
      "path": "src/middleware/auth.ts",
      "action": "modify",
      "target": "authMiddleware",
      "changes": [
        "Update error handler at line 45 to distinguish refresh failures from auth failures"
      ],
      "change": "Update error handler to propagate refresh failures"
    }
  ],
  "depends_on": [],
  "convergence": {
    "criteria": [
      "refreshToken() returns new valid JWT when called with expired token",
      "Expired token triggers automatic refresh without user action",
      "Failed refresh returns 401 with { error: 'token_expired' } body"
    ],
    "verification": "jest --testPathPattern=token.test.ts && npx tsc --noEmit",
    "definition_of_done": "Users remain logged in across token expiration without manual re-login"
  },
  "implementation": [
    {
      "step": "1",
      "description": "Add await to refreshToken() call in token.ts",
      "actions": ["Read token.ts", "Add await keyword at line 89", "Verify async chain"]
    },
    {
      "step": "2",
      "description": "Update error handler in auth middleware",
      "actions": ["Read auth.ts", "Modify error handler at line 45", "Add refresh-specific error type"]
    }
  ],
  "evidence": ["src/auth/token.ts:89", "src/middleware/auth.ts:45"],
  "source": {
    "tool": "analyze-with-file",
    "session_id": "ANL-auth-token-refresh-2025-01-21",
    "original_id": "TASK-001"
  }
 }
 ```
 ---
 ## Step 1: Load All Context Sources
 Phase 2-4 already loaded and processed these artifacts. If data is still in conversation memory, skip disk reads.
 ```javascript
 // Skip loading if already in memory from Phase 2-4
 // Only read from disk when entering EXECUTE.md from a fresh/resumed session
 if (!conclusions) {
  conclusions = JSON.parse(Read(`${sessionFolder}/conclusions.json`))
 }
 if (!codebaseContext) {
  codebaseContext = file_exists(`${sessionFolder}/exploration-codebase.json`)
    ? JSON.parse(Read(`${sessionFolder}/exploration-codebase.json`))
    : null
 }
 if (!explorations) {
  explorations = file_exists(`${sessionFolder}/explorations.json`)
    ? JSON.parse(Read(`${sessionFolder}/explorations.json`))
    : file_exists(`${sessionFolder}/perspectives.json`)
      ? JSON.parse(Read(`${sessionFolder}/perspectives.json`))
      : null
 }
 ```
 ## Step 2: Enrich Recommendations & Generate .task/*.json
 SKILL.md Phase 4 produces minimal recommendations: `{action, rationale, priority, evidence_refs}`.
 This step enriches each recommendation with execution-specific details using codebase context, then generates individual task JSON files.
 **Enrichment pipeline**: `rec (minimal) + codebaseContext + explorations → task JSON (full)`
 ```javascript
 const tasks = conclusions.recommendations.map((rec, index) => {
  const taskId = `TASK-${String(index + 1).padStart(3, '0')}`
  // 1. ENRICH: Resolve target files from codebase context (not from rec)
  const targetFiles = resolveTargetFiles(rec, codebaseContext, explorations)
  // 2. ENRICH: Generate implementation steps from action + context
  const implSteps = generateImplementationSteps(rec, targetFiles, codebaseContext)
  // 3. ENRICH: Derive change descriptions per file
  const enrichedFiles = targetFiles.map(f => ({
    path: f.path,
    action: f.action || 'modify',
    target: f.target || null,
    changes: deriveChanges(rec, f, codebaseContext) || [],
    change: rec.action
  }))
  return {
    id: taskId,
    title: rec.action,
    description: rec.rationale,
    type: inferTaskType(rec),
    priority: rec.priority,
    effort: inferEffort(rec, targetFiles),
    files: enrichedFiles,
    depends_on: [],
    // CONVERGENCE (must pass quality validation)
    convergence: {
      criteria: generateCriteria(rec),
      verification: generateVerification(rec),
      definition_of_done: generateDoD(rec)
    },
    // IMPLEMENTATION steps (generated here, not from SKILL.md)
    implementation: implSteps,
    // CONTEXT
    evidence: rec.evidence_refs || [],
    source: {
      tool: 'analyze-with-file',
      session_id: sessionId,
      original_id: taskId
    }
  }
 })
 // Quality validation
 validateConvergenceQuality(tasks)
 // Write each task as individual JSON file
 Bash(`mkdir -p ${sessionFolder}/.task`)
 tasks.forEach(task => {
  Write(`${sessionFolder}/.task/${task.id}.json`, JSON.stringify(task, null, 2))
 })
 ```
 **Enrichment Functions**:
 ```javascript
 // Generate implementation steps from action + resolved files
 function generateImplementationSteps(rec, targetFiles, codebaseContext) {
  // 1. Parse rec.action into atomic steps
  // 2. Map steps to target files
  // 3. Add context from codebaseContext.patterns if applicable
  // Return: [{step: '1', description: '...', actions: [...]}]
  return [{
    step: '1',
    description: rec.action,
    actions: targetFiles.map(f => `Modify ${f.path}`)
  }]
 }
 // Derive specific change descriptions for a file
 function deriveChanges(rec, file, codebaseContext) {
  // 1. Match rec.action keywords to file content patterns
  // 2. Use codebaseContext.patterns for context-aware change descriptions
  // 3. Use rec.evidence_refs to locate specific modification points
  // Return: ['specific change 1', 'specific change 2']
  return [rec.action]
 }
 ```
 ## Step 3-6: Execution Steps
 After `.task/*.json` generation, validate and execute tasks directly inline.
 ### Step 3: Pre-Execution Analysis
 ```javascript
 const taskFiles = Glob(`${sessionFolder}/.task/*.json`)
 const tasks = taskFiles.map(f => JSON.parse(Read(f)))
 // 1. Dependency validation
 const taskIds = new Set(tasks.map(t => t.id))
 const errors = []
 tasks.forEach(task => {
  task.depends_on.forEach(dep => {
    if (!taskIds.has(dep)) errors.push(`${task.id}: depends on unknown task ${dep}`)
  })
 })
 // 2. Circular dependency detection (DFS)
 function detectCycles(tasks) {
  const graph = new Map(tasks.map(t => [t.id, t.depends_on]))
  const visited = new Set(), inStack = new Set(), cycles = []
  function dfs(node, path) {
    if (inStack.has(node)) { cycles.push([...path, node].join(' → ')); return }
    if (visited.has(node)) return
    visited.add(node); inStack.add(node)
    ;(graph.get(node) || []).forEach(dep => dfs(dep, [...path, node]))
    inStack.delete(node)
  }
  tasks.forEach(t => { if (!visited.has(t.id)) dfs(t.id, []) })
  return cycles
 }
 // 3. Topological sort for execution order
 function topoSort(tasks) {
  const inDegree = new Map(tasks.map(t => [t.id, 0]))
  tasks.forEach(t => t.depends_on.forEach(dep => {
    inDegree.set(t.id, inDegree.get(t.id) + 1)
  }))
  const queue = tasks.filter(t => inDegree.get(t.id) === 0).map(t => t.id)
  const order = []
  while (queue.length) {
    const id = queue.shift()
    order.push(id)
    tasks.forEach(t => {
      if (t.depends_on.includes(id)) {
        inDegree.set(t.id, inDegree.get(t.id) - 1)
        if (inDegree.get(t.id) === 0) queue.push(t.id)
      }
    })
  }
  return order
 }
 // 4. File conflict detection
 const fileTaskMap = new Map()
 tasks.forEach(task => {
  (task.files || []).forEach(f => {
    if (!fileTaskMap.has(f.path)) fileTaskMap.set(f.path, [])
    fileTaskMap.get(f.path).push(task.id)
  })
 })
 const conflicts = []
 fileTaskMap.forEach((taskIds, file) => {
  if (taskIds.length > 1) conflicts.push({ file, tasks: taskIds })
 })
 ```
 ### Step 4: Initialize Execution Artifacts
 ```javascript
 // execution.md — overview with task table
 const executionMd = `# Execution Overview
 ## Session Info
 - **Session ID**: ${sessionId}
 - **Plan Source**: .task/*.json (from analysis conclusions)
 - **Started**: ${getUtc8ISOString()}
 - **Total Tasks**: ${tasks.length}
 ## Task Overview
 | # | ID | Title | Type | Priority | Status |
 |---|-----|-------|------|----------|--------|
 ${tasks.map((t, i) => `| ${i+1} | ${t.id} | ${t.title} | ${t.type} | ${t.priority} | pending |`).join('\n')}
 ## Pre-Execution Analysis
 ${conflicts.length
  ? `### File Conflicts\n${conflicts.map(c => `- **${c.file}**: ${c.tasks.join(', ')}`).join('\n')}`
  : 'No file conflicts detected.'}
 ## Execution Timeline
 > Updated as tasks complete
 `
 Write(`${sessionFolder}/execution.md`, executionMd)
 // execution-events.md — chronological event log
 Write(`${sessionFolder}/execution-events.md`,
  `# Execution Events\n\n**Session**: ${sessionId}\n**Started**: ${getUtc8ISOString()}\n\n---\n\n`)
 ```
 ### Step 5: Task Execution Loop
 **User Confirmation** before execution:
 ```javascript
 if (!autoYes) {
  const action = AskUserQuestion({
    questions: [{
      question: `Execute ${tasks.length} tasks?\n${tasks.map(t => `  ${t.id}: ${t.title} (${t.priority})`).join('\n')}`,
      header: "Confirm",
      multiSelect: false,
      options: [
        { label: "Start", description: "Execute all tasks serially" },
        { label: "Adjust", description: "Modify .task/*.json before execution" },
        { label: "Skip", description: "Keep .task/*.json, skip execution" }
      ]
    }]
  })
  // "Adjust": user edits task files, then resumes
  // "Skip": end — user can execute later separately
 }
 ```
 Execute tasks serially using `task.implementation` steps and `task.files[].changes` as guidance.
 ```
 For each taskId in executionOrder:
  ├─ Load task from .task/{taskId}.json
  ├─ Check dependencies satisfied
  ├─ Record START event → execution-events.md
  ├─ Execute using task.implementation + task.files[].changes:
  │   ├─ Read target files listed in task.files[]
  │   ├─ Apply modifications described in files[].changes / files[].change
  │   ├─ Follow implementation[].actions sequence
  │   └─ Use Edit (preferred), Write (new files), Bash (build/test)
  ├─ Verify convergence:
  │   ├─ Check each convergence.criteria[] item
  │   ├─ Run convergence.verification (if executable command)
  │   └─ Record verification results
  ├─ Record COMPLETE/FAIL event → execution-events.md
  ├─ Update execution.md task status
  └─ Continue to next task
 ```
 **Execution Guidance Priority** — what the AI follows when executing each task:
 | Priority | Source | Example |
 |----------|--------|---------|
 | 1 | `files[].changes` / `files[].change` | "Add await to refreshToken() call at line 89" |
 | 2 | `implementation[].actions` | ["Read token.ts", "Add await keyword at line 89"] |
 | 3 | `implementation[].description` | "Add await to refreshToken() call in token.ts" |
 | 4 | `task.description` | "Token refresh fails silently..." |
 When `files[].changes` is populated, the AI has concrete instructions. When empty, it falls back to `implementation` steps, then to `description`.
 ### Step 5.1: Failure Handling
 ```javascript
 // On task failure, ask user how to proceed
 if (!autoYes) {
  AskUserQuestion({
    questions: [{
      question: `Task ${task.id} failed: ${errorMessage}\nHow to proceed?`,
      header: "Failure",
      multiSelect: false,
      options: [
        { label: "Skip & Continue", description: "Skip this task, continue with next" },
        { label: "Retry", description: "Retry this task" },
        { label: "Abort", description: "Stop execution, keep progress" }
      ]
    }]
  })
 }
 ```
 ### Step 6: Finalize
 After all tasks complete:
 1. Append execution summary to `execution.md` (statistics, task results table)
 2. Append session footer to `execution-events.md`
 3. Write back `_execution` state to each `.task/*.json`:
 ```javascript
 tasks.forEach(task => {
  const updated = {
    ...task,
    status: task._status,           // "completed" | "failed" | "skipped"
    executed_at: task._executed_at,
    result: {
      success: task._status === 'completed',
      files_modified: task._result?.files_modified || [],
      summary: task._result?.summary || '',
      error: task._result?.error || null,
      convergence_verified: task._result?.convergence_verified || []
    }
  }
  Write(`${sessionFolder}/.task/${task.id}.json`, JSON.stringify(updated, null, 2))
 })
 ```
 ### Step 6.1: Post-Execution Options
 ```javascript
 if (!autoYes) {
  AskUserQuestion({
    questions: [{
      question: `Execution complete: ${completedTasks.size}/${tasks.length} succeeded. Next:`,
      header: "Post-Execute",
      multiSelect: false,
      options: [
        { label: "Retry Failed", description: `Re-execute ${failedTasks.size} failed tasks` },
        { label: "View Events", description: "Display execution-events.md" },
        { label: "Create Issue", description: "Create issue from failed tasks" },
        { label: "Done", description: "End workflow" }
      ]
    }]
  })
 }
 ```
 ---
 ## Output Structure
 ```
 {sessionFolder}/
 ├── .task/                     # Individual task JSON files (with _execution state after completion)
 │   ├── TASK-001.json
 │   └── ...
 ├── execution.md               # Execution overview + task table + summary
 └── execution-events.md        # Chronological event log
 ```
 ## execution-events.md Event Format
 ```markdown
 ## {timestamp} — {task.id}: {task.title}
 **Type**: {task.type} | **Priority**: {task.priority}
 **Status**: IN PROGRESS
 **Files**: {task.files[].path}
 ### Execution Log
 - Read {file} ({lines} lines)
 - Applied: {change description}
 - ...
 **Status**: COMPLETED / FAILED
 **Files Modified**: {list}
 #### Convergence Verification
 - [x/] {criterion 1}
 - [x/] {criterion 2}
 - **Verification**: {command} → PASS/FAIL
 ---
 ```
--- a/.codex/skills/analyze-with-file/SKILL.md
+++ b/.codex/skills/analyze-with-file/SKILL.md
@@ -10,14 +10,14 @@ argument-hint: "TOPIC=\"<question or topic>\" [--depth=quick|standard|deep] [--c
 Interactive collaborative analysis workflow with **documented discussion process**. Records understanding evolution, facilitates multi-round Q&A, and uses inline search tools for deep exploration.
-**Core workflow**: Topic → Explore → Discuss → Document → Refine → Conclude → (Optional) Quick Execute
+**Core workflow**: Topic → Explore → Discuss → Document → Refine → Conclude → Plan Checklist
 **Key features**:
 - **Documented discussion timeline**: Captures understanding evolution across all phases
 - **Decision recording at every critical point**: Mandatory recording of key findings, direction changes, and trade-offs
 - **Multi-perspective analysis**: Supports up to 4 analysis perspectives (serial, inline)
 - **Interactive discussion**: Multi-round Q&A with user feedback and direction adjustments
- **Quick execute**: Convert conclusions directly to executable tasks
+- **Plan output**: Generate structured plan checklist for downstream execution (e.g., `$csv-wave-pipeline`)
 ### Decision Recording Protocol
@@ -96,6 +96,7 @@ Step 1: Topic Understanding
 Step 2: Exploration (Inline, No Agents)
   ├─ Detect codebase → search relevant modules, patterns
   │   ├─ Run `ccw spec load --category exploration` (if spec system available)
   │   ├─ Run `ccw spec load --category debug` (known issues and root-cause notes)
   │   └─ Use Grep, Glob, Read, mcp__ace-tool__search_context
   ├─ Multi-perspective analysis (if selected, serial)
   │   ├─ Single: Comprehensive analysis
@@ -128,17 +129,11 @@ Step 4: Synthesis & Conclusion
   ├─ Consolidate all insights → conclusions.json (with steps[] per recommendation)
   ├─ Update discussion.md with final synthesis
   ├─ Interactive Recommendation Review (per-recommendation confirm/modify/reject)
-   └─ Offer options: quick execute / create issue / generate task / export / done
+   └─ Offer options: generate plan / create issue / export / done
-Step 5: Execute (Optional - user selects, routes by complexity)
+Step 5: Plan Generation (Optional - produces plan only, NO code modifications)
-   ├─ Simple (≤2 recs): Direct inline execution → summary in discussion.md
+   ├─ Generate inline plan checklist → appended to discussion.md
-   └─ Complex (≥3 recs): EXECUTE.md pipeline
+   └─ Remind user to execute via $csv-wave-pipeline
      ├─ Enrich recommendations → generate .task/TASK-*.json
      ├─ Pre-execution analysis (dependencies, file conflicts, execution order)
      ├─ User confirmation
      ├─ Direct inline execution (Read/Edit/Write/Grep/Glob/Bash)
      ├─ Record events → execution-events.md, update execution.md
      └─ Report completion summary
 ```
 ## Configuration
@@ -326,6 +321,7 @@ const hasCodebase = Bash(`
 if (hasCodebase !== 'none') {
  // 1. Read project metadata (if exists)
  //    - Run `ccw spec load --category exploration` (load project specs)
  //    - Run `ccw spec load --category debug` (known issues and root-cause notes)
  //    - .workflow/specs/*.md (project conventions)
  // 2. Search codebase for relevant content
@@ -817,7 +813,7 @@ for (const [index, rec] of sortedRecs.entries()) {
 ##### Step 4.4: Post-Completion Options
-**Complexity Assessment** — determine whether .task/*.json generation is warranted:
+**Complexity Assessment** — determine available options:
 ```javascript
 // Assess recommendation complexity to decide available options
@@ -833,9 +829,9 @@ function assessComplexity(recs) {
 // Complexity → available options mapping:
 //   none:    Done | Create Issue | Export Report
-//   simple:  Done | Create Issue | Export Report (no task generation — overkill)
+//   simple:  Done | Create Issue | Export Report
-//   moderate: Done | Generate Task | Create Issue | Export Report
+//   moderate: Generate Plan | Create Issue | Export Report | Done
-//   complex:  Quick Execute | Generate Task | Create Issue | Export Report | Done
+//   complex:  Generate Plan | Create Issue | Export Report | Done
 ```
 ```javascript
@@ -850,9 +846,9 @@ if (!autoYes) {
    }]
  })
 } else {
-  // Auto mode: generate .task/*.json only for moderate/complex, skip for simple/none
+  // Auto mode: generate plan only for moderate/complex, skip for simple/none
  if (complexity === 'complex' || complexity === 'moderate') {
-    // → Phase 5 Step 5.1-5.2 (task generation only, no execution)
+    // → Phase 5 (plan generation only, NO code modifications)
  } else {
    // → Done (conclusions.json is sufficient output)
  }
@@ -865,14 +861,13 @@ if (!autoYes) {
 |------------|-------------------|-----------|
 | `none` | Done, Create Issue, Export Report | No actionable recommendations |
 | `simple` | Done, Create Issue, Export Report | 1-2 low-priority items don't warrant formal task JSON |
-| `moderate` | Generate Task, Create Issue, Export Report, Done | Task structure helpful but execution not urgent |
+| `moderate` | Generate Plan, Create Issue, Export Report, Done | Task structure helpful for downstream execution |
-| `complex` | Quick Execute, Generate Task, Create Issue, Export Report, Done | Full pipeline justified |
+| `complex` | Generate Plan, Create Issue, Export Report, Done | Full plan generation justified |
 | Selection | Action |
 |-----------|--------|
-| Quick Execute | Jump to Phase 5 (only reviewed recs with status accepted/modified) |
+| Generate Plan | Jump to Phase 5 (plan generation only, NO code modifications) |
 | Create Issue | `Skill(skill="issue:new", args="...")` (only reviewed recs) |
 | Generate Task | Jump to Phase 5 Step 5.1-5.2 only (generate .task/*.json, no execution) |
 | Export Report | Copy discussion.md + conclusions.json to user-specified location |
 | Done | Display artifact paths, end |
@@ -883,96 +878,64 @@ if (!autoYes) {
 - User offered meaningful next step options
 - **Complete decision trail** documented and traceable from initial scoping to final conclusions
-### Phase 5: Execute (Optional)
+### Phase 5: Plan Generation (Optional — NO code modifications)
-**Objective**: Execute analysis recommendations — route by complexity.
+**Objective**: Generate structured plan checklist from analysis recommendations. **This phase produces plans only — it does NOT modify any source code.**
-**Trigger**: User selects "Quick Execute" in Phase 4. In auto mode, triggered only for `moderate`/`complex` recommendations.
+**Trigger**: User selects "Generate Plan" in Phase 4. In auto mode, triggered only for `moderate`/`complex` recommendations.
 **Routing Logic**:
 ```
 complexity assessment (from Phase 4.3)
  ├─ simple/moderate (≤2 recommendations, clear changes)
  │   └─ Direct inline execution — no .task/*.json overhead
  └─ complex (≥3 recommendations, or high-priority with dependencies)
      └─ Route to EXECUTE.md — full pipeline (task generation → execution)
 ```
 ##### Step 5.1: Route by Complexity
 ```javascript
 const recs = conclusions.recommendations || []
-if (recs.length >= 3 || recs.some(r => r.priority === 'high')) {
+// Build plan checklist from all accepted/modified recommendations
-  // COMPLEX PATH → EXECUTE.md pipeline
+const planChecklist = recs
-  // Full specification: EXECUTE.md
+  .filter(r => r.review_status !== 'rejected')
-  // Flow: load all context → generate .task/*.json → pre-execution analysis → serial execution → finalize
+  .map((rec, index) => {
-} else {
+    const files = rec.evidence_refs
-  // SIMPLE PATH → direct inline execution (below)
+      ?.filter(ref => ref.includes(':'))
-}
+      .map(ref => ref.split(':')[0]) || []
 ```
-##### Step 5.2: Simple Path — Direct Inline Execution
+    return `### ${index + 1}. ${rec.action}
-
+- **Priority**: ${rec.priority}
 For simple/moderate recommendations, execute directly without .task/*.json ceremony:
 ```javascript
 // For each recommendation:
 recs.forEach((rec, index) => {
  // 1. Locate relevant files from evidence_refs or codebase search
  const files = rec.evidence_refs
    ?.filter(ref => ref.includes(':'))
    .map(ref => ref.split(':')[0]) || []
  // 2. Read each target file
  files.forEach(filePath => Read(filePath))
  // 3. Apply changes based on rec.action + rec.rationale
  //    Use Edit (preferred) for modifications, Write for new files
  // 4. Log to discussion.md — append execution summary
 })
 // Append execution summary to discussion.md
 appendToDiscussion(`
 ## Quick Execution Summary
 - **Recommendations executed**: ${recs.length}
 - **Completed**: ${getUtc8ISOString()}
 ${recs.map((rec, i) => `### ${i+1}. ${rec.action}
 - **Status**: completed/failed
 - **Rationale**: ${rec.rationale}
 - **Target files**: ${files.join(', ') || 'TBD'}
 - **Evidence**: ${rec.evidence_refs?.join(', ') || 'N/A'}
-`).join('\n')}
+- [ ] Ready for execution`
  }).join('\n\n')
 // Append plan checklist to discussion.md
 appendToDiscussion(`
 ## Plan Checklist
 > **This is a plan only — no code was modified.**
 > To execute, use: \`$csv-wave-pipeline "<requirement summary>"\`
 - **Recommendations**: ${recs.length}
 - **Generated**: ${getUtc8ISOString()}
 ${planChecklist}
 ---
 ### Next Step: Execute
 Run \`$csv-wave-pipeline\` to execute these recommendations as wave-based batch tasks:
 \`\`\`bash
 $csv-wave-pipeline "${topic}"
 \`\`\`
 `)
 ```
-**Simple path characteristics**:
+**Characteristics**:
- No `.task/*.json` generation
+- Plan checklist appended directly to `discussion.md`
- No `execution.md` / `execution-events.md`
+- **No code modifications** — plan output only
- Execution summary appended directly to `discussion.md`
+- Reminds user to use `$csv-wave-pipeline` for execution
 - Suitable for 1-2 clear, low-risk recommendations
 ##### Step 5.3: Complex Path — EXECUTE.md Pipeline
 For complex recommendations, follow the full specification in `EXECUTE.md`:
 1. **Load context sources**: Reuse in-memory artifacts or read from disk
 2. **Enrich recommendations**: Resolve target files, generate implementation steps, build convergence criteria
 3. **Generate `.task/*.json`**: Individual task files with full execution context
 4. **Pre-execution analysis**: Dependency validation, file conflicts, topological sort
 5. **User confirmation**: Present task list, allow adjustment
 6. **Serial execution**: Execute each task following generated implementation steps
 7. **Finalize**: Update task states, write execution artifacts
 **Full specification**: `EXECUTE.md`
 **Success Criteria**:
- Simple path: recommendations executed, summary in discussion.md
+- Plan checklist in discussion.md with all accepted recommendations
- Complex path: `.task/*.json` generated with quality validation, execution tracked via execution.md + execution-events.md
+- User reminded about `$csv-wave-pipeline` for execution
- Execution route chosen correctly based on complexity assessment
+- **No source code modified** — strictly plan output
 ## Output Structure
@@ -989,11 +952,11 @@ For complex recommendations, follow the full specification in `EXECUTE.md`:
 └── conclusions.json           # Phase 4: Final synthesis with recommendations
 ```
-> **Phase 5 complex path** adds `.task/`, `execution.md`, `execution-events.md` — see `EXECUTE.md` for structure.
+> **Phase 5** appends a plan checklist to `discussion.md`. No additional files are generated.
 | File | Phase | Description |
 |------|-------|-------------|
-| `discussion.md` | 1-4 | Session metadata → discussion timeline → conclusions. Simple execution summary appended here. |
+| `discussion.md` | 1-5 | Session metadata → discussion timeline → conclusions. Plan checklist appended here (simple path). |
 | `exploration-codebase.json` | 2 | Codebase context: relevant files, patterns, constraints |
 | `explorations/*.json` | 2 | Per-perspective exploration results (multi only) |
 | `explorations.json` | 2 | Single perspective aggregated findings |
@@ -1158,16 +1121,13 @@ Remaining questions or areas for investigation
 | User timeout in discussion | Save state, show resume command | Use `--continue` to resume |
 | Max rounds reached (5) | Force synthesis phase | Highlight remaining questions in conclusions |
 | Session folder conflict | Append timestamp suffix | Create unique folder and continue |
-| Quick execute: task fails | Record failure, ask user | Retry, skip, or abort (see EXECUTE.md) |
+| Plan generation: no recommendations | No plan to generate | Inform user, suggest lite-plan |
 | Quick execute: verification fails | Mark as unverified | Note in events, manual check |
 | Quick execute: no recommendations | Cannot generate .task/*.json | Inform user, suggest lite-plan |
 | Quick execute: simple recommendations | Complexity too low for .task/*.json | Direct inline execution (no task generation) |
 ## Best Practices
 ### Core Principles
-1. **Explicit user confirmation required before code modifications**: The analysis phase is strictly read-only. Any code changes (Phase 5 quick execute) require user approval.
+1. **No code modifications**: This skill is strictly read-only and plan-only. Phase 5 generates plan checklists in `discussion.md` but does NOT modify source code. Use `$csv-wave-pipeline` for execution.
 ### Before Starting Analysis
@@ -1204,10 +1164,11 @@ Remaining questions or areas for investigation
 - Building shared understanding before implementation
 - Want to document how understanding evolved
-**Use Quick Execute (Phase 5) when:**
+**Use Plan Generation (Phase 5) when:**
 - Analysis conclusions contain clear, actionable recommendations
- Simple: 1-2 clear changes → direct inline execution (no .task/ overhead)
+- Simple: 1-2 items → inline plan checklist in discussion.md
- Complex: 3+ recommendations with dependencies → EXECUTE.md pipeline (.task/*.json → serial execution)
+- Complex: 3+ recommendations → detailed plan checklist
 - **Then execute via**: `$csv-wave-pipeline` for wave-based batch execution
 **Consider alternatives when:**
 - Specific bug diagnosis needed → use `debug-with-file`
--- a/Show More
+++ b/Show More
Author	SHA1	Message	Date
catlog22	7ef47c3d47	feat: enhance workflow-tune skill with reference document extraction and intent matching improvements	2026-03-20 15:27:05 +08:00
catlog22	9c49a32cd9	feat: enhance workflow-tune skill with test requirements and analysis improvements	2026-03-20 15:11:32 +08:00
catlog22	d843112094	feat: enhance spec loading capabilities and add new categories - Added support for loading specs from new categories: debug, test, review, and validation. - Updated various agents and skills to include instructions for loading project context from the new spec categories. - Introduced new spec documents for test conventions, review standards, and validation rules to improve project guidelines. - Enhanced the frontend to support new watcher settings and display auto-watch status. - Improved the spec index builder to accommodate new categories and ensure proper loading of specifications.	2026-03-20 15:06:57 +08:00
catlog22	2b43b6be7b	fix: resolve all 92 frontend test failures across 13 test files - Fix locale index: remove wrong 'cli-manager' prefix from flattenMessages (production i18n bug) - Fix DialogStyleContext test: correct expected style for multi-select ('modal' not 'drawer') - Fix useNotifications test: correct FIFO toast ordering (store appends, not prepends) - Fix CcwToolsMcpCard test: add missing fetchRootDirectories mock - Fix TickerMarquee test: add IntlProvider, handle duplicate marquee elements - Fix useCommands test: add QueryClient/IntlProvider wrappers and workflowStore mock - Fix Header test: remove obsolete LanguageSwitcher tests, add DialogStyleContext mock - Fix QueuePage test: add non-empty mock data to prevent empty state rendering - Fix EndpointsPage test: handle multiple matching elements with getAllByText - Rewrite chartHooksIntegration test: mock fetch() instead of api.get(), add workflowStore - Rewrite DashboardIntegration test: match current HomePage widget structure - Fix A2UI components test: add DialogStyleContext mock - Fix AnalysisPage/DiscoveryPage tests: add missing hook mocks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 14:59:57 +08:00
catlog22	d5b6480528	feat: add workflow-tune skill for multi-step workflow pipeline optimization New skill that executes workflow step chains sequentially, inspects artifacts, analyzes quality via ccw cli resume chain (Gemini), and generates optimization reports. Supports 4 input formats: pipe-separated commands, comma-separated skills, JSON file, and natural language with semantic decomposition. Key features: - Orchestrator + 5-phase progressive loading architecture - Intent-to-tool mapping with ambiguity resolution for NL input - Command document generation with pre-execution confirmation loop - Per-step analysis + cross-step synthesis via resume chain - Auto-fix with user confirmation safety gate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 14:55:47 +08:00
catlog22	26a7371a20	fix: resolve team worker task discovery failures and clean up legacy role-specs - Remove owner name exact-match filter from team-worker.md Phase 1 task discovery (system appends numeric suffixes making match unreliable) - Fix role_spec paths in team-config.json for perf-opt, arch-opt, ux-improve (role-specs/<role>.md → roles/<role>/role.md) - Fix stale role-specs path in perf-opt monitor.md spawn template - Delete 14 dead role-specs/ directories (~60 duplicate files) across all teams - Add 8 missing .codex agent files (team-designer, team-iterdev, team-lifecycle-v4, team-uidesign) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 12:11:51 +08:00
catlog22	b6c763fd1b	chore: bump version to 7.2.10, remove analyze-with-file EXECUTE.md - analyze-with-file: plan-only output, no code modifications - roadmap-with-file: handoff to csv-wave-pipeline instead of team-planex Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 22:47:41 +08:00
catlog22	ab280afd8e	chore: update SKILL.md for analyze-with-file and roadmap-with-file to reflect changes in execution process and handoff to csv-wave-pipeline	2026-03-19 22:40:23 +08:00
catlog22	f1a30e1272	chore: update version to 7.2.8 and remove obsolete tool execution section from documentation	2026-03-19 20:48:41 +08:00
catlog22	cf321ea1ac	fix: update CodexLens MCP template with AST support defaults - Add [ast] extra to uvx install args (codexlens-search[mcp,ast]) - Add CODEXLENS_AST_CHUNKING to env defaults - Auto-inject AST_CHUNKING in buildMcpServerConfig Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 20:45:14 +08:00