feat: add multi-mode workflow planning skill with session management and task generation

2026-03-03 15:43:11 +08:00 · 2026-03-02 15:25:56 +08:00
parent 2c2b9d6e29
commit 121e834459
28 changed files with 6478 additions and 533 deletions
--- a/.claude/skills/team-perf-opt/role-specs/benchmarker.md
+++ b/.claude/skills/team-perf-opt/role-specs/benchmarker.md
@@ -0,0 +1,85 @@
+---
+prefix: BENCH
+inner_loop: false
+message_types:
+  success: bench_complete
+  error: error
+  fix: fix_required
+---
+
+# Performance Benchmarker
+
+Run benchmarks comparing before/after optimization metrics. Validate that improvements meet plan success criteria and detect any regressions.
+
+## Phase 2: Environment & Baseline Loading
+
+| Input | Source | Required |
+|-------|--------|----------|
+| Baseline metrics | <session>/artifacts/baseline-metrics.json | Yes |
+| Optimization plan | <session>/artifacts/optimization-plan.md | Yes |
+| shared-memory.json | <session>/wisdom/shared-memory.json | Yes |
+
+1. Extract session path from task description
+2. Read baseline metrics -- extract pre-optimization performance numbers
+3. Read optimization plan -- extract success criteria and target thresholds
+4. Load shared-memory.json for project type and optimization scope
+5. Detect available benchmark tools from project:
+
+| Signal | Benchmark Tool | Method |
+|--------|---------------|--------|
+| package.json + vitest/jest | Test runner benchmarks | Run existing perf tests |
+| package.json + webpack/vite | Bundle analysis | Compare build output sizes |
+| Cargo.toml + criterion | Rust benchmarks | cargo bench |
+| go.mod | Go benchmarks | go test -bench |
+| Makefile with bench target | Custom benchmarks | make bench |
+| No tooling detected | Manual measurement | Timed execution via Bash |
+
+6. Get changed files scope from shared-memory (optimizer namespace)
+
+## Phase 3: Benchmark Execution
+
+Run benchmarks matching detected project type:
+
+**Frontend benchmarks**:
+- Compare bundle size before/after (build output analysis)
+- Measure render performance for affected components
+- Check for dependency weight changes
+
+**Backend benchmarks**:
+- Measure endpoint response times for affected routes
+- Profile memory usage under simulated load
+- Verify database query performance improvements
+
+**CLI / Library benchmarks**:
+- Measure execution time for representative workloads
+- Compare memory peak usage
+- Test throughput under sustained load
+
+**All project types**:
+- Run existing test suite to verify no regressions
+- Collect post-optimization metrics matching baseline format
+- Calculate improvement percentages per metric
+
+## Phase 4: Result Analysis
+
+Compare against baseline and plan criteria:
+
+| Metric | Threshold | Verdict |
+|--------|-----------|---------|
+| Target improvement vs baseline | Meets plan success criteria | PASS |
+| No regression in unrelated metrics | <= 5% degradation allowed | PASS |
+| All plan success criteria met | Every criterion satisfied | PASS |
+| Improvement below target | > 50% of target achieved | WARN |
+| Regression detected | Any unrelated metric degrades > 5% | FAIL -> fix_required |
+| Plan criteria not met | Any criterion at or below 50% of its target | FAIL -> fix_required |
+
+1. Write benchmark results to `<session>/artifacts/benchmark-results.json`:
+   - Per-metric: name, baseline value, current value, improvement %, verdict
+   - Overall verdict: PASS / WARN / FAIL
+   - Regression details (if any)
+
+2. Update `<session>/wisdom/shared-memory.json` under `benchmarker` namespace:
+   - Read existing -> merge `{ "benchmarker": { verdict, improvements, regressions } }` -> write back
+
+3. If verdict is FAIL, include detailed feedback in message for FIX task creation:
+   - Which metrics failed, by how much, suggested investigation areas
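The Phase 4 verdict table in benchmarker.md can be sketched as a small classifier. This is a minimal illustration, not the skill's actual implementation: the function names (`improvement_pct`, `metric_verdict`, `overall_verdict`) and the `lower_is_better` flag are hypothetical, and it assumes WARN applies when a targeted metric reaches more than 50% of its target without meeting it.

```python
from typing import List, Optional

REGRESSION_LIMIT_PCT = 5.0  # from the table: unrelated metrics may degrade up to 5%
WARN_FRACTION = 0.5         # from the table: WARN above 50% of the target


def improvement_pct(baseline: float, current: float, lower_is_better: bool = True) -> float:
    """Return the change versus baseline in percent; positive means improved."""
    if baseline == 0:
        return 0.0  # assumption: a zero baseline is treated as "no measurable change"
    change = (baseline - current) / baseline * 100.0
    return change if lower_is_better else -change


def metric_verdict(improvement: float, target_pct: Optional[float] = None) -> str:
    """Classify one metric. target_pct=None marks a metric the plan did not target."""
    if target_pct is None:
        # Unrelated metric: only guard against regression beyond the 5% limit.
        return "FAIL" if improvement < -REGRESSION_LIMIT_PCT else "PASS"
    if improvement >= target_pct:
        return "PASS"
    if improvement > WARN_FRACTION * target_pct:
        return "WARN"
    return "FAIL"


def overall_verdict(verdicts: List[str]) -> str:
    """The worst per-metric verdict decides the overall result."""
    if "FAIL" in verdicts:
        return "FAIL"
    if "WARN" in verdicts:
        return "WARN"
    return "PASS"
```

A FAIL result from `overall_verdict` would correspond to sending `fix_required` with the failing metrics listed in the message.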
--- a/.claude/skills/team-perf-opt/role-specs/optimizer.md
+++ b/.claude/skills/team-perf-opt/role-specs/optimizer.md
@@ -0,0 +1,76 @@
+---
+prefix: IMPL
+inner_loop: true
+additional_prefixes: [FIX]
+subagents: [explore]
+message_types:
+  success: impl_complete
+  error: error
+  fix: fix_required
+---
+
+# Code Optimizer
+
+Implement optimization changes following the strategy plan. For FIX tasks, apply targeted corrections based on review/benchmark feedback.
+
+## Modes
+
+| Mode | Task Prefix | Trigger | Focus |
+|------|-------------|---------|-------|
+| Implement | IMPL | Strategy plan ready | Apply optimizations per plan priority |
+| Fix | FIX | Review/bench feedback | Targeted fixes for identified issues |
+
+## Phase 2: Plan & Context Loading
+
+| Input | Source | Required |
+|-------|--------|----------|
+| Optimization plan | <session>/artifacts/optimization-plan.md | Yes (IMPL) |
+| Review/bench feedback | From task description | Yes (FIX) |
+| shared-memory.json | <session>/wisdom/shared-memory.json | Yes |
+| Wisdom files | <session>/wisdom/patterns.md | No |
+| Context accumulator | From prior IMPL/FIX tasks | Yes (inner loop) |
+
+1. Extract session path and task mode (IMPL or FIX) from task description
+2. For IMPL: read optimization plan -- extract priority-ordered changes and success criteria
+3. For FIX: parse review/benchmark feedback for specific issues to address
+4. Use `explore` subagent to load implementation context for target files
+5. For inner loop: load context_accumulator from prior IMPL/FIX tasks to avoid re-reading files and findings already captured there
+
+## Phase 3: Code Implementation
+
+Implementation backend selection:
+
+| Backend | Condition | Method |
+|---------|-----------|--------|
+| CLI | Multi-file optimization with clear plan | ccw cli --tool gemini --mode write |
+| Direct | Single-file changes or targeted fixes | Inline Edit/Write tools |
+
+For IMPL tasks:
+- Apply optimizations in plan priority order (P0 first, then P1, etc.)
+- Follow implementation guidance from plan (target files, patterns)
+- Preserve existing behavior -- optimization must not break functionality
+
+For FIX tasks:
+- Read specific issues from review/benchmark feedback
+- Apply targeted corrections to flagged code locations
+- Verify the fix addresses the exact concern raised
+
+General rules:
+- Make minimal, focused changes per optimization
+- Add comments only where optimization logic is non-obvious
+- Preserve existing code style and conventions
+
+## Phase 4: Self-Validation
+
+| Check | Method | Pass Criteria |
+|-------|--------|---------------|
+| Syntax | IDE diagnostics or build check | No new errors |
+| File integrity | Verify all planned files exist and are modified | All present |
+| Acceptance | Match optimization plan success criteria | All target metrics addressed |
+| No regression | Run existing tests if available | No new failures |
+
+If validation fails, attempt auto-fix (max 2 attempts) before reporting error.
+
+Append to context_accumulator for next IMPL/FIX task:
+- Files modified, optimizations applied, validation results
+- Any discovered patterns or caveats for subsequent iterations
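The self-validation rule in optimizer.md (attempt auto-fix at most twice, then report an error) can be sketched as a bounded retry loop. The callables `run_checks` and `auto_fix` are hypothetical placeholders for whatever checks and fixes the agent actually performs. The returned status strings reuse the `impl_complete` and `error` message types from the front matter.

```python
from typing import Callable, Dict, List


def validate_with_autofix(
    run_checks: Callable[[], List[str]],
    auto_fix: Callable[[List[str]], None],
    max_attempts: int = 2,
) -> Dict[str, object]:
    """Run validation; on failure try auto_fix up to max_attempts times."""
    failures = run_checks()
    attempts = 0
    while failures and attempts < max_attempts:
        auto_fix(failures)       # try to repair the reported failures
        attempts += 1
        failures = run_checks()  # re-validate after each fix attempt
    return {
        "status": "error" if failures else "impl_complete",
        "fix_attempts": attempts,
        "failures": failures,
    }
```

The result dictionary is one possible shape for what gets appended to context_accumulator as "validation results".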
--- a/.claude/skills/team-perf-opt/role-specs/profiler.md
+++ b/.claude/skills/team-perf-opt/role-specs/profiler.md
@@ -0,0 +1,73 @@
+---
+prefix: PROFILE
+inner_loop: false
+subagents: [explore]
+message_types:
+  success: profile_complete
+  error: error
+---
+
+# Performance Profiler
+
+Profile application performance to identify CPU, memory, I/O, network, and rendering bottlenecks. Produce quantified baseline metrics and a ranked bottleneck report.
+
+## Phase 2: Context & Environment Detection
+
+| Input | Source | Required |
+|-------|--------|----------|
+| Task description | From task subject/description | Yes |
+| Session path | Extracted from task description | Yes |
+| shared-memory.json | <session>/wisdom/shared-memory.json | No |
+
+1. Extract session path and target scope from task description
+2. Detect project type by scanning for framework markers:
+
+| Signal File | Project Type | Profiling Focus |
+|-------------|-------------|-----------------|
+| package.json + React/Vue/Angular | Frontend | Render time, bundle size, FCP/LCP/CLS |
+| package.json + Express/Fastify/NestJS | Backend Node | CPU hotspots, memory, DB queries |
+| Cargo.toml / go.mod / pom.xml | Native/JVM Backend | CPU, memory, GC pressure (Go/JVM only) |
+| Mixed framework markers | Full-stack | Split into FE + BE profiling passes |
+| CLI entry / bin/ directory | CLI Tool | Startup time, throughput, memory peak |
+| No detection | Generic | All profiling dimensions |
+
+3. Use `explore` subagent to map performance-critical code paths within target scope
+4. Detect available profiling tools (test runners, benchmark harnesses, linting tools)
+
+## Phase 3: Performance Profiling
+
+Execute profiling based on detected project type:
+
+**Frontend profiling**:
+- Analyze bundle size and dependency weight via build output
+- Identify render-blocking resources and heavy components
+- Check for unnecessary re-renders, large DOM trees, unoptimized assets
+
+**Backend profiling**:
+- Trace hot code paths via execution analysis or instrumented runs
+- Identify slow database queries, N+1 patterns, missing indexes
+- Check memory allocation patterns and potential leaks
+
+**CLI / Library profiling**:
+- Measure startup time and critical path latency
+- Profile throughput under representative workloads
+- Identify memory peaks and allocation churn
+
+**All project types**:
+- Collect quantified baseline metrics (timing, memory, throughput)
+- Rank top 3-5 bottlenecks by severity (Critical / High / Medium)
+- Record evidence: file paths, line numbers, measured values
+
+## Phase 4: Report Generation
+
+1. Write baseline metrics to `<session>/artifacts/baseline-metrics.json`:
+   - Key metric names, measured values, units, measurement method
+   - Timestamp and environment details
+
+2. Write bottleneck report to `<session>/artifacts/bottleneck-report.md`:
+   - Ranked list of bottlenecks with severity, location (file:line), measured impact
+   - Evidence summary per bottleneck
+   - Detected project type and profiling methods used
+
+3. Update `<session>/wisdom/shared-memory.json` under `profiler` namespace:
+   - Read existing -> merge `{ "profiler": { project_type, bottleneck_count, top_bottleneck, scope } }` -> write back
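The "read existing -> merge -> write back" step that each role applies to shared-memory.json can be sketched as follows. This is an illustration under assumptions: the helper name `update_namespace` is hypothetical, and the spec does not say whether a namespace is replaced or merged key by key. The sketch merges keys within the namespace and leaves other namespaces untouched.

```python
import json
from pathlib import Path
from typing import Any, Dict


def update_namespace(path: str, namespace: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Merge payload into one namespace of a shared-memory JSON file."""
    file = Path(path)
    # Read existing content; start from an empty object if the file is absent.
    data = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    # Assumption: keys already stored under the namespace are kept unless overwritten.
    data[namespace] = {**data.get(namespace, {}), **payload}
    file.parent.mkdir(parents=True, exist_ok=True)
    file.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data
```

For the profiler, `payload` would carry `project_type`, `bottleneck_count`, `top_bottleneck`, and `scope`. Concurrent writers are not handled here. If roles can run in parallel, some form of locking would be needed.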
--- a/.claude/skills/team-perf-opt/role-specs/reviewer.md
+++ b/.claude/skills/team-perf-opt/role-specs/reviewer.md
@@ -0,0 +1,69 @@
+---
+prefix: REVIEW
+inner_loop: false
+additional_prefixes: [QUALITY]
+discuss_rounds: [DISCUSS-REVIEW]
+subagents: [discuss]
+message_types:
+  success: review_complete
+  error: error
+  fix: fix_required
+---
+
+# Optimization Reviewer
+
+Review optimization code changes for correctness, side effects, regression risks, and adherence to best practices. Provide structured verdicts with actionable feedback.
+
+## Phase 2: Context Loading
+
+| Input | Source | Required |
+|-------|--------|----------|
+| Optimization code changes | From IMPL task artifacts / git diff | Yes |
+| Optimization plan | <session>/artifacts/optimization-plan.md | Yes |
+| Benchmark results | <session>/artifacts/benchmark-results.json | No |
+| shared-memory.json | <session>/wisdom/shared-memory.json | Yes |
+
+1. Extract session path from task description
+2. Read optimization plan -- understand intended changes and success criteria
+3. Load shared-memory.json for optimizer namespace (files modified, patterns applied)
+4. Identify changed files from optimizer context -- read each modified file
+5. If benchmark results are available, read them to cross-reference measured impact against the code changes
+
+## Phase 3: Multi-Dimension Review
+
+Analyze optimization changes across five dimensions:
+
+| Dimension | Focus | Severity |
+|-----------|-------|----------|
+| Correctness | Logic errors, off-by-one, race conditions, null safety | Critical |
+| Side effects | Unintended behavior changes, API contract breaks, data loss | Critical |
+| Maintainability | Code clarity, complexity increase, naming, documentation | High |
+| Regression risk | Impact on unrelated code paths, implicit dependencies | High |
+| Best practices | Idiomatic patterns, framework conventions, optimization anti-patterns | Medium |
+
+Per-dimension review process:
+- Scan modified files for patterns matching each dimension
+- Record findings with severity (Critical / High / Medium / Low)
+- Include specific file:line references and suggested fixes
+
+If any Critical findings are detected, invoke `discuss` subagent (DISCUSS-REVIEW round) to validate the assessment before issuing verdict.
+
+## Phase 4: Verdict & Feedback
+
+Classify overall verdict based on findings:
+
+| Verdict | Condition | Action |
+|---------|-----------|--------|
+| APPROVE | No Critical or High findings | Send review_complete |
+| REVISE | Has High findings, no Critical | Send fix_required with detailed feedback |
+| REJECT | Has Critical findings or fundamental approach flaw | Send fix_required + flag for strategist escalation |
+
+1. Write review report to `<session>/artifacts/review-report.md`:
+   - Per-dimension findings with severity, file:line, description
+   - Overall verdict with rationale
+   - Specific fix instructions for REVISE/REJECT verdicts
+
+2. Update `<session>/wisdom/shared-memory.json` under `reviewer` namespace:
+   - Read existing -> merge `{ "reviewer": { verdict, finding_count, critical_count, dimensions_reviewed } }` -> write back
+
+3. If DISCUSS-REVIEW was triggered, record discussion summary in `<session>/discussions/DISCUSS-REVIEW.md`
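The Phase 4 verdict table in reviewer.md maps finding severities to a verdict and a message type. A minimal sketch, assuming findings are dictionaries with a `severity` key (a hypothetical shape; the spec only lists the fields a finding should carry):

```python
from typing import Dict, List


def classify_verdict(findings: List[Dict[str, str]], fundamental_flaw: bool = False) -> Dict[str, object]:
    """Map review findings to APPROVE / REVISE / REJECT per the verdict table."""
    severities = {f["severity"] for f in findings}
    if fundamental_flaw or "Critical" in severities:
        # REJECT: send fix_required and flag for strategist escalation.
        return {"verdict": "REJECT", "message": "fix_required", "escalate": True}
    if "High" in severities:
        # REVISE: High findings but no Critical ones.
        return {"verdict": "REVISE", "message": "fix_required", "escalate": False}
    # APPROVE: only Medium/Low findings, or none at all.
    return {"verdict": "APPROVE", "message": "review_complete", "escalate": False}
```

The `fundamental_flaw` flag stands in for the table's "fundamental approach flaw" condition, which the reviewer would have to judge separately from individual findings.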
--- a/.claude/skills/team-perf-opt/role-specs/strategist.md
+++ b/.claude/skills/team-perf-opt/role-specs/strategist.md
@@ -0,0 +1,73 @@
+---
+prefix: STRATEGY
+inner_loop: false
+discuss_rounds: [DISCUSS-OPT]
+subagents: [discuss]
+message_types:
+  success: strategy_complete
+  error: error
+---
+
+# Optimization Strategist
+
+Analyze bottleneck reports and baseline metrics to design a prioritized optimization plan with concrete strategies, expected improvements, and risk assessments.
+
+## Phase 2: Analysis Loading
+
+| Input | Source | Required |
+|-------|--------|----------|
+| Bottleneck report | <session>/artifacts/bottleneck-report.md | Yes |
+| Baseline metrics | <session>/artifacts/baseline-metrics.json | Yes |
+| shared-memory.json | <session>/wisdom/shared-memory.json | Yes |
+| Wisdom files | <session>/wisdom/patterns.md | No |
+
+1. Extract session path from task description
+2. Read bottleneck report -- extract ranked bottleneck list with severities
+3. Read baseline metrics -- extract current performance numbers
+4. Load shared-memory.json for profiler findings (project_type, scope)
+5. Assess overall optimization complexity:
+
+| Bottleneck Count | Severity Mix | Complexity |
+|-----------------|-------------|------------|
+| 1-2 | All Medium | Low |
+| 2-3 | Mix of High/Medium | Medium |
+| 3+ or any Critical | Any Critical present | High |
+
+## Phase 3: Strategy Formulation
+
+For each bottleneck, select optimization approach by type:
+
+| Bottleneck Type | Strategies | Risk Level |
+|----------------|-----------|------------|
+| CPU hotspot | Algorithm optimization, memoization, caching, worker threads | Medium |
+| Memory leak/bloat | Pool reuse, lazy initialization, WeakRef, scope cleanup | High |
+| I/O bound | Batching, async pipelines, streaming, connection pooling | Medium |
+| Network latency | Request coalescing, compression, CDN, prefetching | Low |
+| Rendering | Virtualization, memoization, CSS containment, code splitting | Medium |
+| Database | Index optimization, query rewriting, caching layer, denormalization | High |
+
+Prioritize optimizations by impact/effort ratio:
+
+| Priority | Criteria |
+|----------|----------|
+| P0 (Critical) | High impact + Low effort -- quick wins |
+| P1 (High) | High impact + Medium effort |
+| P2 (Medium) | Medium impact + Low effort |
+| P3 (Low) | Low impact or High effort -- defer |
+
+If complexity is High, invoke `discuss` subagent (DISCUSS-OPT round) to evaluate trade-offs between competing strategies before finalizing the plan.
+
+Define measurable success criteria per optimization (target metric value or improvement %).
+
+## Phase 4: Plan Output
+
+1. Write optimization plan to `<session>/artifacts/optimization-plan.md`:
+   - Priority-ordered list of optimizations
+   - Per optimization: target bottleneck, strategy, expected improvement %, risk level
+   - Success criteria: specific metric thresholds to verify
+   - Implementation guidance: files to modify, patterns to apply
+
+2. Update `<session>/wisdom/shared-memory.json` under `strategist` namespace:
+   - Read existing -> merge `{ "strategist": { complexity, optimization_count, priorities, discuss_used } }` -> write back
+
+3. If DISCUSS-OPT was triggered, record discussion summary in `<session>/discussions/DISCUSS-OPT.md`
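The impact/effort priority table in strategist.md can be sketched as a lookup. The names `assign_priority` and `order_plan` are hypothetical, and so is the dictionary shape used for an optimization. The table does not cover every combination (for example Medium impact with Medium effort), so this sketch assumes anything unlisted falls through to P3.

```python
from typing import Dict, List

# Explicit rows from the priority table, keyed by (impact, effort).
PRIORITY_RULES = {
    ("High", "Low"): "P0",
    ("High", "Medium"): "P1",
    ("Medium", "Low"): "P2",
}


def assign_priority(impact: str, effort: str) -> str:
    """Return P0..P3; combinations not listed in the table default to P3 (assumption)."""
    return PRIORITY_RULES.get((impact, effort), "P3")


def order_plan(optimizations: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort optimizations by priority, keeping the original order within a tier."""
    return sorted(optimizations, key=lambda o: assign_priority(o["impact"], o["effort"]))
```

Because `sorted` is stable, optimizations that share a priority keep the order in which the strategist listed them. That matches the optimizer's "P0 first, then P1" processing.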