Add quality gates and tuning strategies documentation

- Introduced quality gates specification for skill tuning, detailing quality dimensions, scoring, and gate definitions.
- Added comprehensive tuning strategies for various issue categories, including context explosion, long-tail forgetting, data flow, and agent coordination.
- Created templates for diagnosis reports and fix proposals to standardize documentation and reporting processes.
catlog22
2026-01-14 12:59:13 +08:00
parent 6b4b9b0775
commit 633d918da1
20 changed files with 5755 additions and 0 deletions

View File

@@ -0,0 +1,342 @@
---
name: skill-tuning
description: Universal skill diagnosis and optimization tool. Detect and fix skill execution issues including context explosion, long-tail forgetting, data flow disruption, and agent coordination failures. Supports Gemini CLI for deep analysis. Triggers on "skill tuning", "tune skill", "skill diagnosis", "optimize skill", "skill debug".
allowed-tools: Task, AskUserQuestion, Read, Write, Bash, Glob, Grep, mcp__ace-tool__search_context
---
# Skill Tuning
Universal skill diagnosis and optimization tool that identifies and resolves skill execution problems through iterative multi-agent analysis.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Skill Tuning Architecture (Autonomous Mode + Gemini CLI) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ⚠️ Phase 0: Specification Study → read specs + skill structure (mandatory)  │
│ ↓ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │                  Orchestrator (state-driven decisions)                  │ │
│ │ read state → pick next action → execute → update state → loop until done│ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────┬───────────┼───────────┬────────────┬────────────┐ │
│ ↓ ↓ ↓ ↓ ↓ ↓ │
│ ┌──────┐ ┌─────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌─────────┐ │
│ │ Init │ │Diagnose │ │Diagnose│ │Diagnose│ │Diagnose│ │ Gemini │ │
│ │ │ │ Context │ │ Memory │ │DataFlow│ │ Agent │ │Analysis │ │
│ └──────┘ └─────────┘ └────────┘ └────────┘ └────────┘ └─────────┘ │
│ │ │ │ │ │ │ │
│ └───────────┴───────────┴───────────┴────────────┴────────────┘ │
│ ↓ │
│ ┌──────────────────┐ │
│ │ Apply Fixes + │ │
│ │ Verify Results │ │
│ └──────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Gemini CLI Integration │ │
│ │ Invoke the Gemini CLI on demand for deep analysis driven by user needs: │ │
│ │ • Complex problem analysis (prompt engineering, architecture review)    │ │
│ │ • Code pattern recognition (pattern matching, anti-pattern detection)   │ │
│ │ • Fix strategy generation (fix generation, refactoring suggestions)     │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Problem Domain
Based on comprehensive analysis, skill-tuning addresses **core skill issues** and **general optimization areas**:
### Core Skill Issues (auto-detected)
| Priority | Problem | Root Cause | Solution Strategy |
|----------|---------|------------|-------------------|
| **P0** | Data Flow Disruption | Scattered state, inconsistent formats | Centralized session store, transactional updates |
| **P1** | Agent Coordination | Fragile call chains, merge complexity | Dedicated orchestrator, enforced data contracts |
| **P2** | Context Explosion | Token accumulation, multi-turn bloat | Context summarization, sliding window, structured state |
| **P3** | Long-tail Forgetting | Early constraint loss | Constraint injection, checkpointing, goal alignment |
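The P2/P3 strategies above reduce to small helpers. Below is a minimal sketch of a sliding-window history compactor and per-phase constraint injection; `summarize` is a hypothetical helper, and `original_constraints` mirrors the state field suggested by the memory diagnosis later in this commit.
```javascript
// Sketch only: sliding window (P2) and constraint injection (P3).
// `summarize` is a hypothetical helper, not defined by this skill.
function compactHistory(history, { keepLast = 5, maxChars = 2000 } = {}) {
  if (history.length <= keepLast) return history;
  const summary = summarize(history.slice(0, -keepLast)).slice(0, maxChars);
  // One summary entry replaces all older turns; recent turns stay verbatim.
  return [
    { role: 'system', content: `Summary of earlier turns: ${summary}` },
    ...history.slice(-keepLast)
  ];
}

function injectConstraints(prompt, state) {
  // Re-state the original constraints in every phase prompt (P3 mitigation).
  const constraints = (state.original_constraints || []).map(c => `- ${c}`).join('\n');
  return `${prompt}\n\n[CONSTRAINTS]\n${constraints}`;
}
```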
### General Optimization Areas (on-demand analysis via Gemini CLI)
| Category | Issues | Gemini Analysis Scope |
|----------|--------|----------------------|
| **Prompt Engineering** | Ambiguous instructions, inconsistent output formats, hallucination risk | Prompt optimization, structured output design |
| **Architecture** | Poor phase decomposition, tangled dependencies, limited extensibility | Architecture review, modularization advice |
| **Performance** | Slow execution, high token consumption, redundant computation | Performance analysis, caching strategies |
| **Error Handling** | Poor error recovery, no degradation strategy, insufficient logging | Fault-tolerance design, observability improvements |
| **Output Quality** | Unstable output, format drift, quality fluctuation | Quality gating, validation mechanisms |
| **User Experience** | Clunky interaction, unclear feedback, invisible progress | UX optimization, progress tracking |
## Key Design Principles
1. **Problem-First Diagnosis**: Systematic identification before any fix attempt
2. **Data-Driven Analysis**: Record execution traces, token counts, state snapshots
3. **Iterative Refinement**: Multiple tuning rounds until quality gates pass
4. **Non-Destructive**: All changes are reversible with backup checkpoints
5. **Agent Coordination**: Use specialized sub-agents for each diagnosis type
6. **Gemini CLI On-Demand**: Deep analysis via CLI for complex/custom issues
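Principles 3 and 4 reduce to a simple loop condition. A minimal sketch, assuming the state fields defined in the State Schema below:
```javascript
// Iterate until the quality gate passes or a budget is exhausted.
function shouldContinueTuning(state) {
  return state.quality_gate !== 'pass' &&
         state.iteration_count < state.max_iterations &&
         state.error_count < state.max_errors;
}
```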
---
## Gemini CLI Integration
Dynamically invoke the Gemini CLI for deep analysis based on user needs.
### Trigger Conditions
| Condition | Action | CLI Mode |
|-----------|--------|----------|
| User describes a complex problem | Invoke Gemini to analyze the root cause | `analysis` |
| Auto-diagnosis finds a critical issue | Request deep analysis for confirmation | `analysis` |
| User requests an architecture review | Run architecture analysis | `analysis` |
| Fix code needs to be generated | Generate a fix proposal | `write` |
| Standard strategies do not apply | Request a customized strategy | `analysis` |
### CLI Command Template
```bash
ccw cli -p "
PURPOSE: ${purpose}
TASK: ${task_steps}
MODE: ${mode}
CONTEXT: @${skill_path}/**/*
EXPECTED: ${expected_output}
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/${mode}-protocol.md) | ${constraints}
" --tool gemini --mode ${mode} --cd ${skill_path}
```
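For illustration, the template can be filled and executed from skill code as follows. This is a hedged sketch: it assumes `Bash()` returns the command's stdout and that the CLI prints the EXPECTED payload as JSON; neither is guaranteed by this document.
```javascript
// Hypothetical wrapper around the CLI template above (analysis mode).
function runGeminiAnalysis({ purpose, taskSteps, skillPath, expected, constraints }) {
  const prompt = [
    `PURPOSE: ${purpose}`,
    `TASK: ${taskSteps}`,
    `MODE: analysis`,
    `CONTEXT: @${skillPath}/**/*`,
    `EXPECTED: ${expected}`,
    `RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | ${constraints}`
  ].join('\n');
  const stdout = Bash(`ccw cli -p "${prompt}" --tool gemini --mode analysis --cd ${skillPath}`);
  try {
    return JSON.parse(stdout);  // the analysis calls above declare JSON EXPECTED
  } catch {
    return { raw: stdout };     // fall back to raw text if output is not JSON
  }
}
```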
### Analysis Types
#### 1. Problem Root Cause Analysis
```bash
ccw cli -p "
PURPOSE: Identify root cause of skill execution issue: ${user_issue_description}
TASK: • Analyze skill structure and phase flow • Identify anti-patterns • Trace data flow issues
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON with { root_causes: [], patterns_found: [], recommendations: [] }
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Focus on execution flow
" --tool gemini --mode analysis
```
#### 2. Architecture Review
```bash
ccw cli -p "
PURPOSE: Review skill architecture for scalability and maintainability
TASK: • Evaluate phase decomposition • Check state management patterns • Assess agent coordination
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: Architecture assessment with improvement recommendations
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Focus on modularity
" --tool gemini --mode analysis
```
#### 3. Fix Strategy Generation
```bash
ccw cli -p "
PURPOSE: Generate fix strategy for issue: ${issue_id} - ${issue_description}
TASK: • Analyze issue context • Design fix approach • Generate implementation plan
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON with { strategy: string, changes: [], verification_steps: [] }
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Minimal invasive changes
" --tool gemini --mode analysis
```
---
## Mandatory Prerequisites
> **CRITICAL**: Read these documents before executing any action.
### Core Specs (Required)
| Document | Purpose | Priority |
|----------|---------|----------|
| [specs/problem-taxonomy.md](specs/problem-taxonomy.md) | Problem classification and detection patterns | **P0** |
| [specs/tuning-strategies.md](specs/tuning-strategies.md) | Fix strategies for each problem type | **P0** |
| [specs/quality-gates.md](specs/quality-gates.md) | Quality thresholds and verification criteria | P1 |
### Templates (Reference)
| Document | Purpose |
|----------|---------|
| [templates/diagnosis-report.md](templates/diagnosis-report.md) | Diagnosis report structure |
| [templates/fix-proposal.md](templates/fix-proposal.md) | Fix proposal format |
---
## Execution Flow
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Phase 0: Specification Study (mandatory prerequisite - do not skip)         │
│   → Read: specs/problem-taxonomy.md (problem classification)                │
│   → Read: specs/tuning-strategies.md (tuning strategies)                    │
│   → Read: Target skill's SKILL.md and phases/*.md                           │
│   → Output: specs internalized, target skill structure understood           │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-init: Initialize Tuning Session │
│ → Create work directory: .workflow/.scratchpad/skill-tuning-{timestamp} │
│ → Initialize state.json with target skill info │
│ → Create backup of target skill files │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-diagnose-context: Context Explosion Analysis │
│ → Scan for token accumulation patterns │
│ → Detect multi-turn dialogue growth │
│ → Output: context-diagnosis.json │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-diagnose-memory: Long-tail Forgetting Analysis │
│ → Trace constraint propagation through phases │
│ → Detect early instruction loss │
│ → Output: memory-diagnosis.json │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-diagnose-dataflow: Data Flow Analysis │
│ → Map state transitions between phases │
│ → Detect format inconsistencies │
│ → Output: dataflow-diagnosis.json │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-diagnose-agent: Agent Coordination Analysis │
│ → Analyze agent call patterns │
│ → Detect result passing issues │
│ → Output: agent-diagnosis.json │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-generate-report: Consolidated Report │
│ → Merge all diagnosis results │
│ → Prioritize issues by severity │
│ → Output: tuning-report.md │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-propose-fixes: Fix Proposal Generation │
│ → Generate fix strategies for each issue │
│ → Create implementation plan │
│ → Output: fix-proposals.json │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-apply-fix: Apply Selected Fix │
│ → User selects fix to apply │
│ → Execute fix with backup │
│ → Update state with fix result │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-verify: Verification │
│ → Re-run affected diagnosis │
│ → Check quality gates │
│ → Update iteration count │
├─────────────────────────────────────────────────────────────────────────────┤
│ action-complete: Finalization │
│ → Generate final report │
│ → Cleanup temporary files │
│ → Output: tuning-summary.md │
└─────────────────────────────────────────────────────────────────────────────┘
```
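The loop driving this flow can be sketched as follows. `selectNextAction` and the `actions` map stand in for the real decision logic in phases/orchestrator.md, and a production version would deep-merge dotted state keys such as `'diagnosis.agent'` rather than `Object.assign` them.
```javascript
// Illustrative orchestrator loop; see phases/orchestrator.md for the real logic.
async function orchestrate(state, workDir, actions) {
  while (state.status === 'running') {
    const next = selectNextAction(state);       // hypothetical decision function
    if (!next) break;                           // nothing left to do
    state.current_action = next;
    const result = await actions[next](state, workDir);
    Object.assign(state, result.stateUpdates);  // NOTE: dotted keys need a deep merge
    state.completed_actions.push(next);
    state.current_action = null;
    Write(`${workDir}/state.json`, JSON.stringify(state, null, 2));
  }
}
```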
## Directory Setup
```javascript
const timestamp = new Date().toISOString().slice(0,19).replace(/[-:T]/g, '');
const workDir = `.workflow/.scratchpad/skill-tuning-${timestamp}`;
Bash(`mkdir -p "${workDir}/diagnosis"`);
Bash(`mkdir -p "${workDir}/backups"`);
Bash(`mkdir -p "${workDir}/fixes"`);
```
## Output Structure
```
.workflow/.scratchpad/skill-tuning-{timestamp}/
├── state.json # Session state (orchestrator-managed)
├── diagnosis/
│ ├── context-diagnosis.json # Context explosion analysis
│ ├── memory-diagnosis.json # Long-tail forgetting analysis
│ ├── dataflow-diagnosis.json # Data flow analysis
│ └── agent-diagnosis.json # Agent coordination analysis
├── backups/
│ └── {skill-name}-backup/ # Original skill files backup
├── fixes/
│ ├── fix-proposals.json # Proposed fixes
│ └── applied-fixes.json # Applied fix history
├── tuning-report.md # Consolidated diagnosis report
└── tuning-summary.md # Final summary
```
## State Schema
```typescript
interface TuningState {
  status: 'pending' | 'running' | 'completed' | 'failed';
  target_skill: {
    name: string;
    path: string;
    execution_mode: 'sequential' | 'autonomous';
  };
  user_issue_description: string;
  focus_areas: string[];                 // empty = run all diagnoses
  diagnosis: {
    context: DiagnosisResult | null;
    memory: DiagnosisResult | null;
    dataflow: DiagnosisResult | null;
    agent: DiagnosisResult | null;
  };
  issues: Issue[];
  issues_by_severity: { critical: number; high: number; medium: number; low: number };
  proposed_fixes: Fix[];
  pending_fixes: string[];               // fix ids selected but not yet applied
  applied_fixes: AppliedFix[];
  backup_dir: string;
  iteration_count: number;
  max_iterations: number;
  quality_score: number;
  quality_gate: 'pass' | 'fail' | 'pending';
  started_at: string;
  completed_at: string | null;
  completed_actions: string[];
  current_action: string | null;
  errors: Error[];
  error_count: number;
  max_errors: number;
}
interface DiagnosisResult {
status: 'completed' | 'skipped';
issues_found: number;
severity: 'critical' | 'high' | 'medium' | 'low' | 'none';
details: any;
}
interface Issue {
  id: string;
  type: 'context_explosion' | 'memory_loss' | 'dataflow_break' | 'agent_failure';
  severity: 'critical' | 'high' | 'medium' | 'low';
  location: { file: string; phase?: string };
  description: string;
  evidence: string[];
  root_cause: string;
  impact: string;
  suggested_fix: string;
}
interface Fix {
  id: string;
  issue_id: string;
  strategy: string;
  description: string;
  changes: FileChange[];
  risk: 'low' | 'medium' | 'high';
}
interface FileChange {
  file: string;                          // relative path, may contain wildcards
  action: 'modify' | 'create';
  diff?: string;                         // for 'modify'
  new_content?: string;                  // for 'create'
}
interface AppliedFix {
  fix_id: string;
  applied_at: string;
  success: boolean;
  backup_path: string;
  verification_result: 'pending' | 'pass' | 'fail' | 'rolled_back';
  rollback_available: boolean;
  changes_made: { file: string; action: string; backup: string | null }[];
}
```
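For reference, the initial state written by action-init might look like the following; the concrete values (names, paths, budgets) are illustrative, not prescribed.
```javascript
// Illustrative initial state.json conforming to TuningState (example values).
const initialState = {
  status: 'running',
  target_skill: { name: 'my-skill', path: 'skills/my-skill', execution_mode: 'autonomous' },
  user_issue_description: 'Later phases forget early constraints',
  focus_areas: [],                       // empty = run all diagnoses
  diagnosis: { context: null, memory: null, dataflow: null, agent: null },
  issues: [],
  issues_by_severity: { critical: 0, high: 0, medium: 0, low: 0 },
  proposed_fixes: [],
  pending_fixes: [],
  applied_fixes: [],
  backup_dir: `${workDir}/backups`,
  iteration_count: 0,
  max_iterations: 3,                     // example budget
  quality_score: 0,
  quality_gate: 'pending',
  started_at: new Date().toISOString(),
  completed_at: null,
  completed_actions: [],
  current_action: null,
  errors: [],
  error_count: 0,
  max_errors: 5                          // example budget
};
Write(`${workDir}/state.json`, JSON.stringify(initialState, null, 2));
```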
## Reference Documents
| Document | Purpose |
|----------|---------|
| [phases/orchestrator.md](phases/orchestrator.md) | Orchestrator decision logic |
| [phases/state-schema.md](phases/state-schema.md) | State structure definition |
| [phases/actions/action-init.md](phases/actions/action-init.md) | Initialize tuning session |
| [phases/actions/action-diagnose-context.md](phases/actions/action-diagnose-context.md) | Context explosion diagnosis |
| [phases/actions/action-diagnose-memory.md](phases/actions/action-diagnose-memory.md) | Long-tail forgetting diagnosis |
| [phases/actions/action-diagnose-dataflow.md](phases/actions/action-diagnose-dataflow.md) | Data flow diagnosis |
| [phases/actions/action-diagnose-agent.md](phases/actions/action-diagnose-agent.md) | Agent coordination diagnosis |
| [phases/actions/action-generate-report.md](phases/actions/action-generate-report.md) | Report generation |
| [phases/actions/action-propose-fixes.md](phases/actions/action-propose-fixes.md) | Fix proposal |
| [phases/actions/action-apply-fix.md](phases/actions/action-apply-fix.md) | Fix application |
| [phases/actions/action-verify.md](phases/actions/action-verify.md) | Verification |
| [phases/actions/action-complete.md](phases/actions/action-complete.md) | Finalization |
| [phases/actions/action-abort.md](phases/actions/action-abort.md) | Abort on unrecoverable errors |
| [specs/problem-taxonomy.md](specs/problem-taxonomy.md) | Problem classification |
| [specs/tuning-strategies.md](specs/tuning-strategies.md) | Fix strategies |
| [specs/quality-gates.md](specs/quality-gates.md) | Quality criteria |

View File

@@ -0,0 +1,164 @@
# Action: Abort
Abort the tuning session due to unrecoverable errors.
## Purpose
- Safely terminate on critical failures
- Preserve diagnostic information for debugging
- Ensure backup remains available
- Notify user of failure reason
## Preconditions
- [ ] state.error_count >= state.max_errors
- [ ] OR critical failure detected
## Execution
```javascript
async function execute(state, workDir) {
console.log('Aborting skill tuning session...');
const errors = state.errors;
const targetSkill = state.target_skill;
// Generate abort report
const abortReport = `# Skill Tuning Aborted
**Target Skill**: ${targetSkill?.name || 'Unknown'}
**Aborted At**: ${new Date().toISOString()}
**Reason**: Too many errors or critical failure
---
## Error Log
${errors.length === 0 ? '_No errors recorded_' :
errors.map((err, i) => `
### Error ${i + 1}
- **Action**: ${err.action}
- **Message**: ${err.message}
- **Time**: ${err.timestamp}
- **Recoverable**: ${err.recoverable ? 'Yes' : 'No'}
`).join('\n')}
---
## Session State at Abort
- **Status**: ${state.status}
- **Iteration Count**: ${state.iteration_count}
- **Completed Actions**: ${state.completed_actions.length}
- **Issues Found**: ${state.issues.length}
- **Fixes Applied**: ${state.applied_fixes.length}
---
## Recovery Options
### Option 1: Restore Original Skill
If any changes were made, restore from backup:
\`\`\`bash
cp -r "${state.backup_dir}/${targetSkill?.name || 'backup'}-backup"/* "${targetSkill?.path || 'target'}/"
\`\`\`
### Option 2: Resume from Last State
The session state is preserved at:
\`${workDir}/state.json\`
To resume:
1. Fix the underlying issue
2. Reset error_count in state.json
3. Re-run skill-tuning with --resume flag
### Option 3: Manual Investigation
Review the following files:
- Diagnosis results: \`${workDir}/diagnosis/*.json\`
- Error log: \`${workDir}/errors.json\`
- State snapshot: \`${workDir}/state.json\`
---
## Diagnostic Information
### Last Successful Action
${state.completed_actions.length > 0 ? state.completed_actions[state.completed_actions.length - 1] : 'None'}
### Current Action When Failed
${state.current_action || 'Unknown'}
### Partial Diagnosis Results
- Context: ${state.diagnosis.context ? 'Completed' : 'Not completed'}
- Memory: ${state.diagnosis.memory ? 'Completed' : 'Not completed'}
- Data Flow: ${state.diagnosis.dataflow ? 'Completed' : 'Not completed'}
- Agent: ${state.diagnosis.agent ? 'Completed' : 'Not completed'}
---
*Skill tuning aborted - please review errors and retry*
`;
// Write abort report
Write(`${workDir}/abort-report.md`, abortReport);
// Save error log
Write(`${workDir}/errors.json`, JSON.stringify(errors, null, 2));
  // Ask the user whether to restore the original skill
  try {
    const response = await AskUserQuestion({
      questions: [{
        question: `Skill tuning aborted due to ${errors.length} errors. Would you like to restore the original skill?`,
        header: 'Restore',
        multiSelect: false,
        options: [
          { label: 'Yes, restore', description: 'Restore original skill from backup' },
          { label: 'No, keep changes', description: 'Keep any partial changes made' }
        ]
      }]
    });
    if (response['Restore'] === 'Yes, restore' && state.backup_dir && targetSkill?.path) {
      // Restore from backup
      Bash(`cp -r "${state.backup_dir}/${targetSkill.name}-backup"/* "${targetSkill.path}/"`);
      console.log('Original skill restored from backup.');
    }
  } catch {
    // User cancelled: keep partial changes, do not restore
  }
return {
stateUpdates: {
status: 'failed',
completed_at: new Date().toISOString()
},
outputFiles: [`${workDir}/abort-report.md`, `${workDir}/errors.json`],
summary: `Tuning aborted: ${errors.length} errors. Check abort-report.md for details.`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
status: 'failed',
completed_at: '<timestamp>'
}
};
```
## Output
- **File**: `abort-report.md`
- **Location**: `${workDir}/abort-report.md`
## Error Handling
This action should not fail - it's the final error handler.
## Next Actions
- None (terminal state)

View File

@@ -0,0 +1,206 @@
# Action: Apply Fix
Apply a selected fix to the target skill with backup and rollback capability.
## Purpose
- Apply fix changes to target skill files
- Create backup before modifications
- Track applied fixes for verification
- Support rollback if needed
## Preconditions
- [ ] state.status === 'running'
- [ ] state.pending_fixes.length > 0
- [ ] state.proposed_fixes contains the fix to apply
## Execution
```javascript
async function execute(state, workDir) {
const pendingFixes = state.pending_fixes;
const proposedFixes = state.proposed_fixes;
const targetPath = state.target_skill.path;
const backupDir = state.backup_dir;
if (pendingFixes.length === 0) {
return {
stateUpdates: {},
outputFiles: [],
summary: 'No pending fixes to apply'
};
}
// Get next fix to apply
const fixId = pendingFixes[0];
const fix = proposedFixes.find(f => f.id === fixId);
if (!fix) {
return {
stateUpdates: {
pending_fixes: pendingFixes.slice(1),
errors: [...state.errors, {
action: 'action-apply-fix',
message: `Fix ${fixId} not found in proposals`,
timestamp: new Date().toISOString(),
recoverable: true
}]
},
outputFiles: [],
summary: `Fix ${fixId} not found, skipping`
};
}
console.log(`Applying fix ${fix.id}: ${fix.description}`);
// Create fix-specific backup
const fixBackupDir = `${backupDir}/before-${fix.id}`;
Bash(`mkdir -p "${fixBackupDir}"`);
const appliedChanges = [];
let success = true;
for (const change of fix.changes) {
try {
// Resolve file path (handle wildcards)
let targetFiles = [];
if (change.file.includes('*')) {
targetFiles = Glob(`${targetPath}/${change.file}`);
} else {
targetFiles = [`${targetPath}/${change.file}`];
}
for (const targetFile of targetFiles) {
// Backup original
const relativePath = targetFile.replace(targetPath + '/', '');
const backupPath = `${fixBackupDir}/${relativePath}`;
if (Glob(targetFile).length > 0) {
const originalContent = Read(targetFile);
Bash(`mkdir -p "$(dirname "${backupPath}")"`);
Write(backupPath, originalContent);
}
// Apply change based on action type
if (change.action === 'modify' && change.diff) {
// For now, append the diff as a comment/note
// Real implementation would parse and apply the diff
const existingContent = Read(targetFile);
// Simple diff application: look for context and apply
// This is a simplified version - real implementation would be more sophisticated
const newContent = existingContent + `\n\n<!-- Applied fix ${fix.id}: ${fix.description} -->\n`;
Write(targetFile, newContent);
appliedChanges.push({
file: relativePath,
action: 'modified',
backup: backupPath
});
} else if (change.action === 'create') {
Write(targetFile, change.new_content || '');
appliedChanges.push({
file: relativePath,
action: 'created',
backup: null
});
}
}
} catch (error) {
console.log(`Error applying change to ${change.file}: ${error.message}`);
success = false;
}
}
// Record applied fix
const appliedFix = {
fix_id: fix.id,
applied_at: new Date().toISOString(),
success: success,
backup_path: fixBackupDir,
verification_result: 'pending',
rollback_available: true,
changes_made: appliedChanges
};
// Update applied fixes log
const appliedFixesPath = `${workDir}/fixes/applied-fixes.json`;
let existingApplied = [];
try {
existingApplied = JSON.parse(Read(appliedFixesPath));
} catch (e) {
existingApplied = [];
}
existingApplied.push(appliedFix);
Write(appliedFixesPath, JSON.stringify(existingApplied, null, 2));
return {
stateUpdates: {
applied_fixes: [...state.applied_fixes, appliedFix],
pending_fixes: pendingFixes.slice(1) // Remove applied fix from pending
},
outputFiles: [appliedFixesPath],
summary: `Applied fix ${fix.id}: ${success ? 'success' : 'partial'}, ${appliedChanges.length} files modified`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
applied_fixes: [...existingApplied, newAppliedFix],
pending_fixes: remainingPendingFixes
}
};
```
## Rollback Function
```javascript
async function rollbackFix(fixId, state, workDir) {
const appliedFix = state.applied_fixes.find(f => f.fix_id === fixId);
if (!appliedFix || !appliedFix.rollback_available) {
throw new Error(`Cannot rollback fix ${fixId}`);
}
const backupDir = appliedFix.backup_path;
const targetPath = state.target_skill.path;
// Restore from backup
const backupFiles = Glob(`${backupDir}/**/*`);
for (const backupFile of backupFiles) {
const relativePath = backupFile.replace(backupDir + '/', '');
const targetFile = `${targetPath}/${relativePath}`;
const content = Read(backupFile);
Write(targetFile, content);
}
return {
stateUpdates: {
applied_fixes: state.applied_fixes.map(f =>
f.fix_id === fixId
? { ...f, rollback_available: false, verification_result: 'rolled_back' }
: f
)
}
};
}
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| File not found | Skip file, log warning |
| Write permission error | Retry with sudo or report |
| Backup creation failed | Abort fix, don't modify |
## Next Actions
- If pending_fixes.length > 0: action-apply-fix (continue)
- If all fixes applied: action-verify

View File

@@ -0,0 +1,195 @@
# Action: Complete
Finalize the tuning session with summary report and cleanup.
## Purpose
- Generate final summary report
- Record tuning statistics
- Clean up temporary files (optional)
- Provide recommendations for future maintenance
## Preconditions
- [ ] state.status === 'running'
- [ ] quality_gate === 'pass' OR max_iterations reached
## Execution
```javascript
async function execute(state, workDir) {
console.log('Finalizing skill tuning session...');
const targetSkill = state.target_skill;
const startTime = new Date(state.started_at);
const endTime = new Date();
const duration = Math.round((endTime - startTime) / 1000);
// Generate final summary
const summary = `# Skill Tuning Summary
**Target Skill**: ${targetSkill.name}
**Path**: ${targetSkill.path}
**Session Duration**: ${duration} seconds
**Completed**: ${endTime.toISOString()}
---
## Final Status
| Metric | Value |
|--------|-------|
| Final Health Score | ${state.quality_score}/100 |
| Quality Gate | ${state.quality_gate.toUpperCase()} |
| Total Iterations | ${state.iteration_count} |
| Issues Found | ${state.issues.length + state.applied_fixes.flatMap(f => f.issues_resolved || []).length} |
| Issues Resolved | ${state.applied_fixes.flatMap(f => f.issues_resolved || []).length} |
| Fixes Applied | ${state.applied_fixes.length} |
| Fixes Verified | ${state.applied_fixes.filter(f => f.verification_result === 'pass').length} |
---
## Diagnosis Summary
| Area | Issues Found | Severity |
|------|--------------|----------|
| Context Explosion | ${state.diagnosis.context?.issues_found || 'N/A'} | ${state.diagnosis.context?.severity || 'N/A'} |
| Long-tail Forgetting | ${state.diagnosis.memory?.issues_found || 'N/A'} | ${state.diagnosis.memory?.severity || 'N/A'} |
| Data Flow | ${state.diagnosis.dataflow?.issues_found || 'N/A'} | ${state.diagnosis.dataflow?.severity || 'N/A'} |
| Agent Coordination | ${state.diagnosis.agent?.issues_found || 'N/A'} | ${state.diagnosis.agent?.severity || 'N/A'} |
---
## Applied Fixes
${state.applied_fixes.length === 0 ? '_No fixes applied_' :
state.applied_fixes.map((fix, i) => `
### ${i + 1}. ${fix.fix_id}
- **Applied At**: ${fix.applied_at}
- **Success**: ${fix.success ? 'Yes' : 'No'}
- **Verification**: ${fix.verification_result}
- **Rollback Available**: ${fix.rollback_available ? 'Yes' : 'No'}
`).join('\n')}
---
## Remaining Issues
${state.issues.length === 0 ? '✅ All issues resolved!' :
`${state.issues.length} issues remain:\n\n` +
state.issues.map(issue =>
`- **[${issue.severity.toUpperCase()}]** ${issue.description} (${issue.id})`
).join('\n')}
---
## Recommendations
${generateRecommendations(state)}
---
## Backup Information
Original skill files backed up to:
\`${state.backup_dir}\`
To restore original skill:
\`\`\`bash
cp -r "${state.backup_dir}/${targetSkill.name}-backup"/* "${targetSkill.path}/"
\`\`\`
---
## Session Files
| File | Description |
|------|-------------|
| ${workDir}/tuning-report.md | Full diagnostic report |
| ${workDir}/diagnosis/*.json | Individual diagnosis results |
| ${workDir}/fixes/fix-proposals.json | Proposed fixes |
| ${workDir}/fixes/applied-fixes.json | Applied fix history |
| ${workDir}/tuning-summary.md | This summary |
---
*Skill tuning completed by skill-tuning*
`;
Write(`${workDir}/tuning-summary.md`, summary);
// Update final state
return {
stateUpdates: {
status: 'completed',
completed_at: endTime.toISOString()
},
outputFiles: [`${workDir}/tuning-summary.md`],
summary: `Tuning complete: ${state.quality_gate} with ${state.quality_score}/100 health score`
};
}
function generateRecommendations(state) {
const recommendations = [];
// Based on remaining issues
if (state.issues.some(i => i.type === 'context_explosion')) {
recommendations.push('- **Context Management**: Consider implementing a context summarization agent to prevent token growth');
}
if (state.issues.some(i => i.type === 'memory_loss')) {
recommendations.push('- **Constraint Tracking**: Add explicit constraint injection to each phase prompt');
}
if (state.issues.some(i => i.type === 'dataflow_break')) {
recommendations.push('- **State Centralization**: Migrate to single state.json with schema validation');
}
if (state.issues.some(i => i.type === 'agent_failure')) {
recommendations.push('- **Error Handling**: Wrap all Task calls in try-catch blocks');
}
// General recommendations
if (state.iteration_count >= state.max_iterations) {
recommendations.push('- **Deep Refactoring**: Consider architectural review if issues persist after multiple iterations');
}
if (state.quality_score < 80) {
recommendations.push('- **Regular Tuning**: Schedule periodic skill-tuning runs to catch issues early');
}
if (recommendations.length === 0) {
recommendations.push('- Skill is in good health! Monitor for regressions during future development.');
}
return recommendations.join('\n');
}
```
## State Updates
```javascript
return {
stateUpdates: {
status: 'completed',
completed_at: '<timestamp>'
}
};
```
## Output
- **File**: `tuning-summary.md`
- **Location**: `${workDir}/tuning-summary.md`
- **Format**: Markdown
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Summary write failed | Write to alternative location |
## Next Actions
- None (terminal state)

View File

@@ -0,0 +1,317 @@
# Action: Diagnose Agent Coordination
Analyze target skill for agent coordination failures - call chain fragility and result passing issues.
## Purpose
- Detect fragile agent call patterns
- Identify result passing issues
- Find missing error handling in agent calls
- Analyze agent return format consistency
## Preconditions
- [ ] state.status === 'running'
- [ ] state.target_skill.path is set
- [ ] 'agent' in state.focus_areas OR state.focus_areas is empty
## Detection Patterns
### Pattern 1: Unhandled Agent Failures
```regex
# Task calls without try-catch or error handling
/Task\s*\(\s*\{[^}]*\}\s*\)(?![^;]*catch)/
```
### Pattern 2: Missing Return Validation
```regex
# Agent result used directly without validation
/const\s+\w+\s*=\s*(?:await\s+)?Task\([^)]+\);\s*(?!.*(?:if|try|JSON\.parse))/
```
### Pattern 3: Inconsistent Agent Configuration
```regex
# Different agent configurations in same skill
/subagent_type:\s*['"](\w+)['"]/g
```
### Pattern 4: Deeply Nested Agent Calls
```regex
# Agent calling another agent (nested)
/Task\s*\([^)]*prompt:[^)]*Task\s*\(/
```
## Execution
```javascript
async function execute(state, workDir) {
const skillPath = state.target_skill.path;
const startTime = Date.now();
const issues = [];
const evidence = [];
console.log(`Diagnosing agent coordination in ${skillPath}...`);
// 1. Find all Task/agent calls
const allFiles = Glob(`${skillPath}/**/*.md`);
const agentCalls = [];
const agentTypes = new Set();
for (const file of allFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
// Find Task calls
const taskMatches = content.matchAll(/Task\s*\(\s*\{([^}]+)\}/g);
for (const match of taskMatches) {
const config = match[1];
// Extract agent type
const typeMatch = config.match(/subagent_type:\s*['"]([^'"]+)['"]/);
const agentType = typeMatch ? typeMatch[1] : 'unknown';
agentTypes.add(agentType);
// Check for error handling context
const hasErrorHandling = /try\s*\{.*Task|\.catch\(|await\s+Task.*\.then/s.test(
content.slice(Math.max(0, match.index - 100), match.index + match[0].length + 100)
);
// Check for result validation
const hasResultValidation = /JSON\.parse|if\s*\(\s*result|result\s*\?\./s.test(
content.slice(match.index, match.index + match[0].length + 200)
);
// Check for background execution
const runsInBackground = /run_in_background:\s*true/.test(config);
agentCalls.push({
file: relativePath,
agentType,
hasErrorHandling,
hasResultValidation,
runsInBackground,
config: config.slice(0, 200)
});
}
}
// 2. Analyze agent call patterns
const totalCalls = agentCalls.length;
const callsWithoutErrorHandling = agentCalls.filter(c => !c.hasErrorHandling);
const callsWithoutValidation = agentCalls.filter(c => !c.hasResultValidation);
// Issue: Missing error handling
if (callsWithoutErrorHandling.length > 0) {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: callsWithoutErrorHandling.length > 2 ? 'high' : 'medium',
location: { file: 'multiple' },
description: `${callsWithoutErrorHandling.length}/${totalCalls} agent calls lack error handling`,
evidence: callsWithoutErrorHandling.slice(0, 3).map(c =>
`${c.file}: ${c.agentType}`
),
root_cause: 'Agent failures not caught, may crash workflow',
impact: 'Unhandled agent errors cause cascading failures',
suggested_fix: 'Wrap Task calls in try-catch with graceful fallback'
});
evidence.push({
file: 'multiple',
pattern: 'missing_error_handling',
context: `${callsWithoutErrorHandling.length} calls affected`,
severity: 'high'
});
}
// Issue: Missing result validation
if (callsWithoutValidation.length > 0) {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: 'medium',
location: { file: 'multiple' },
description: `${callsWithoutValidation.length}/${totalCalls} agent calls lack result validation`,
evidence: callsWithoutValidation.slice(0, 3).map(c =>
`${c.file}: ${c.agentType} result not validated`
),
root_cause: 'Agent results used directly without type checking',
impact: 'Invalid agent output may corrupt state',
suggested_fix: 'Add JSON.parse with try-catch and schema validation'
});
}
// 3. Check for inconsistent agent types usage
if (agentTypes.size > 3 && state.target_skill.execution_mode === 'autonomous') {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: 'low',
location: { file: 'multiple' },
description: `Using ${agentTypes.size} different agent types`,
evidence: [...agentTypes].slice(0, 5),
root_cause: 'Multiple agent types increase coordination complexity',
impact: 'Different agent behaviors may cause inconsistency',
suggested_fix: 'Standardize on fewer agent types with clear roles'
});
}
// 4. Check for nested agent calls
for (const file of allFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
// Detect nested Task calls
const hasNestedTask = /Task\s*\([^)]*prompt:[^)]*Task\s*\(/s.test(content);
if (hasNestedTask) {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: 'high',
location: { file: relativePath },
description: 'Nested agent calls detected',
evidence: ['Agent prompt contains another Task call'],
root_cause: 'Agent calls another agent, creating deep nesting',
impact: 'Context explosion, hard to debug, unpredictable behavior',
suggested_fix: 'Flatten agent calls, use orchestrator to coordinate'
});
}
}
// 5. Check SKILL.md for agent configuration consistency
const skillMd = Read(`${skillPath}/SKILL.md`);
// Check if allowed-tools includes Task
const allowedTools = skillMd.match(/allowed-tools:\s*([^\n]+)/i);
if (allowedTools && !allowedTools[1].includes('Task') && totalCalls > 0) {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: 'medium',
location: { file: 'SKILL.md' },
description: 'Task tool used but not declared in allowed-tools',
evidence: [`${totalCalls} Task calls found, but Task not in allowed-tools`],
root_cause: 'Tool declaration mismatch',
impact: 'May cause runtime permission issues',
suggested_fix: 'Add Task to allowed-tools in SKILL.md front matter'
});
}
// 6. Check for agent result format consistency
const returnFormats = new Set();
for (const file of allFiles) {
const content = Read(file);
// Look for return format definitions
const returnMatch = content.match(/\[RETURN\][^[]*|return\s*\{[^}]+\}/gi);
if (returnMatch) {
returnMatch.forEach(r => {
const format = r.includes('JSON') ? 'json' :
r.includes('summary') ? 'summary' :
r.includes('file') ? 'file_path' : 'other';
returnFormats.add(format);
});
}
}
if (returnFormats.size > 2) {
issues.push({
id: `AGT-${issues.length + 1}`,
type: 'agent_failure',
severity: 'medium',
location: { file: 'multiple' },
description: 'Inconsistent agent return formats',
evidence: [...returnFormats],
root_cause: 'Different agents return data in different formats',
impact: 'Orchestrator must handle multiple format types',
suggested_fix: 'Standardize return format: {status, output_file, summary}'
});
}
// 7. Calculate severity
const criticalCount = issues.filter(i => i.severity === 'critical').length;
const highCount = issues.filter(i => i.severity === 'high').length;
const severity = criticalCount > 0 ? 'critical' :
highCount > 1 ? 'high' :
highCount > 0 ? 'medium' :
issues.length > 0 ? 'low' : 'none';
// 8. Write diagnosis result
const diagnosisResult = {
status: 'completed',
issues_found: issues.length,
severity: severity,
execution_time_ms: Date.now() - startTime,
details: {
patterns_checked: [
'error_handling',
'result_validation',
'agent_type_consistency',
'nested_calls',
'return_format_consistency'
],
patterns_matched: evidence.map(e => e.pattern),
evidence: evidence,
agent_analysis: {
total_agent_calls: totalCalls,
unique_agent_types: agentTypes.size,
calls_without_error_handling: callsWithoutErrorHandling.length,
calls_without_validation: callsWithoutValidation.length,
agent_types_used: [...agentTypes]
},
recommendations: [
callsWithoutErrorHandling.length > 0
? 'Add try-catch to all Task calls' : null,
callsWithoutValidation.length > 0
? 'Add result validation with JSON.parse and schema check' : null,
agentTypes.size > 3
? 'Consolidate agent types for consistency' : null
].filter(Boolean)
}
};
Write(`${workDir}/diagnosis/agent-diagnosis.json`,
JSON.stringify(diagnosisResult, null, 2));
return {
stateUpdates: {
'diagnosis.agent': diagnosisResult,
issues: [...state.issues, ...issues]
},
outputFiles: [`${workDir}/diagnosis/agent-diagnosis.json`],
summary: `Agent diagnosis: ${issues.length} issues found (severity: ${severity})`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
'diagnosis.agent': {
status: 'completed',
issues_found: <count>,
severity: '<critical|high|medium|low|none>',
// ... full diagnosis result
},
issues: [...existingIssues, ...newIssues]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Regex match error | Use simpler patterns |
| File access error | Skip and continue |
## Next Actions
- Success: action-generate-report
- Skipped: If 'agent' not in focus_areas

View File

@@ -0,0 +1,243 @@
# Action: Diagnose Context Explosion
Analyze target skill for context explosion issues - token accumulation and multi-turn dialogue bloat.
## Purpose
- Detect patterns that cause context growth
- Identify multi-turn accumulation points
- Find missing context compression mechanisms
- Measure potential token waste
## Preconditions
- [ ] state.status === 'running'
- [ ] state.target_skill.path is set
- [ ] 'context' in state.focus_areas OR state.focus_areas is empty
## Detection Patterns
### Pattern 1: Unbounded History Accumulation
```regex
# Patterns that suggest history accumulation
/\bhistory\b.*\.push\b/
/\bmessages\b.*\.concat\b/
/\bconversation\b.*\+=/
/\bappend.*context\b/i
```
### Pattern 2: Full Content Passing
```regex
# Patterns that pass full content instead of references
/Read\([^)]+\).*\+.*Read\(/
/JSON\.stringify\(.*state\)/ # Full state serialization
/\$\{.*content\}/ # Template literal with full content
```
### Pattern 3: Missing Summarization
```regex
# Absence of compression/summarization mechanisms.
# Flag long files where this probe (used in the execution below) finds no match:
/summariz|compress|truncat|slice.*context/i
```
### Pattern 4: Agent Return Bloat
```regex
# Agent returning full content instead of path + summary
/return\s*\{[^}]*content:/
/return.*JSON\.stringify/
```
## Execution
```javascript
async function execute(state, workDir) {
const skillPath = state.target_skill.path;
const startTime = Date.now();
const issues = [];
const evidence = [];
console.log(`Diagnosing context explosion in ${skillPath}...`);
// 1. Scan all phase files
const phaseFiles = Glob(`${skillPath}/phases/**/*.md`);
for (const file of phaseFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
// Check Pattern 1: History accumulation
const historyPatterns = [
/history\s*[.=].*(?:push|concat|append)/gi,
/messages\s*=\s*\[.*\.\.\..*messages/gi,
/conversation.*\+=/gi
];
for (const pattern of historyPatterns) {
const matches = content.match(pattern);
if (matches) {
issues.push({
id: `CTX-${issues.length + 1}`,
type: 'context_explosion',
severity: 'high',
location: { file: relativePath },
description: 'Unbounded history accumulation detected',
evidence: matches.slice(0, 3),
root_cause: 'History/messages array grows without bounds',
impact: 'Token count increases linearly with iterations',
suggested_fix: 'Implement sliding window or summarization'
});
evidence.push({
file: relativePath,
pattern: 'history_accumulation',
context: matches[0],
severity: 'high'
});
}
}
// Check Pattern 2: Full content passing
const contentPatterns = [
/Read\s*\([^)]+\)\s*[\+,]/g,
/JSON\.stringify\s*\(\s*state\s*\)/g,
/\$\{[^}]*content[^}]*\}/g
];
for (const pattern of contentPatterns) {
const matches = content.match(pattern);
if (matches) {
issues.push({
id: `CTX-${issues.length + 1}`,
type: 'context_explosion',
severity: 'medium',
location: { file: relativePath },
description: 'Full content passed instead of reference',
evidence: matches.slice(0, 3),
root_cause: 'Entire file/state content included in prompts',
impact: 'Unnecessary token consumption',
suggested_fix: 'Pass file paths and summaries instead of full content'
});
evidence.push({
file: relativePath,
pattern: 'full_content_passing',
context: matches[0],
severity: 'medium'
});
}
}
// Check Pattern 3: Missing summarization
const hasSummarization = /summariz|compress|truncat|slice.*context/i.test(content);
const hasLongPrompts = content.length > 5000;
if (hasLongPrompts && !hasSummarization) {
issues.push({
id: `CTX-${issues.length + 1}`,
type: 'context_explosion',
severity: 'medium',
location: { file: relativePath },
description: 'Long phase file without summarization mechanism',
evidence: [`File length: ${content.length} chars`],
root_cause: 'No context compression for large content',
impact: 'Potential token overflow in long sessions',
suggested_fix: 'Add context summarization before passing to agents'
});
}
// Check Pattern 4: Agent return bloat
const returnPatterns = /return\s*\{[^}]*(?:content|full_output|complete_result):/g;
const returnMatches = content.match(returnPatterns);
if (returnMatches) {
issues.push({
id: `CTX-${issues.length + 1}`,
type: 'context_explosion',
severity: 'high',
location: { file: relativePath },
description: 'Agent returns full content instead of path+summary',
evidence: returnMatches.slice(0, 3),
root_cause: 'Agent output includes complete content',
impact: 'Context bloat when orchestrator receives full output',
suggested_fix: 'Return {output_file, summary} instead of {content}'
});
}
}
// 2. Calculate severity
const criticalCount = issues.filter(i => i.severity === 'critical').length;
const highCount = issues.filter(i => i.severity === 'high').length;
const severity = criticalCount > 0 ? 'critical' :
highCount > 2 ? 'high' :
highCount > 0 ? 'medium' :
issues.length > 0 ? 'low' : 'none';
// 3. Write diagnosis result
const diagnosisResult = {
status: 'completed',
issues_found: issues.length,
severity: severity,
execution_time_ms: Date.now() - startTime,
details: {
patterns_checked: [
'history_accumulation',
'full_content_passing',
'missing_summarization',
'agent_return_bloat'
],
patterns_matched: evidence.map(e => e.pattern),
evidence: evidence,
recommendations: [
issues.length > 0 ? 'Implement context summarization agent' : null,
highCount > 0 ? 'Add sliding window for conversation history' : null,
evidence.some(e => e.pattern === 'full_content_passing')
? 'Refactor to pass file paths instead of content' : null
].filter(Boolean)
}
};
Write(`${workDir}/diagnosis/context-diagnosis.json`,
JSON.stringify(diagnosisResult, null, 2));
return {
stateUpdates: {
'diagnosis.context': diagnosisResult,
issues: [...state.issues, ...issues],
'issues_by_severity.critical': state.issues_by_severity.critical + criticalCount,
'issues_by_severity.high': state.issues_by_severity.high + highCount
},
outputFiles: [`${workDir}/diagnosis/context-diagnosis.json`],
summary: `Context diagnosis: ${issues.length} issues found (severity: ${severity})`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
'diagnosis.context': {
status: 'completed',
issues_found: <count>,
severity: '<critical|high|medium|low|none>',
// ... full diagnosis result
},
issues: [...existingIssues, ...newIssues]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| File read error | Skip file, log warning |
| Pattern matching error | Use fallback patterns |
| Write error | Retry to alternative path |
## Next Actions
- Success: action-diagnose-memory (or next in focus_areas)
- Skipped: If 'context' not in focus_areas

View File

@@ -0,0 +1,318 @@
# Action: Diagnose Data Flow Issues
Analyze target skill for data flow disruption - state inconsistencies and format variations.
## Purpose
- Detect inconsistent data formats between phases
- Identify scattered state storage
- Find missing data contracts
- Measure state transition integrity
## Preconditions
- [ ] state.status === 'running'
- [ ] state.target_skill.path is set
- [ ] 'dataflow' in state.focus_areas OR state.focus_areas is empty
## Detection Patterns
### Pattern 1: Multiple Storage Locations
```regex
# Data written to multiple paths without centralization
/Write\s*\(\s*[`'"][^`'"]+[`'"]/g
```
### Pattern 2: Inconsistent Field Names
```regex
# Same concept with different names, probed in pairs (see fieldNamePatterns below):
# /\.name\b/ vs /\.title\b/, /\.id\b/ vs /\.identifier\b/, /\.status\b/ vs /\.state\b/
```
### Pattern 3: Missing Schema Validation
```regex
# Absence of validation before state write.
# Flag files where JSON.parse appears but this probe finds no match:
/validat|schema|type.*check/i
```
### Pattern 4: Format Transformation Without Normalization
```regex
# Direct JSON.parse without error handling or normalization
/JSON\.parse\([^)]+\)(?!\s*\|\|)/
```
## Execution
```javascript
async function execute(state, workDir) {
const skillPath = state.target_skill.path;
const startTime = Date.now();
const issues = [];
const evidence = [];
console.log(`Diagnosing data flow in ${skillPath}...`);
// 1. Collect all Write operations to map data storage
const allFiles = Glob(`${skillPath}/**/*.md`);
const writeLocations = [];
const readLocations = [];
for (const file of allFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
// Find Write operations
const writeMatches = content.matchAll(/Write\s*\(\s*[`'"]([^`'"]+)[`'"]/g);
for (const match of writeMatches) {
writeLocations.push({
file: relativePath,
target: match[1],
isStateFile: match[1].includes('state.json') || match[1].includes('config.json')
});
}
// Find Read operations
const readMatches = content.matchAll(/Read\s*\(\s*[`'"]([^`'"]+)[`'"]/g);
for (const match of readMatches) {
readLocations.push({
file: relativePath,
source: match[1]
});
}
}
// 2. Check for scattered state storage
const stateTargets = writeLocations
.filter(w => w.isStateFile)
.map(w => w.target);
const uniqueStateFiles = [...new Set(stateTargets)];
if (uniqueStateFiles.length > 2) {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'high',
location: { file: 'multiple' },
description: `State stored in ${uniqueStateFiles.length} different locations`,
evidence: uniqueStateFiles.slice(0, 5),
root_cause: 'No centralized state management',
impact: 'State inconsistency between phases',
suggested_fix: 'Centralize state to single state.json with state manager'
});
evidence.push({
file: 'multiple',
pattern: 'scattered_state',
context: uniqueStateFiles.join(', '),
severity: 'high'
});
}
// 3. Check for inconsistent field naming
const fieldNamePatterns = {
'name_vs_title': [/\.name\b/, /\.title\b/],
'id_vs_identifier': [/\.id\b/, /\.identifier\b/],
'status_vs_state': [/\.status\b/, /\.state\b/],
'error_vs_errors': [/\.error\b/, /\.errors\b/]
};
const fieldUsage = {};
for (const file of allFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
for (const [patternName, patterns] of Object.entries(fieldNamePatterns)) {
for (const pattern of patterns) {
if (pattern.test(content)) {
if (!fieldUsage[patternName]) fieldUsage[patternName] = [];
fieldUsage[patternName].push({
file: relativePath,
pattern: pattern.toString()
});
}
}
}
}
for (const [patternName, usages] of Object.entries(fieldUsage)) {
const uniquePatterns = [...new Set(usages.map(u => u.pattern))];
if (uniquePatterns.length > 1) {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'medium',
location: { file: 'multiple' },
description: `Inconsistent field naming: ${patternName.replace('_vs_', ' vs ')}`,
evidence: usages.slice(0, 3).map(u => `${u.file}: ${u.pattern}`),
root_cause: 'Same concept referred to with different field names',
impact: 'Data may be lost during field access',
suggested_fix: `Standardize to single field name, add normalization function`
});
}
}
// 4. Check for missing schema validation
for (const file of allFiles) {
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
// Find JSON.parse without validation
const unsafeParses = content.match(/JSON\.parse\s*\([^)]+\)(?!\s*\?\?|\s*\|\|)/g);
const hasValidation = /validat|schema|type.*check/i.test(content);
if (unsafeParses && unsafeParses.length > 0 && !hasValidation) {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'medium',
location: { file: relativePath },
description: 'JSON parsing without validation',
evidence: unsafeParses.slice(0, 2),
root_cause: 'No schema validation after parsing',
impact: 'Invalid data may propagate through phases',
suggested_fix: 'Add schema validation after JSON.parse'
});
}
}
// 5. Check state schema if exists
const stateSchemaFile = Glob(`${skillPath}/phases/state-schema.md`)[0];
if (stateSchemaFile) {
const schemaContent = Read(stateSchemaFile);
// Check for type definitions
const hasTypeScript = /interface\s+\w+|type\s+\w+\s*=/i.test(schemaContent);
const hasValidationFunction = /function\s+validate|validateState/i.test(schemaContent);
if (hasTypeScript && !hasValidationFunction) {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'low',
location: { file: 'phases/state-schema.md' },
description: 'Type definitions without runtime validation',
evidence: ['TypeScript interfaces defined but no validation function'],
root_cause: 'Types are compile-time only, not enforced at runtime',
impact: 'Schema violations may occur at runtime',
suggested_fix: 'Add validateState() function using Zod or manual checks'
});
}
} else if (state.target_skill.execution_mode === 'autonomous') {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'high',
location: { file: 'phases/' },
description: 'Autonomous skill missing state-schema.md',
evidence: ['No state schema definition found'],
root_cause: 'State structure undefined for orchestrator',
impact: 'Inconsistent state handling across actions',
suggested_fix: 'Create phases/state-schema.md with explicit type definitions'
});
}
// 6. Check read-write alignment
const writtenFiles = new Set(writeLocations.map(w => w.target));
const readFiles = new Set(readLocations.map(r => r.source));
const writtenButNotRead = [...writtenFiles].filter(f =>
!readFiles.has(f) && !f.includes('output') && !f.includes('report')
);
if (writtenButNotRead.length > 0) {
issues.push({
id: `DF-${issues.length + 1}`,
type: 'dataflow_break',
severity: 'low',
location: { file: 'multiple' },
description: 'Files written but never read',
evidence: writtenButNotRead.slice(0, 3),
root_cause: 'Orphaned output files',
impact: 'Wasted storage and potential confusion',
suggested_fix: 'Remove unused writes or add reads where needed'
});
}
// 7. Calculate severity
const criticalCount = issues.filter(i => i.severity === 'critical').length;
const highCount = issues.filter(i => i.severity === 'high').length;
const severity = criticalCount > 0 ? 'critical' :
highCount > 1 ? 'high' :
highCount > 0 ? 'medium' :
issues.length > 0 ? 'low' : 'none';
// 8. Write diagnosis result
const diagnosisResult = {
status: 'completed',
issues_found: issues.length,
severity: severity,
execution_time_ms: Date.now() - startTime,
details: {
patterns_checked: [
'scattered_state',
'inconsistent_naming',
'missing_validation',
'read_write_alignment'
],
patterns_matched: evidence.map(e => e.pattern),
evidence: evidence,
data_flow_map: {
write_locations: writeLocations.length,
read_locations: readLocations.length,
unique_state_files: uniqueStateFiles.length
},
recommendations: [
uniqueStateFiles.length > 2 ? 'Implement centralized state manager' : null,
issues.some(i => i.description.includes('naming'))
? 'Create normalization layer for field names' : null,
issues.some(i => i.description.includes('validation'))
? 'Add Zod or JSON Schema validation' : null
].filter(Boolean)
}
};
Write(`${workDir}/diagnosis/dataflow-diagnosis.json`,
JSON.stringify(diagnosisResult, null, 2));
return {
stateUpdates: {
'diagnosis.dataflow': diagnosisResult,
issues: [...state.issues, ...issues]
},
outputFiles: [`${workDir}/diagnosis/dataflow-diagnosis.json`],
summary: `Data flow diagnosis: ${issues.length} issues found (severity: ${severity})`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
'diagnosis.dataflow': {
status: 'completed',
issues_found: <count>,
severity: '<critical|high|medium|low|none>',
// ... full diagnosis result
},
issues: [...existingIssues, ...newIssues]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Glob pattern error | Use fallback patterns |
| File read error | Skip and continue |
## Next Actions
- Success: action-diagnose-agent (or next in focus_areas)
- Skipped: If 'dataflow' not in focus_areas

View File

@@ -0,0 +1,269 @@
# Action: Diagnose Long-tail Forgetting
Analyze target skill for long-tail effect and constraint forgetting issues.
## Purpose
- Detect loss of early instructions in long execution chains
- Identify missing constraint propagation mechanisms
- Find weak goal alignment between phases
- Measure instruction retention across phases
## Preconditions
- [ ] state.status === 'running'
- [ ] state.target_skill.path is set
- [ ] 'memory' in state.focus_areas OR state.focus_areas is empty
## Detection Patterns
### Pattern 1: Missing Constraint References
```regex
# Phases that don't reference original requirements.
# Flag later phases where this probe finds no match:
/requirements?|constraints?|original|initial|user_request/i
```
### Pattern 2: Goal Drift
```regex
# Later phases that define a task without restating global constraints.
# Flag files where /\[TASK\]/i matches but this probe does not:
/\[CONSTRAINTS?\]|\[REQUIREMENTS?\]|\[RULES?\]/i
```
### Pattern 3: No Checkpoint Mechanism
```regex
# Absence of state preservation at key points.
# Flag key phases where this probe finds no match:
/checkpoint|snapshot|preserve|savepoint/i
```
### Pattern 4: Implicit State Passing
```regex
# State passed implicitly through conversation rather than explicitly
/(?<!state\.)context\./
```
## Execution
```javascript
async function execute(state, workDir) {
const skillPath = state.target_skill.path;
const startTime = Date.now();
const issues = [];
const evidence = [];
console.log(`Diagnosing long-tail forgetting in ${skillPath}...`);
// 1. Analyze phase chain for constraint propagation
const phaseFiles = Glob(`${skillPath}/phases/*.md`)
.filter(f => !f.includes('orchestrator') && !f.includes('state-schema'))
.sort();
// Extract phase order (for sequential) or action dependencies (for autonomous)
const isAutonomous = state.target_skill.execution_mode === 'autonomous';
// 2. Check each phase for constraint awareness
let firstPhaseConstraints = [];
for (let i = 0; i < phaseFiles.length; i++) {
const file = phaseFiles[i];
const content = Read(file);
const relativePath = file.replace(skillPath + '/', '');
const phaseNum = i + 1;
// Extract constraints from first phase
if (i === 0) {
const constraintMatch = content.match(/\[CONSTRAINTS?\]([^[]*)/i);
if (constraintMatch) {
firstPhaseConstraints = constraintMatch[1]
.split('\n')
.filter(l => l.trim().startsWith('-'))
.map(l => l.trim().replace(/^-\s*/, ''));
}
}
// Check if later phases reference original constraints
if (i > 0 && firstPhaseConstraints.length > 0) {
const mentionsConstraints = firstPhaseConstraints.some(c =>
content.toLowerCase().includes(c.toLowerCase().slice(0, 20))
);
if (!mentionsConstraints) {
issues.push({
id: `MEM-${issues.length + 1}`,
type: 'memory_loss',
severity: 'high',
location: { file: relativePath, phase: `Phase ${phaseNum}` },
description: `Phase ${phaseNum} does not reference original constraints`,
evidence: [`Original constraints: ${firstPhaseConstraints.slice(0, 3).join(', ')}`],
root_cause: 'Constraint information not propagated to later phases',
impact: 'May produce output violating original requirements',
suggested_fix: 'Add explicit constraint injection or reference to state.original_constraints'
});
evidence.push({
file: relativePath,
pattern: 'missing_constraint_reference',
context: `Phase ${phaseNum} of ${phaseFiles.length}`,
severity: 'high'
});
}
}
// Check for goal drift - task without constraints
const hasTask = /\[TASK\]/i.test(content);
const hasConstraints = /\[CONSTRAINTS?\]|\[REQUIREMENTS?\]|\[RULES?\]/i.test(content);
if (hasTask && !hasConstraints && i > 1) {
issues.push({
id: `MEM-${issues.length + 1}`,
type: 'memory_loss',
severity: 'medium',
location: { file: relativePath },
description: 'Phase has TASK but no CONSTRAINTS/RULES section',
evidence: ['Task defined without boundary constraints'],
root_cause: 'Agent may not adhere to global constraints',
impact: 'Potential goal drift from original intent',
suggested_fix: 'Add [CONSTRAINTS] section referencing global rules'
});
}
// Check for checkpoint mechanism
const hasCheckpoint = /checkpoint|snapshot|preserve|savepoint/i.test(content);
const isKeyPhase = i === Math.floor(phaseFiles.length / 2) || i === phaseFiles.length - 1;
if (isKeyPhase && !hasCheckpoint && phaseFiles.length > 3) {
issues.push({
id: `MEM-${issues.length + 1}`,
type: 'memory_loss',
severity: 'low',
location: { file: relativePath },
description: 'Key phase without checkpoint mechanism',
evidence: [`Phase ${phaseNum} is a key milestone but has no state preservation`],
root_cause: 'Cannot recover from failures or verify constraint adherence',
impact: 'No rollback capability if constraints violated',
suggested_fix: 'Add checkpoint before major state changes'
});
}
}
// 3. Check for explicit state schema with constraints field
const stateSchemaFile = Glob(`${skillPath}/phases/state-schema.md`)[0];
if (stateSchemaFile) {
const schemaContent = Read(stateSchemaFile);
const hasConstraintsField = /constraints|requirements|original_request/i.test(schemaContent);
if (!hasConstraintsField) {
issues.push({
id: `MEM-${issues.length + 1}`,
type: 'memory_loss',
severity: 'medium',
location: { file: 'phases/state-schema.md' },
description: 'State schema lacks constraints/requirements field',
evidence: ['No dedicated field for preserving original requirements'],
root_cause: 'State structure does not support constraint persistence',
impact: 'Constraints may be lost during state transitions',
suggested_fix: 'Add original_requirements field to state schema'
});
}
}
// 4. Check SKILL.md for constraint enforcement in execution flow
const skillMd = Read(`${skillPath}/SKILL.md`);
const hasConstraintVerification = /constraint.*verif|verif.*constraint|quality.*gate/i.test(skillMd);
if (!hasConstraintVerification && phaseFiles.length > 3) {
issues.push({
id: `MEM-${issues.length + 1}`,
type: 'memory_loss',
severity: 'medium',
location: { file: 'SKILL.md' },
description: 'No constraint verification step in execution flow',
evidence: ['Execution flow lacks quality gate or constraint check'],
root_cause: 'No mechanism to verify output matches original intent',
impact: 'Constraint violations may go undetected',
suggested_fix: 'Add verification phase comparing output to original requirements'
});
}
// 5. Calculate severity
const criticalCount = issues.filter(i => i.severity === 'critical').length;
const highCount = issues.filter(i => i.severity === 'high').length;
const severity = criticalCount > 0 ? 'critical' :
highCount > 2 ? 'high' :
highCount > 0 ? 'medium' :
issues.length > 0 ? 'low' : 'none';
// 6. Write diagnosis result
const diagnosisResult = {
status: 'completed',
issues_found: issues.length,
severity: severity,
execution_time_ms: Date.now() - startTime,
details: {
patterns_checked: [
'constraint_propagation',
'goal_drift',
'checkpoint_mechanism',
'state_schema_constraints'
],
patterns_matched: evidence.map(e => e.pattern),
evidence: evidence,
phase_analysis: {
total_phases: phaseFiles.length,
first_phase_constraints: firstPhaseConstraints.length,
phases_with_constraint_ref: phaseFiles.length - issues.filter(i =>
i.description.includes('does not reference')).length
},
recommendations: [
highCount > 0 ? 'Implement constraint injection at each phase' : null,
issues.some(i => i.description.includes('checkpoint'))
? 'Add checkpoint/restore mechanism' : null,
issues.some(i => i.description.includes('State schema'))
? 'Add original_requirements to state schema' : null
].filter(Boolean)
}
};
Write(`${workDir}/diagnosis/memory-diagnosis.json`,
JSON.stringify(diagnosisResult, null, 2));
return {
stateUpdates: {
'diagnosis.memory': diagnosisResult,
issues: [...state.issues, ...issues]
},
outputFiles: [`${workDir}/diagnosis/memory-diagnosis.json`],
summary: `Memory diagnosis: ${issues.length} issues found (severity: ${severity})`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
'diagnosis.memory': {
status: 'completed',
issues_found: <count>,
severity: '<critical|high|medium|low|none>',
// ... full diagnosis result
},
issues: [...existingIssues, ...newIssues]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Phase file read error | Skip file, continue analysis |
| No phases found | Report as structure issue |
## Next Actions
- Success: action-diagnose-dataflow (or next in focus_areas)
- Skipped: If 'memory' not in focus_areas

View File

@@ -0,0 +1,322 @@
# Action: Gemini Analysis
Dynamically invoke the Gemini CLI for deep analysis, choosing the analysis type from the user's request or from diagnosis results.
## Role
- Receive the user-specified analysis request, or infer it from diagnosis results
- Build the appropriate CLI command
- Execute the analysis and parse the results
- Update state for downstream actions
## Preconditions
- `state.status === 'running'`
- At least one of the following holds:
  - `state.gemini_analysis_requested === true` (user request)
  - `state.issues.some(i => i.severity === 'critical')` (critical issue found)
  - `state.analysis_type !== null` (analysis type already specified)
## Analysis Types
### 1. root_cause - Root Cause Analysis
Deep analysis of the issue described by the user.
```javascript
const analysisPrompt = `
PURPOSE: Identify root cause of skill execution issue: ${state.user_issue_description}
TASK:
• Analyze skill structure at: ${state.target_skill.path}
• Identify anti-patterns in phase files
• Trace data flow through state management
• Check agent coordination patterns
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON with structure:
{
"root_causes": [
{ "id": "RC-001", "description": "...", "severity": "high", "evidence": ["file:line"] }
],
"patterns_found": [
{ "pattern": "...", "type": "anti-pattern|best-practice", "locations": [] }
],
"recommendations": [
{ "priority": 1, "action": "...", "rationale": "..." }
]
}
RULES: Focus on execution flow, state management, agent coordination
`;
```
### 2. architecture - Architecture Review
Evaluate the skill's overall architecture design.
```javascript
const analysisPrompt = `
PURPOSE: Review skill architecture for: ${state.target_skill.name}
TASK:
• Evaluate phase decomposition and responsibility separation
• Check state schema design and data flow
• Assess agent coordination and error handling
• Review scalability and maintainability
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: Markdown report with sections:
- Executive Summary
- Phase Architecture Assessment
- State Management Evaluation
- Agent Coordination Analysis
- Improvement Recommendations (prioritized)
RULES: Focus on modularity, extensibility, maintainability
`;
```
### 3. prompt_optimization - Prompt Optimization
Analyze and optimize the prompts in phase files.
```javascript
const analysisPrompt = `
PURPOSE: Optimize prompts in skill phases for better output quality
TASK:
• Analyze existing prompts for clarity and specificity
• Identify ambiguous instructions
• Check output format specifications
• Evaluate constraint communication
MODE: analysis
CONTEXT: @phases/**/*.md
EXPECTED: JSON with structure:
{
"prompt_issues": [
{ "file": "...", "issue": "...", "severity": "...", "suggestion": "..." }
],
"optimized_prompts": [
{ "file": "...", "original": "...", "optimized": "...", "rationale": "..." }
]
}
RULES: Preserve intent, improve clarity, add structured output requirements
`;
```
### 4. performance - Performance Analysis
Analyze token consumption and execution efficiency.
```javascript
const analysisPrompt = `
PURPOSE: Analyze performance bottlenecks in skill execution
TASK:
• Estimate token consumption per phase
• Identify redundant data passing
• Check for unnecessary full-content transfers
• Evaluate caching opportunities
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON with structure:
{
"token_estimates": [
{ "phase": "...", "estimated_tokens": 1000, "breakdown": {} }
],
"bottlenecks": [
{ "type": "...", "location": "...", "impact": "high|medium|low", "fix": "..." }
],
"optimization_suggestions": []
}
RULES: Focus on token efficiency, reduce redundancy
`;
```
### 5. custom - Custom Analysis
Run a custom analysis specified by the user.
```javascript
const analysisPrompt = `
PURPOSE: ${state.custom_analysis_purpose}
TASK: ${state.custom_analysis_tasks}
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: ${state.custom_analysis_expected}
RULES: ${state.custom_analysis_rules || 'Follow best practices'}
`;
```
## Execution
```javascript
async function executeGeminiAnalysis(state, workDir) {
  // 1. Determine the analysis type
  const analysisType = state.analysis_type || determineAnalysisType(state);
  // 2. Build the prompt
  const prompt = buildAnalysisPrompt(analysisType, state);
  // 3. Build the CLI command
  const cliCommand = `ccw cli -p "${escapeForShell(prompt)}" --tool gemini --mode analysis --cd "${state.target_skill.path}"`;
  console.log(`Executing Gemini analysis: ${analysisType}`);
  console.log(`Command: ${cliCommand}`);
  // 4. Execute the CLI (in the background)
  const result = Bash({
    command: cliCommand,
    run_in_background: true,
    timeout: 300000 // 5 minutes
  });
  // 5. Await results
  // Note: per CLAUDE.md guidance, stop polling once the CLI is running in the background;
  // results are written to state after the CLI completes.
  return {
    stateUpdates: {
      gemini_analysis: {
        type: analysisType,
        status: 'running',
        started_at: new Date().toISOString(),
        task_id: result.task_id
      }
    },
    outputFiles: [],
    summary: `Gemini ${analysisType} analysis started in background`
  };
}
function determineAnalysisType(state) {
  // Infer the analysis type from state
if (state.user_issue_description && state.user_issue_description.length > 100) {
return 'root_cause';
}
if (state.issues.some(i => i.severity === 'critical')) {
return 'root_cause';
}
if (state.focus_areas.includes('architecture')) {
return 'architecture';
}
if (state.focus_areas.includes('prompt')) {
return 'prompt_optimization';
}
if (state.focus_areas.includes('performance')) {
return 'performance';
}
  return 'root_cause'; // default
}
function buildAnalysisPrompt(type, state) {
const templates = {
root_cause: () => `
PURPOSE: Identify root cause of skill execution issue: ${state.user_issue_description}
TASK: • Analyze skill structure • Identify anti-patterns • Trace data flow issues • Check agent coordination
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON { root_causes: [], patterns_found: [], recommendations: [] }
RULES: Focus on execution flow, be specific about file:line locations
`,
architecture: () => `
PURPOSE: Review skill architecture for ${state.target_skill.name}
TASK: • Evaluate phase decomposition • Check state design • Assess agent coordination • Review extensibility
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: Markdown architecture assessment report
RULES: Focus on modularity and maintainability
`,
prompt_optimization: () => `
PURPOSE: Optimize prompts in skill for better output quality
TASK: • Analyze prompt clarity • Check output specifications • Evaluate constraint handling
MODE: analysis
CONTEXT: @phases/**/*.md
EXPECTED: JSON { prompt_issues: [], optimized_prompts: [] }
RULES: Preserve intent, improve clarity
`,
performance: () => `
PURPOSE: Analyze performance bottlenecks in skill
TASK: • Estimate token consumption • Identify redundancy • Check data transfer efficiency
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: JSON { token_estimates: [], bottlenecks: [], optimization_suggestions: [] }
RULES: Focus on token efficiency
`,
custom: () => `
PURPOSE: ${state.custom_analysis_purpose}
TASK: ${state.custom_analysis_tasks}
MODE: analysis
CONTEXT: @**/*.md
EXPECTED: ${state.custom_analysis_expected}
RULES: ${state.custom_analysis_rules || 'Best practices'}
`
};
return templates[type]();
}
function escapeForShell(str) {
  // Escape shell special characters (backslashes first, so later escapes are not doubled)
  return str
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\$/g, '\\$')
    .replace(/`/g, '\\`');
}
```
## Output
### State Updates
```javascript
{
gemini_analysis: {
type: 'root_cause' | 'architecture' | 'prompt_optimization' | 'performance' | 'custom',
status: 'running' | 'completed' | 'failed',
started_at: '2024-01-01T00:00:00Z',
completed_at: '2024-01-01T00:05:00Z',
task_id: 'xxx',
    result: { /* analysis result */ },
error: null
},
  // Merge analysis findings into issues
issues: [
...state.issues,
...newIssuesFromAnalysis
]
}
```
### Output Files
- `${workDir}/diagnosis/gemini-analysis-${type}.json` - Raw analysis result
- `${workDir}/diagnosis/gemini-analysis-${type}.md` - Formatted report
## Post-Execution
After the analysis completes:
1. Parse the CLI output into structured data (see the sketch below)
2. Extract newly discovered issues and merge them into `state.issues`
3. Update recommendations in state
4. Trigger the next action (typically action-generate-report or action-propose-fixes)
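A minimal sketch of this parsing-and-merge step, assuming the CLI writes its JSON result into the diagnosis directory (the file path, the `GEM-` ID prefix, and the issue field mapping here are illustrative assumptions, not a confirmed CLI contract):
```javascript
function mergeGeminiResult(state, workDir, analysisType) {
  // Read the raw CLI output (path assumed; adjust to the actual CLI output location)
  const raw = Read(`${workDir}/diagnosis/gemini-analysis-${analysisType}.json`);
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    // Parse failure: keep the raw output for manual handling, per Error Handling below
    Write(`${workDir}/diagnosis/gemini-analysis-${analysisType}.raw.txt`, raw);
    return {
      stateUpdates: {
        gemini_analysis: { ...state.gemini_analysis, status: 'failed', error: e.message }
      }
    };
  }
  // Map root causes into the shared Issue shape used by the diagnosis actions
  const newIssues = (parsed.root_causes || []).map((rc, i) => ({
    id: `GEM-${i + 1}`,
    type: 'agent_failure', // assumption: refine per analysis type
    severity: rc.severity || 'medium',
    location: { file: (rc.evidence && rc.evidence[0]) || 'unknown' },
    description: rc.description,
    evidence: rc.evidence || [],
    root_cause: rc.description,
    impact: 'See Gemini analysis report',
    suggested_fix: (parsed.recommendations && parsed.recommendations[0]?.action) || 'See recommendations'
  }));
  return {
    stateUpdates: {
      gemini_analysis: {
        ...state.gemini_analysis,
        status: 'completed',
        completed_at: new Date().toISOString(),
        result: parsed
      },
      issues: [...state.issues, ...newIssues]
    }
  };
}
```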
## Error Handling
| Error | Recovery |
|-------|----------|
| CLI timeout | Retry once; if it still fails, skip Gemini analysis |
| Parse failure | Save raw output for manual handling |
| No result | Mark as skipped and continue the workflow |
## User Interaction
If `state.analysis_type === null` and the type cannot be inferred automatically, ask the user:
```javascript
AskUserQuestion({
  questions: [{
    question: 'Select the Gemini analysis type',
    header: 'Analysis Type',
    options: [
      { label: 'Root cause analysis', description: 'Deep analysis of the reported issue' },
      { label: 'Architecture review', description: 'Evaluate the overall architecture design' },
      { label: 'Prompt optimization', description: 'Analyze and optimize phase prompts' },
      { label: 'Performance analysis', description: 'Analyze token consumption and execution efficiency' }
],
multiSelect: false
}]
});
```

View File

@@ -0,0 +1,228 @@
# Action: Generate Consolidated Report
Generate a comprehensive tuning report merging all diagnosis results with prioritized recommendations.
## Purpose
- Merge all diagnosis results into unified report
- Prioritize issues by severity and impact
- Generate actionable recommendations
- Create human-readable markdown report
## Preconditions
- [ ] state.status === 'running'
- [ ] All diagnoses in focus_areas are completed
- [ ] `state.issues.length > 0` (if empty, still generate a no-issue summary report)
## Execution
```javascript
async function execute(state, workDir) {
console.log('Generating consolidated tuning report...');
const targetSkill = state.target_skill;
const issues = state.issues;
// 1. Group issues by type
const issuesByType = {
context_explosion: issues.filter(i => i.type === 'context_explosion'),
memory_loss: issues.filter(i => i.type === 'memory_loss'),
dataflow_break: issues.filter(i => i.type === 'dataflow_break'),
agent_failure: issues.filter(i => i.type === 'agent_failure')
};
// 2. Group issues by severity
const issuesBySeverity = {
critical: issues.filter(i => i.severity === 'critical'),
high: issues.filter(i => i.severity === 'high'),
medium: issues.filter(i => i.severity === 'medium'),
low: issues.filter(i => i.severity === 'low')
};
// 3. Calculate overall health score
const weights = { critical: 25, high: 15, medium: 5, low: 1 };
const deductions = Object.entries(issuesBySeverity)
.reduce((sum, [sev, arr]) => sum + arr.length * weights[sev], 0);
const healthScore = Math.max(0, 100 - deductions);
// 4. Generate report content
const report = `# Skill Tuning Report
**Target Skill**: ${targetSkill.name}
**Path**: ${targetSkill.path}
**Execution Mode**: ${targetSkill.execution_mode}
**Generated**: ${new Date().toISOString()}
---
## Executive Summary
| Metric | Value |
|--------|-------|
| Health Score | ${healthScore}/100 |
| Total Issues | ${issues.length} |
| Critical | ${issuesBySeverity.critical.length} |
| High | ${issuesBySeverity.high.length} |
| Medium | ${issuesBySeverity.medium.length} |
| Low | ${issuesBySeverity.low.length} |
### User Reported Issue
> ${state.user_issue_description}
### Overall Assessment
${healthScore >= 80 ? '✅ Skill is in good health with minor issues.' :
healthScore >= 60 ? '⚠️ Skill has significant issues requiring attention.' :
healthScore >= 40 ? '🔶 Skill has serious issues affecting reliability.' :
'❌ Skill has critical issues requiring immediate fixes.'}
---
## Diagnosis Results
### Context Explosion Analysis
${state.diagnosis.context ?
`- **Status**: ${state.diagnosis.context.status}
- **Severity**: ${state.diagnosis.context.severity}
- **Issues Found**: ${state.diagnosis.context.issues_found}
- **Key Findings**: ${state.diagnosis.context.details.recommendations.join('; ') || 'None'}` :
'_Not analyzed_'}
### Long-tail Memory Analysis
${state.diagnosis.memory ?
`- **Status**: ${state.diagnosis.memory.status}
- **Severity**: ${state.diagnosis.memory.severity}
- **Issues Found**: ${state.diagnosis.memory.issues_found}
- **Key Findings**: ${state.diagnosis.memory.details.recommendations.join('; ') || 'None'}` :
'_Not analyzed_'}
### Data Flow Analysis
${state.diagnosis.dataflow ?
`- **Status**: ${state.diagnosis.dataflow.status}
- **Severity**: ${state.diagnosis.dataflow.severity}
- **Issues Found**: ${state.diagnosis.dataflow.issues_found}
- **Key Findings**: ${state.diagnosis.dataflow.details.recommendations.join('; ') || 'None'}` :
'_Not analyzed_'}
### Agent Coordination Analysis
${state.diagnosis.agent ?
`- **Status**: ${state.diagnosis.agent.status}
- **Severity**: ${state.diagnosis.agent.severity}
- **Issues Found**: ${state.diagnosis.agent.issues_found}
- **Key Findings**: ${state.diagnosis.agent.details.recommendations.join('; ') || 'None'}` :
'_Not analyzed_'}
---
## Critical & High Priority Issues
${issuesBySeverity.critical.length + issuesBySeverity.high.length === 0 ?
'_No critical or high priority issues found._' :
[...issuesBySeverity.critical, ...issuesBySeverity.high].map((issue, i) => `
### ${i + 1}. [${issue.severity.toUpperCase()}] ${issue.description}
- **ID**: ${issue.id}
- **Type**: ${issue.type}
- **Location**: ${typeof issue.location === 'object' ? issue.location.file : issue.location}
- **Root Cause**: ${issue.root_cause}
- **Impact**: ${issue.impact}
- **Suggested Fix**: ${issue.suggested_fix}
**Evidence**:
${issue.evidence.map(e => `- \`${e}\``).join('\n')}
`).join('\n')}
---
## Medium & Low Priority Issues
${issuesBySeverity.medium.length + issuesBySeverity.low.length === 0 ?
'_No medium or low priority issues found._' :
[...issuesBySeverity.medium, ...issuesBySeverity.low].map((issue, i) => `
### ${i + 1}. [${issue.severity.toUpperCase()}] ${issue.description}
- **ID**: ${issue.id}
- **Type**: ${issue.type}
- **Suggested Fix**: ${issue.suggested_fix}
`).join('\n')}
---
## Recommended Fix Order
Based on severity and dependencies, apply fixes in this order:
${[...issuesBySeverity.critical, ...issuesBySeverity.high, ...issuesBySeverity.medium]
.slice(0, 10)
.map((issue, i) => `${i + 1}. **${issue.id}**: ${issue.suggested_fix}`)
.join('\n')}
---
## Quality Gates
| Gate | Threshold | Current | Status |
|------|-----------|---------|--------|
| Critical Issues | 0 | ${issuesBySeverity.critical.length} | ${issuesBySeverity.critical.length === 0 ? '✅ PASS' : '❌ FAIL'} |
| High Issues | ≤ 2 | ${issuesBySeverity.high.length} | ${issuesBySeverity.high.length <= 2 ? '✅ PASS' : '❌ FAIL'} |
| Health Score | ≥ 60 | ${healthScore} | ${healthScore >= 60 ? '✅ PASS' : '❌ FAIL'} |
**Overall Quality Gate**: ${
issuesBySeverity.critical.length === 0 &&
issuesBySeverity.high.length <= 2 &&
healthScore >= 60 ? '✅ PASS' : '❌ FAIL'}
---
*Report generated by skill-tuning*
`;
// 5. Write report
Write(`${workDir}/tuning-report.md`, report);
// 6. Calculate quality gate
const qualityGate = issuesBySeverity.critical.length === 0 &&
issuesBySeverity.high.length <= 2 &&
healthScore >= 60 ? 'pass' :
healthScore >= 40 ? 'review' : 'fail';
return {
stateUpdates: {
quality_score: healthScore,
quality_gate: qualityGate,
issues_by_severity: {
critical: issuesBySeverity.critical.length,
high: issuesBySeverity.high.length,
medium: issuesBySeverity.medium.length,
low: issuesBySeverity.low.length
}
},
outputFiles: [`${workDir}/tuning-report.md`],
summary: `Report generated: ${issues.length} issues, health score ${healthScore}/100, gate: ${qualityGate}`
};
}
```
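For instance, under the weights above, a skill with 1 critical, 2 high, and 3 medium issues scores 100 − (1×25 + 2×15 + 3×5) = 30 and fails the quality gate; the same skill with only the 3 medium issues scores 85 and passes.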
## State Updates
```javascript
return {
stateUpdates: {
quality_score: <0-100>,
quality_gate: '<pass|review|fail>',
issues_by_severity: { critical: N, high: N, medium: N, low: N }
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Write error | Retry to alternative path |
| Empty issues | Generate summary with no issues |
## Next Actions
- If issues.length > 0: action-propose-fixes
- If issues.length === 0: action-complete

View File

@@ -0,0 +1,149 @@
# Action: Initialize Tuning Session
Initialize the skill-tuning session by collecting target skill information, creating work directories, and setting up initial state.
## Purpose
- Identify target skill to tune
- Collect user's problem description
- Create work directory structure
- Backup original skill files
- Initialize state for orchestrator
## Preconditions
- [ ] state.status === 'pending'
## Execution
```javascript
async function execute(state, workDir) {
// 1. Ask user for target skill
const skillInput = await AskUserQuestion({
questions: [{
question: "Which skill do you want to tune?",
header: "Target Skill",
multiSelect: false,
options: [
{ label: "Specify path", description: "Enter skill directory path" }
]
}]
});
const skillPath = skillInput["Target Skill"];
// 2. Validate skill exists and read structure
const skillMdPath = `${skillPath}/SKILL.md`;
if (!Glob(`${skillPath}/SKILL.md`).length) {
throw new Error(`Invalid skill path: ${skillPath} - SKILL.md not found`);
}
// 3. Read skill metadata
const skillMd = Read(skillMdPath);
const frontMatterMatch = skillMd.match(/^---\n([\s\S]*?)\n---/);
const skillName = frontMatterMatch
? frontMatterMatch[1].match(/name:\s*(.+)/)?.[1]?.trim()
: skillPath.split('/').pop();
// 4. Detect execution mode
const hasOrchestrator = Glob(`${skillPath}/phases/orchestrator.md`).length > 0;
const executionMode = hasOrchestrator ? 'autonomous' : 'sequential';
// 5. Scan skill structure
const phases = Glob(`${skillPath}/phases/**/*.md`).map(f => f.replace(skillPath + '/', ''));
const specs = Glob(`${skillPath}/specs/**/*.md`).map(f => f.replace(skillPath + '/', ''));
// 6. Ask for problem description
const issueInput = await AskUserQuestion({
questions: [{
question: "Describe the issue or what you want to optimize:",
header: "Issue",
multiSelect: false,
options: [
{ label: "Context grows too large", description: "Token explosion over multiple turns" },
{ label: "Instructions forgotten", description: "Early constraints lost in long execution" },
{ label: "Data inconsistency", description: "State format changes between phases" },
{ label: "Agent failures", description: "Sub-agent calls fail or return unexpected results" }
]
}]
});
// 7. Ask for focus areas
const focusInput = await AskUserQuestion({
questions: [{
question: "Which areas should be diagnosed? (Select all that apply)",
header: "Focus",
multiSelect: true,
options: [
{ label: "context", description: "Context explosion analysis" },
{ label: "memory", description: "Long-tail forgetting analysis" },
{ label: "dataflow", description: "Data flow analysis" },
{ label: "agent", description: "Agent coordination analysis" }
]
}]
});
const focusAreas = focusInput["Focus"] || ['context', 'memory', 'dataflow', 'agent'];
// 8. Create backup
const backupDir = `${workDir}/backups/${skillName}-backup`;
Bash(`mkdir -p "${backupDir}"`);
Bash(`cp -r "${skillPath}"/* "${backupDir}/"`);
// 9. Return state updates
return {
stateUpdates: {
status: 'running',
started_at: new Date().toISOString(),
target_skill: {
name: skillName,
path: skillPath,
execution_mode: executionMode,
phases: phases,
specs: specs
},
user_issue_description: issueInput["Issue"],
focus_areas: Array.isArray(focusAreas) ? focusAreas : [focusAreas],
work_dir: workDir,
backup_dir: backupDir
},
outputFiles: [],
summary: `Initialized tuning for "${skillName}" (${executionMode} mode), focus: ${focusAreas.join(', ')}`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
status: 'running',
started_at: '<timestamp>',
target_skill: {
name: '<skill-name>',
path: '<skill-path>',
execution_mode: '<sequential|autonomous>',
phases: ['...'],
specs: ['...']
},
user_issue_description: '<user description>',
focus_areas: ['context', 'memory', ...],
work_dir: '<work-dir>',
backup_dir: '<backup-dir>'
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Skill path not found | Ask user to re-enter valid path |
| SKILL.md missing | Suggest path correction |
| Backup creation failed | Retry with alternative location |
## Next Actions
- Success: Continue to first diagnosis action based on focus_areas
- Failure: action-abort

View File

@@ -0,0 +1,317 @@
# Action: Propose Fixes
Generate fix proposals for identified issues with implementation strategies.
## Purpose
- Create fix strategies for each issue
- Generate implementation plans
- Estimate risk levels
- Allow user to select fixes to apply
## Preconditions
- [ ] state.status === 'running'
- [ ] state.issues.length > 0
- [ ] action-generate-report completed
## Fix Strategy Catalog
### Context Explosion Fixes
| Strategy | Description | Risk |
|----------|-------------|------|
| `context_summarization` | Add summarizer agent between phases | low |
| `sliding_window` | Keep only last N turns in context | low |
| `structured_state` | Replace text context with JSON state | medium |
| `path_reference` | Pass file paths instead of content | low |
### Memory Loss Fixes
| Strategy | Description | Risk |
|----------|-------------|------|
| `constraint_injection` | Add constraints to each phase prompt | low |
| `checkpoint_restore` | Save state at milestones | low |
| `goal_embedding` | Track goal similarity throughout | medium |
| `state_constraints_field` | Add constraints field to state schema | low |
### Data Flow Fixes
| Strategy | Description | Risk |
|----------|-------------|------|
| `state_centralization` | Single state.json for all data | medium |
| `schema_enforcement` | Add Zod validation | low |
| `field_normalization` | Normalize field names | low |
| `transactional_updates` | Atomic state updates | medium |
### Agent Coordination Fixes
| Strategy | Description | Risk |
|----------|-------------|------|
| `error_wrapping` | Add try-catch to all Task calls | low |
| `result_validation` | Validate agent returns | low |
| `orchestrator_refactor` | Centralize agent coordination | high |
| `flatten_nesting` | Remove nested agent calls | medium |
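As a concrete illustration of one low-risk strategy above, here is a minimal sliding-window sketch. It assumes conversation turns accumulate in a `state.history` array (as in the diffs generated below); the window size and the `summary`/`action` fields on each turn are illustrative assumptions:
```javascript
// Keep only the most recent turns; older turns are compressed into a single
// summary line rather than silently dropped, so early context is not lost entirely.
const MAX_HISTORY = 5; // window size is an assumption; tune per skill

function applySlidingWindow(state) {
  if (state.history.length <= MAX_HISTORY) return state;
  const dropped = state.history.slice(0, -MAX_HISTORY);
  const summaryLine = `[summary] ${dropped.length} earlier turns: ` +
    dropped.map(t => t.summary || t.action || 'turn').join('; ').slice(0, 500);
  return {
    ...state,
    history: [{ role: 'system', summary: summaryLine }, ...state.history.slice(-MAX_HISTORY)]
  };
}
```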
## Execution
```javascript
async function execute(state, workDir) {
console.log('Generating fix proposals...');
const issues = state.issues;
const fixes = [];
// Group issues by type for batch fixes
const issuesByType = {
context_explosion: issues.filter(i => i.type === 'context_explosion'),
memory_loss: issues.filter(i => i.type === 'memory_loss'),
dataflow_break: issues.filter(i => i.type === 'dataflow_break'),
agent_failure: issues.filter(i => i.type === 'agent_failure')
};
// Generate fixes for context explosion
if (issuesByType.context_explosion.length > 0) {
const ctxIssues = issuesByType.context_explosion;
if (ctxIssues.some(i => i.description.includes('history accumulation'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: ctxIssues.filter(i => i.description.includes('history')).map(i => i.id),
strategy: 'sliding_window',
description: 'Implement sliding window for conversation history',
rationale: 'Prevents unbounded context growth by keeping only recent turns',
changes: [{
file: 'phases/orchestrator.md',
action: 'modify',
diff: `+ const MAX_HISTORY = 5;
+ state.history = state.history.slice(-MAX_HISTORY);`
}],
risk: 'low',
estimated_impact: 'Reduces token usage by ~50%',
verification_steps: ['Run skill with 10+ iterations', 'Verify context size stable']
});
}
if (ctxIssues.some(i => i.description.includes('full content'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: ctxIssues.filter(i => i.description.includes('content')).map(i => i.id),
strategy: 'path_reference',
description: 'Pass file paths instead of full content',
rationale: 'Agents can read files when needed, reducing prompt size',
changes: [{
file: 'phases/*.md',
action: 'modify',
diff: `- prompt: \${content}
+ prompt: Read file at: \${filePath}`
}],
risk: 'low',
estimated_impact: 'Significant token reduction',
verification_steps: ['Verify agents can still access needed content']
});
}
}
// Generate fixes for memory loss
if (issuesByType.memory_loss.length > 0) {
const memIssues = issuesByType.memory_loss;
if (memIssues.some(i => i.description.includes('constraint'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: memIssues.filter(i => i.description.includes('constraint')).map(i => i.id),
strategy: 'constraint_injection',
description: 'Add constraint injection to all phases',
rationale: 'Ensures original requirements are visible in every phase',
changes: [{
file: 'phases/*.md',
action: 'modify',
diff: `+ [CONSTRAINTS]
+ Original requirements from state.original_requirements:
+ \${JSON.stringify(state.original_requirements)}`
}],
risk: 'low',
estimated_impact: 'Improves constraint adherence',
verification_steps: ['Run skill with specific constraints', 'Verify output matches']
});
}
if (memIssues.some(i => i.description.includes('State schema'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: memIssues.filter(i => i.description.includes('schema')).map(i => i.id),
strategy: 'state_constraints_field',
description: 'Add original_requirements field to state schema',
rationale: 'Preserves original intent throughout execution',
changes: [{
file: 'phases/state-schema.md',
action: 'modify',
diff: `+ original_requirements: string[]; // User's original constraints
+ goal_summary: string; // One-line goal statement`
}],
risk: 'low',
estimated_impact: 'Enables constraint tracking',
verification_steps: ['Verify state includes requirements after init']
});
}
}
// Generate fixes for data flow
if (issuesByType.dataflow_break.length > 0) {
const dfIssues = issuesByType.dataflow_break;
if (dfIssues.some(i => i.description.includes('multiple locations'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: dfIssues.filter(i => i.description.includes('location')).map(i => i.id),
strategy: 'state_centralization',
description: 'Centralize all state to single state.json',
rationale: 'Single source of truth prevents inconsistencies',
changes: [{
file: 'phases/*.md',
action: 'modify',
diff: `- Write(\`\${workDir}/config.json\`, ...)
+ updateState({ config: ... }) // Use state manager`
}],
risk: 'medium',
estimated_impact: 'Eliminates state fragmentation',
verification_steps: ['Verify all reads come from state.json', 'Test state persistence']
});
}
if (dfIssues.some(i => i.description.includes('validation'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: dfIssues.filter(i => i.description.includes('validation')).map(i => i.id),
strategy: 'schema_enforcement',
description: 'Add Zod schema validation',
rationale: 'Runtime validation catches schema violations',
changes: [{
file: 'phases/state-schema.md',
action: 'modify',
diff: `+ import { z } from 'zod';
+ const StateSchema = z.object({...});
+ function validateState(s) { return StateSchema.parse(s); }`
}],
risk: 'low',
estimated_impact: 'Catches invalid state early',
verification_steps: ['Test with invalid state input', 'Verify error thrown']
});
}
}
// Generate fixes for agent coordination
if (issuesByType.agent_failure.length > 0) {
const agentIssues = issuesByType.agent_failure;
if (agentIssues.some(i => i.description.includes('error handling'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: agentIssues.filter(i => i.description.includes('error')).map(i => i.id),
strategy: 'error_wrapping',
description: 'Wrap all Task calls in try-catch',
rationale: 'Prevents cascading failures from agent errors',
changes: [{
file: 'phases/*.md',
action: 'modify',
diff: `+ try {
const result = await Task({...});
+ if (!result) throw new Error('Empty result');
+ } catch (e) {
+ updateState({ errors: [...errors, e.message], error_count: error_count + 1 });
+ }`
}],
risk: 'low',
estimated_impact: 'Improves error resilience',
verification_steps: ['Simulate agent failure', 'Verify graceful handling']
});
}
if (agentIssues.some(i => i.description.includes('nested'))) {
fixes.push({
id: `FIX-${fixes.length + 1}`,
issue_ids: agentIssues.filter(i => i.description.includes('nested')).map(i => i.id),
strategy: 'flatten_nesting',
description: 'Flatten nested agent calls',
rationale: 'Reduces complexity and context explosion',
changes: [{
file: 'phases/orchestrator.md',
action: 'modify',
diff: `// Instead of agent calling agent:
// Agent A returns {needs_agent_b: true}
// Orchestrator sees this and calls Agent B next`
}],
risk: 'medium',
estimated_impact: 'Reduces nesting depth',
verification_steps: ['Verify no nested Task calls', 'Test agent chaining via orchestrator']
});
}
}
// Write fix proposals
Write(`${workDir}/fixes/fix-proposals.json`, JSON.stringify(fixes, null, 2));
// Ask user to select fixes to apply
const fixOptions = fixes.slice(0, 4).map(f => ({
label: f.id,
description: `[${f.risk.toUpperCase()} risk] ${f.description}`
}));
if (fixOptions.length > 0) {
const selection = await AskUserQuestion({
questions: [{
question: 'Which fixes would you like to apply?',
header: 'Fixes',
multiSelect: true,
options: fixOptions
}]
});
const selectedFixIds = Array.isArray(selection['Fixes'])
? selection['Fixes']
: [selection['Fixes']];
return {
stateUpdates: {
proposed_fixes: fixes,
pending_fixes: selectedFixIds.filter(id => id && fixes.some(f => f.id === id))
},
outputFiles: [`${workDir}/fixes/fix-proposals.json`],
summary: `Generated ${fixes.length} fix proposals, ${selectedFixIds.length} selected for application`
};
}
return {
stateUpdates: {
proposed_fixes: fixes,
pending_fixes: []
},
outputFiles: [`${workDir}/fixes/fix-proposals.json`],
summary: `Generated ${fixes.length} fix proposals (none selected)`
};
}
```
## State Updates
```javascript
return {
stateUpdates: {
proposed_fixes: [...fixes],
pending_fixes: [...selectedFixIds]
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| No issues to fix | Skip to action-complete |
| User cancels selection | Set pending_fixes to empty |
## Next Actions
- If pending_fixes.length > 0: action-apply-fix
- If pending_fixes.length === 0: action-complete

View File

@@ -0,0 +1,222 @@
# Action: Verify Applied Fixes
Verify that applied fixes resolved the targeted issues.
## Purpose
- Re-run relevant diagnostics
- Compare before/after issue counts
- Update verification status
- Determine if more iterations needed
## Preconditions
- [ ] state.status === 'running'
- [ ] state.applied_fixes.length > 0
- [ ] Some applied_fixes have verification_result === 'pending'
## Execution
```javascript
async function execute(state, workDir) {
console.log('Verifying applied fixes...');
const appliedFixes = state.applied_fixes.filter(f => f.verification_result === 'pending');
if (appliedFixes.length === 0) {
return {
stateUpdates: {},
outputFiles: [],
summary: 'No fixes pending verification'
};
}
const verificationResults = [];
for (const fix of appliedFixes) {
const proposedFix = state.proposed_fixes.find(f => f.id === fix.fix_id);
if (!proposedFix) {
verificationResults.push({
fix_id: fix.fix_id,
result: 'fail',
reason: 'Fix definition not found'
});
continue;
}
// Determine which diagnosis to re-run based on fix strategy
const strategyToDiagnosis = {
'context_summarization': 'context',
'sliding_window': 'context',
'structured_state': 'context',
'path_reference': 'context',
'constraint_injection': 'memory',
'checkpoint_restore': 'memory',
'goal_embedding': 'memory',
'state_constraints_field': 'memory',
'state_centralization': 'dataflow',
'schema_enforcement': 'dataflow',
'field_normalization': 'dataflow',
'transactional_updates': 'dataflow',
'error_wrapping': 'agent',
'result_validation': 'agent',
'orchestrator_refactor': 'agent',
'flatten_nesting': 'agent'
};
const diagnosisType = strategyToDiagnosis[proposedFix.strategy];
    // For now, do a lightweight verification.
    // A full implementation would re-run the specific diagnosis (see the sketch below).
// Check if the fix was actually applied (look for markers)
const targetPath = state.target_skill.path;
const fixMarker = `Applied fix ${fix.fix_id}`;
let fixFound = false;
const allFiles = Glob(`${targetPath}/**/*.md`);
for (const file of allFiles) {
const content = Read(file);
if (content.includes(fixMarker)) {
fixFound = true;
break;
}
}
if (fixFound) {
// Verify by checking if original issues still exist
const relatedIssues = proposedFix.issue_ids;
const originalIssueCount = relatedIssues.length;
// Simplified verification: assume fix worked if marker present
// Real implementation would re-run diagnosis patterns
verificationResults.push({
fix_id: fix.fix_id,
result: 'pass',
reason: `Fix applied successfully, addressing ${originalIssueCount} issues`,
issues_resolved: relatedIssues
});
} else {
verificationResults.push({
fix_id: fix.fix_id,
result: 'fail',
reason: 'Fix marker not found in target files'
});
}
}
// Update applied fixes with verification results
const updatedAppliedFixes = state.applied_fixes.map(fix => {
const result = verificationResults.find(v => v.fix_id === fix.fix_id);
if (result) {
return {
...fix,
verification_result: result.result
};
}
return fix;
});
// Calculate new quality score
const passedFixes = verificationResults.filter(v => v.result === 'pass').length;
const totalFixes = verificationResults.length;
const verificationRate = totalFixes > 0 ? (passedFixes / totalFixes) * 100 : 100;
// Recalculate issues (remove resolved ones)
const resolvedIssueIds = verificationResults
.filter(v => v.result === 'pass')
.flatMap(v => v.issues_resolved || []);
const remainingIssues = state.issues.filter(i => !resolvedIssueIds.includes(i.id));
// Recalculate quality score
const weights = { critical: 25, high: 15, medium: 5, low: 1 };
const deductions = remainingIssues.reduce((sum, issue) =>
sum + (weights[issue.severity] || 0), 0);
const newHealthScore = Math.max(0, 100 - deductions);
// Determine new quality gate
const remainingCritical = remainingIssues.filter(i => i.severity === 'critical').length;
const remainingHigh = remainingIssues.filter(i => i.severity === 'high').length;
const newQualityGate = remainingCritical === 0 && remainingHigh <= 2 && newHealthScore >= 60
? 'pass'
: newHealthScore >= 40 ? 'review' : 'fail';
// Increment iteration count
const newIterationCount = state.iteration_count + 1;
// Ask user if they want to continue
let continueIteration = false;
if (newQualityGate !== 'pass' && newIterationCount < state.max_iterations) {
const continueResponse = await AskUserQuestion({
questions: [{
question: `Verification complete. Quality gate: ${newQualityGate}. Continue with another iteration?`,
header: 'Continue',
multiSelect: false,
options: [
{ label: 'Yes', description: `Run iteration ${newIterationCount + 1}` },
{ label: 'No', description: 'Finish with current state' }
]
}]
});
continueIteration = continueResponse['Continue'] === 'Yes';
}
// If continuing, reset diagnosis for re-evaluation
const diagnosisReset = continueIteration ? {
'diagnosis.context': null,
'diagnosis.memory': null,
'diagnosis.dataflow': null,
'diagnosis.agent': null
} : {};
return {
stateUpdates: {
applied_fixes: updatedAppliedFixes,
issues: remainingIssues,
quality_score: newHealthScore,
quality_gate: newQualityGate,
iteration_count: newIterationCount,
...diagnosisReset,
issues_by_severity: {
critical: remainingIssues.filter(i => i.severity === 'critical').length,
high: remainingIssues.filter(i => i.severity === 'high').length,
medium: remainingIssues.filter(i => i.severity === 'medium').length,
low: remainingIssues.filter(i => i.severity === 'low').length
}
},
outputFiles: [],
summary: `Verified ${totalFixes} fixes: ${passedFixes} passed. Score: ${newHealthScore}, Gate: ${newQualityGate}, Iteration: ${newIterationCount}`
};
}
```
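The marker check above is intentionally lightweight. A fuller verification would re-run the diagnosis mapped by `strategyToDiagnosis` and compare issue counts before and after. A sketch of that idea, assuming diagnosis actions can be dispatched via sub-agent prompts as the orchestrator does (the prompt wiring here is illustrative):
```javascript
async function verifyByRediagnosis(state, workDir, fix, diagnosisType) {
  // Issue count for this category before the fix was applied
  const before = state.diagnosis[diagnosisType]?.issues_found ?? 0;
  // Re-run only the diagnosis relevant to this fix's strategy
  const rerun = await Task({
    subagent_type: 'universal-executor',
    prompt: Read(`phases/actions/action-diagnose-${diagnosisType}.md`) +
            `\n[STATE]\n${JSON.stringify(state, null, 2)}`
  });
  const after = JSON.parse(rerun).stateUpdates[`diagnosis.${diagnosisType}`].issues_found;
  // Pass if the fix reduced the issue count for its category
  return {
    fix_id: fix.fix_id,
    result: after < before ? 'pass' : 'fail',
    reason: `Issues in '${diagnosisType}' went from ${before} to ${after}`
  };
}
```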
## State Updates
```javascript
return {
stateUpdates: {
applied_fixes: [...updatedWithVerificationResults],
issues: [...remainingIssues],
quality_score: newScore,
quality_gate: newGate,
iteration_count: iteration + 1
}
};
```
## Error Handling
| Error Type | Recovery |
|------------|----------|
| Re-diagnosis fails | Mark as 'inconclusive' |
| File access error | Skip file verification |
## Next Actions
- If quality_gate === 'pass': action-complete
- If user chose to continue: restart diagnosis cycle
- If max_iterations reached: action-complete

View File

@@ -0,0 +1,335 @@
# Orchestrator
Autonomous orchestrator for skill-tuning workflow. Reads current state and selects the next action based on diagnosis progress and quality gates.
## Role
Drive the tuning workflow by:
1. Reading current session state
2. Selecting the appropriate next action
3. Executing the action via sub-agent
4. Updating state with results
5. Repeating until termination conditions met
## State Management
### Read State
```javascript
const state = JSON.parse(Read(`${workDir}/state.json`));
```
### Update State
```javascript
function updateState(updates) {
const state = JSON.parse(Read(`${workDir}/state.json`));
const newState = {
...state,
...updates,
updated_at: new Date().toISOString()
};
Write(`${workDir}/state.json`, JSON.stringify(newState, null, 2));
return newState;
}
```
## Decision Logic
```javascript
function selectNextAction(state) {
// === Termination Checks ===
// User exit
if (state.status === 'user_exit') return null;
// Completed
if (state.status === 'completed') return null;
// Error limit exceeded
if (state.error_count >= state.max_errors) {
return 'action-abort';
}
// Max iterations exceeded
if (state.iteration_count >= state.max_iterations) {
return 'action-complete';
}
// === Action Selection ===
// 1. Not initialized yet
if (state.status === 'pending') {
return 'action-init';
}
// 2. Check if Gemini analysis is requested or needed
if (shouldTriggerGeminiAnalysis(state)) {
return 'action-gemini-analysis';
}
// 3. Check if Gemini analysis is running
if (state.gemini_analysis?.status === 'running') {
// Wait for Gemini analysis to complete
return null; // Orchestrator will be re-triggered when CLI completes
}
// 4. Run diagnosis in order (only if not completed)
const diagnosisOrder = ['context', 'memory', 'dataflow', 'agent'];
for (const diagType of diagnosisOrder) {
if (state.diagnosis[diagType] === null) {
// Check if user wants to skip this diagnosis
if (!state.focus_areas.length || state.focus_areas.includes(diagType)) {
return `action-diagnose-${diagType}`;
}
}
}
// 5. All diagnosis complete, generate report if not done
const allDiagnosisComplete = diagnosisOrder.every(
d => state.diagnosis[d] !== null || !state.focus_areas.includes(d)
);
if (allDiagnosisComplete && !state.completed_actions.includes('action-generate-report')) {
return 'action-generate-report';
}
// 6. Report generated, propose fixes if not done
if (state.completed_actions.includes('action-generate-report') &&
state.proposed_fixes.length === 0 &&
state.issues.length > 0) {
return 'action-propose-fixes';
}
// 7. Fixes proposed, check if user wants to apply
if (state.proposed_fixes.length > 0 && state.pending_fixes.length > 0) {
return 'action-apply-fix';
}
// 8. Fixes applied, verify
if (state.applied_fixes.length > 0 &&
state.applied_fixes.some(f => f.verification_result === 'pending')) {
return 'action-verify';
}
// 9. Quality gate check
if (state.quality_gate === 'pass') {
return 'action-complete';
}
// 10. More iterations needed
if (state.iteration_count < state.max_iterations &&
state.quality_gate !== 'pass' &&
state.issues.some(i => i.severity === 'critical' || i.severity === 'high')) {
// Reset diagnosis for re-evaluation
return 'action-diagnose-context'; // Start new iteration
}
// 11. Default: complete
return 'action-complete';
}
/**
 * Decide whether a Gemini CLI analysis should be triggered.
 */
function shouldTriggerGeminiAnalysis(state) {
  // Gemini analysis already completed; do not trigger again
if (state.gemini_analysis?.status === 'completed') {
return false;
}
  // Explicit user request
if (state.gemini_analysis_requested === true) {
return true;
}
  // Critical issues found and deep analysis not yet performed
if (state.issues.some(i => i.severity === 'critical') &&
!state.completed_actions.includes('action-gemini-analysis')) {
return true;
}
  // User specified focus_areas that require Gemini analysis
const geminiAreas = ['architecture', 'prompt', 'performance', 'custom'];
if (state.focus_areas.some(area => geminiAreas.includes(area))) {
return true;
}
  // Standard diagnoses complete but issues remain unresolved; deep analysis needed
const diagnosisComplete = ['context', 'memory', 'dataflow', 'agent'].every(
d => state.diagnosis[d] !== null
);
if (diagnosisComplete &&
state.issues.length > 0 &&
state.iteration_count > 0 &&
!state.completed_actions.includes('action-gemini-analysis')) {
    // If issues persist into a second iteration, trigger Gemini analysis
return true;
}
return false;
}
```
## Execution Loop
```javascript
async function runOrchestrator(workDir) {
console.log('=== Skill Tuning Orchestrator Started ===');
let iteration = 0;
const MAX_LOOP_ITERATIONS = 50; // Safety limit
while (iteration < MAX_LOOP_ITERATIONS) {
iteration++;
// 1. Read current state
const state = JSON.parse(Read(`${workDir}/state.json`));
console.log(`[Loop ${iteration}] Status: ${state.status}, Action: ${state.current_action}`);
// 2. Select next action
const actionId = selectNextAction(state);
if (!actionId) {
console.log('No action selected, terminating orchestrator.');
break;
}
console.log(`[Loop ${iteration}] Executing: ${actionId}`);
    // 3. Update state: record the current action. Keep the returned state so the
    // history entry appended here is visible when it is marked complete below.
    const startedState = updateState({
      current_action: actionId,
      action_history: [...state.action_history, {
        action: actionId,
        started_at: new Date().toISOString(),
        completed_at: null,
        result: null,
        output_files: []
      }]
    });
// 4. Execute action
try {
const actionPrompt = Read(`phases/actions/${actionId}.md`);
const stateJson = JSON.stringify(state, null, 2);
const result = await Task({
subagent_type: 'universal-executor',
run_in_background: false,
prompt: `
[CONTEXT]
You are executing action "${actionId}" for skill-tuning workflow.
Work directory: ${workDir}
[STATE]
${stateJson}
[ACTION INSTRUCTIONS]
${actionPrompt}
[OUTPUT REQUIREMENT]
After completing the action:
1. Write any output files to the work directory
2. Return a JSON object with:
- stateUpdates: object with state fields to update
- outputFiles: array of files created
- summary: brief description of what was done
`
});
// 5. Parse result and update state
let actionResult;
try {
actionResult = JSON.parse(result);
} catch (e) {
actionResult = {
stateUpdates: {},
outputFiles: [],
summary: result
};
}
      // 6. Update state: mark this action's history entry complete
      // (use startedState so we update the entry appended in step 3, not a stale copy)
      const updatedHistory = [...startedState.action_history];
      updatedHistory[updatedHistory.length - 1] = {
        ...updatedHistory[updatedHistory.length - 1],
        completed_at: new Date().toISOString(),
        result: 'success',
        output_files: actionResult.outputFiles || []
      };
updateState({
current_action: null,
completed_actions: [...state.completed_actions, actionId],
action_history: updatedHistory,
...actionResult.stateUpdates
});
console.log(`[Loop ${iteration}] Completed: ${actionId}`);
} catch (error) {
console.log(`[Loop ${iteration}] Error in ${actionId}: ${error.message}`);
// Error handling
updateState({
current_action: null,
errors: [...state.errors, {
action: actionId,
message: error.message,
timestamp: new Date().toISOString(),
recoverable: true
}],
error_count: state.error_count + 1
});
}
}
console.log('=== Skill Tuning Orchestrator Finished ===');
}
```
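A session is then driven end to end by pointing the loop at the session's work directory (the path shown is illustrative):
```javascript
// Assumes state.json was seeded from the Initial State Template in state-schema.md
await runOrchestrator('.workspace/skill-tuning/session-001');
```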
## Action Catalog
| Action | Purpose | Preconditions | Effects |
|--------|---------|---------------|---------|
| [action-init](actions/action-init.md) | Initialize tuning session | status === 'pending' | Creates work dirs, backup, sets status='running' |
| [action-diagnose-context](actions/action-diagnose-context.md) | Analyze context explosion | status === 'running' | Sets diagnosis.context |
| [action-diagnose-memory](actions/action-diagnose-memory.md) | Analyze long-tail forgetting | status === 'running' | Sets diagnosis.memory |
| [action-diagnose-dataflow](actions/action-diagnose-dataflow.md) | Analyze data flow issues | status === 'running' | Sets diagnosis.dataflow |
| [action-diagnose-agent](actions/action-diagnose-agent.md) | Analyze agent coordination | status === 'running' | Sets diagnosis.agent |
| [action-gemini-analysis](actions/action-gemini-analysis.md) | Deep analysis via Gemini CLI | User request OR critical issues | Sets gemini_analysis, adds issues |
| [action-generate-report](actions/action-generate-report.md) | Generate consolidated report | All diagnoses complete | Creates tuning-report.md |
| [action-propose-fixes](actions/action-propose-fixes.md) | Generate fix proposals | Report generated, issues > 0 | Sets proposed_fixes |
| [action-apply-fix](actions/action-apply-fix.md) | Apply selected fix | pending_fixes > 0 | Updates applied_fixes |
| [action-verify](actions/action-verify.md) | Verify applied fixes | applied_fixes with pending verification | Updates verification_result |
| [action-complete](actions/action-complete.md) | Finalize session | quality_gate='pass' OR max_iterations | Sets status='completed' |
| [action-abort](actions/action-abort.md) | Abort on errors | error_count >= max_errors | Sets status='failed' |
## Termination Conditions
- `status === 'completed'`: Normal completion
- `status === 'user_exit'`: User requested exit
- `status === 'failed'`: Unrecoverable error
- `error_count >= max_errors`: Too many errors (default: 3)
- `iteration_count >= max_iterations`: Max iterations reached (default: 5)
- `quality_gate === 'pass'`: All quality criteria met
## Error Recovery
| Error Type | Recovery Strategy |
|------------|-------------------|
| Action execution failed | Retry up to 3 times, then skip |
| State parse error | Restore from backup |
| File write error | Retry with alternative path |
| User abort | Save state and exit gracefully |
## User Interaction Points
The orchestrator pauses for user input at these points:
1. **action-init**: Confirm target skill and describe issue
2. **action-propose-fixes**: Select which fixes to apply
3. **action-verify**: Review verification results, decide to continue or stop
4. **action-complete**: Review final summary

View File

@@ -0,0 +1,282 @@
# State Schema
Defines the state structure for skill-tuning orchestrator.
## State Structure
```typescript
interface TuningState {
// === Core Status ===
status: 'pending' | 'running' | 'completed' | 'failed';
started_at: string; // ISO timestamp
updated_at: string; // ISO timestamp
// === Target Skill Info ===
target_skill: {
name: string; // e.g., "software-manual"
path: string; // e.g., ".claude/skills/software-manual"
execution_mode: 'sequential' | 'autonomous';
phases: string[]; // List of phase files
specs: string[]; // List of spec files
};
// === User Input ===
user_issue_description: string; // User's problem description
focus_areas: string[]; // User-specified focus (optional)
// === Diagnosis Results ===
diagnosis: {
context: DiagnosisResult | null;
memory: DiagnosisResult | null;
dataflow: DiagnosisResult | null;
agent: DiagnosisResult | null;
};
// === Issues Found ===
issues: Issue[];
issues_by_severity: {
critical: number;
high: number;
medium: number;
low: number;
};
// === Fix Management ===
proposed_fixes: Fix[];
applied_fixes: AppliedFix[];
pending_fixes: string[]; // Fix IDs pending application
// === Iteration Control ===
iteration_count: number;
max_iterations: number; // Default: 5
// === Quality Metrics ===
quality_score: number; // 0-100
quality_gate: 'pass' | 'review' | 'fail';
// === Orchestrator State ===
completed_actions: string[];
current_action: string | null;
action_history: ActionHistoryEntry[];
// === Error Handling ===
errors: ErrorEntry[];
error_count: number;
max_errors: number; // Default: 3
// === Output Paths ===
work_dir: string;
backup_dir: string;
}
interface DiagnosisResult {
status: 'completed' | 'skipped' | 'failed';
issues_found: number;
severity: 'critical' | 'high' | 'medium' | 'low' | 'none';
execution_time_ms: number;
details: {
patterns_checked: string[];
patterns_matched: string[];
evidence: Evidence[];
recommendations: string[];
};
}
interface Evidence {
file: string;
line?: number;
pattern: string;
context: string;
severity: string;
}
interface Issue {
id: string; // e.g., "ISS-001"
type: 'context_explosion' | 'memory_loss' | 'dataflow_break' | 'agent_failure';
severity: 'critical' | 'high' | 'medium' | 'low';
priority: number; // 1 = highest
location: {
file: string;
line_start?: number;
line_end?: number;
phase?: string;
};
description: string;
evidence: string[];
root_cause: string;
impact: string;
suggested_fix: string;
related_issues: string[]; // Issue IDs
}
interface Fix {
id: string; // e.g., "FIX-001"
issue_ids: string[]; // Issues this fix addresses
strategy: FixStrategy;
description: string;
rationale: string;
changes: FileChange[];
risk: 'low' | 'medium' | 'high';
estimated_impact: string;
verification_steps: string[];
}
type FixStrategy =
  | 'context_summarization'   // Add context compression
  | 'sliding_window'          // Implement sliding context window
  | 'structured_state'        // Convert to structured state passing
  | 'path_reference'          // Pass file paths instead of content
  | 'constraint_injection'    // Add constraint propagation
  | 'checkpoint_restore'      // Add checkpointing mechanism
  | 'goal_embedding'          // Track goal similarity throughout
  | 'state_constraints_field' // Add constraints field to state schema
  | 'state_centralization'    // Centralize state management
  | 'schema_enforcement'      // Add data contract validation
  | 'field_normalization'     // Normalize field names
  | 'transactional_updates'   // Atomic state updates
  | 'error_wrapping'          // Add try-catch to Task calls
  | 'result_validation'       // Validate agent returns
  | 'orchestrator_refactor'   // Refactor agent coordination
  | 'flatten_nesting'         // Remove nested agent calls
  | 'custom';                 // Custom fix
interface FileChange {
file: string;
action: 'create' | 'modify' | 'delete';
old_content?: string;
new_content?: string;
diff?: string;
}
interface AppliedFix {
fix_id: string;
applied_at: string;
success: boolean;
backup_path: string;
verification_result: 'pass' | 'fail' | 'pending';
rollback_available: boolean;
}
interface ActionHistoryEntry {
action: string;
started_at: string;
completed_at: string;
result: 'success' | 'failure' | 'skipped';
output_files: string[];
}
interface ErrorEntry {
action: string;
message: string;
timestamp: string;
recoverable: boolean;
}
```
## Initial State Template
```json
{
"status": "pending",
"started_at": null,
"updated_at": null,
"target_skill": {
"name": null,
"path": null,
"execution_mode": null,
"phases": [],
"specs": []
},
"user_issue_description": "",
"focus_areas": [],
"diagnosis": {
"context": null,
"memory": null,
"dataflow": null,
"agent": null
},
"issues": [],
"issues_by_severity": {
"critical": 0,
"high": 0,
"medium": 0,
"low": 0
},
"proposed_fixes": [],
"applied_fixes": [],
"pending_fixes": [],
"iteration_count": 0,
"max_iterations": 5,
"quality_score": 0,
"quality_gate": "fail",
"completed_actions": [],
"current_action": null,
"action_history": [],
"errors": [],
"error_count": 0,
"max_errors": 3,
"work_dir": null,
"backup_dir": null
}
```
## State Transition Diagram
```
┌─────────────┐
│ pending │
└──────┬──────┘
│ action-init
┌─────────────┐
┌──────────│ running │──────────┐
│ └──────┬──────┘ │
│ │ │
diagnosis │ ┌────────────┼────────────┐ │ error_count >= 3
actions │ │ │ │ │
│ ↓ ↓ ↓ │
│ context memory dataflow │
│ │ │ │ │
│ └────────────┼────────────┘ │
│ │ │
│ ↓ │
│ action-verify │
│ │ │
│ ┌───────────┼───────────┐ │
│ │ │ │ │
│ ↓ ↓ ↓ │
│ quality iterate apply │
│ gate=pass (< max) fix │
│ │ │ │ │
│ │ └───────────┘ │
│ ↓ ↓
│ ┌─────────────┐ ┌─────────────┐
└→│ completed │ │ failed │
└─────────────┘ └─────────────┘
```
## State Update Rules
### Atomicity
All state updates must be atomic - read current state, apply changes, write entire state.
### Immutability
Never mutate state in place. Always create new state object with changes.
### Validation
Before writing state, validate against schema to prevent corruption.
### Timestamps
Always update `updated_at` on every state change.
```javascript
function updateState(workDir, updates) {
const currentState = JSON.parse(Read(`${workDir}/state.json`));
const newState = {
...currentState,
...updates,
updated_at: new Date().toISOString()
};
// Validate before write
if (!validateState(newState)) {
throw new Error('Invalid state update');
}
Write(`${workDir}/state.json`, JSON.stringify(newState, null, 2));
return newState;
}
```
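The `validateState` helper referenced above is not defined in this schema. A minimal sketch that checks a few structural invariants from the `TuningState` interface (no external validation library assumed) could be:
```javascript
function validateState(state) {
  // Structural invariants derived from the TuningState interface above
  const validStatus = ['pending', 'running', 'completed', 'failed'].includes(state.status);
  const validGate = ['pass', 'review', 'fail'].includes(state.quality_gate);
  const validScore = typeof state.quality_score === 'number' &&
    state.quality_score >= 0 && state.quality_score <= 100;
  const hasCollections = Array.isArray(state.issues) &&
    Array.isArray(state.proposed_fixes) && Array.isArray(state.action_history);
  return validStatus && validGate && validScore && hasCollections;
}
```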

View File

@@ -0,0 +1,210 @@
# Problem Taxonomy
Classification of skill execution issues with detection patterns and severity criteria.
## When to Use
| Phase | Usage | Section |
|-------|-------|---------|
| All Diagnosis Actions | Issue classification | All sections |
| action-propose-fixes | Strategy selection | Fix Mapping |
| action-generate-report | Severity assessment | Severity Criteria |
---
## Problem Categories
### 1. Context Explosion (P2)
**Definition**: Excessive token accumulation causing prompt size to grow unbounded.
**Root Causes**:
- Unbounded conversation history
- Full content passing instead of references
- Missing summarization mechanisms
- Agent returning full output instead of path+summary
**Detection Patterns**:
| Pattern ID | Regex/Check | Description |
|------------|-------------|-------------|
| CTX-001 | `/history\s*[.=].*(?:push\|concat)/` | History array growth |
| CTX-002 | `/JSON\.stringify\s*\(\s*state\s*\)/` | Full state serialization |
| CTX-003 | `/Read\([^)]+\)\s*[\+,]/` | Multiple file content concatenation |
| CTX-004 | `/return\s*\{[^}]*content:/` | Agent returning full content |
| CTX-005 | File length > 5000 chars without summarize | Long prompt without compression |
**Impact Levels**:
- **Critical**: Context exceeds model limit (128K tokens)
- **High**: Context > 50K tokens per iteration
- **Medium**: Context grows 10%+ per iteration
- **Low**: Potential for growth but currently manageable
---
### 2. Long-tail Forgetting (P3)
**Definition**: Loss of early instructions, constraints, or goals in long execution chains.
**Root Causes**:
- No explicit constraint propagation
- Reliance on implicit context
- Missing checkpoint/restore mechanisms
- State schema without requirements field
**Detection Patterns**:
| Pattern ID | Regex/Check | Description |
|------------|-------------|-------------|
| MEM-001 | Later phases missing constraint reference | Constraint not carried forward |
| MEM-002 | `/\[TASK\](?![\s\S]*\[CONSTRAINTS\])/` | Task without constraints section |
| MEM-003 | Key phases without checkpoint | Missing state preservation |
| MEM-004 | State schema lacks `original_requirements` | No constraint persistence |
| MEM-005 | No verification phase | Output not checked against intent |
**Impact Levels**:
- **Critical**: Original goal completely lost
- **High**: Key constraints ignored in output
- **Medium**: Some requirements missing
- **Low**: Minor goal drift
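For MEM-003, a checkpoint can be as small as persisting a state snapshot at each phase boundary. A sketch, assuming the `Write` tool used in this document's other examples and an existing `checkpoints/` directory:

```javascript
// Persist a state snapshot at a key milestone (addresses MEM-003).
function checkpoint(workDir, state, phaseName) {
  const file = `${workDir}/checkpoints/${phaseName}.json`;
  Write(file, JSON.stringify(state, null, 2));
  return file; // record this path in state so the phase can be restored later
}
```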
---
### 3. Data Flow Disruption (P0)
**Definition**: Inconsistent state management causing data loss or corruption.
**Root Causes**:
- Multiple state storage locations
- Inconsistent field naming
- Missing schema validation
- Format transformation without normalization
**Detection Patterns**:
| Pattern ID | Regex/Check | Description |
|------------|-------------|-------------|
| DF-001 | Multiple state file writes | Scattered state storage |
| DF-002 | Same concept, different names | Field naming inconsistency |
| DF-003 | JSON.parse without validation | Missing schema validation |
| DF-004 | Files written but never read | Orphaned outputs |
| DF-005 | Autonomous skill without state-schema | Undefined state structure |
**Impact Levels**:
- **Critical**: Data loss or corruption
- **High**: State inconsistency between phases
- **Medium**: Potential for inconsistency
- **Low**: Minor naming inconsistencies
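For DF-003 specifically, a guarded parse can replace bare `JSON.parse`. A sketch, reusing the `validateState` helper sketched earlier in this document:

```javascript
// Guarded state parse (addresses DF-003): fail loudly instead of
// propagating a corrupt or unvalidated state object.
function parseStateSafely(raw) {
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    throw new Error(`State file is not valid JSON: ${e.message}`);
  }
  if (!validateState(parsed)) {
    throw new Error('State file failed schema validation');
  }
  return parsed;
}
```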
---
### 4. Agent Coordination Failure (P1)
**Definition**: Fragile agent call patterns causing cascading failures.
**Root Causes**:
- Missing error handling in Task calls
- No result validation
- Inconsistent agent configurations
- Deeply nested agent calls
**Detection Patterns**:
| Pattern ID | Regex/Check | Description |
|------------|-------------|-------------|
| AGT-001 | Task without try-catch | Missing error handling |
| AGT-002 | Result used without validation | No return value check |
| AGT-003 | > 3 different agent types | Agent type proliferation |
| AGT-004 | Nested Task in prompt | Agent calling agent |
| AGT-005 | Task used but not in allowed-tools | Tool declaration mismatch |
| AGT-006 | Multiple return formats | Inconsistent agent output |
**Impact Levels**:
- **Critical**: Workflow crash on agent failure
- **High**: Unpredictable agent behavior
- **Medium**: Occasional coordination issues
- **Low**: Minor inconsistencies
---
## Severity Criteria
### Global Severity Matrix
| Severity | Definition | Action Required |
|----------|------------|-----------------|
| **Critical** | Blocks execution or causes data loss | Immediate fix required |
| **High** | Significantly impacts reliability | Should fix before deployment |
| **Medium** | Affects quality or maintainability | Fix in next iteration |
| **Low** | Minor improvement opportunity | Optional fix |
### Severity Calculation
```javascript
function calculateIssueSeverity(issue) {
const weights = {
impact_on_execution: 40, // Does it block workflow?
data_integrity_risk: 30, // Can it cause data loss?
frequency: 20, // How often does it occur?
complexity_to_fix: 10 // How hard to fix?
};
let score = 0;
// Impact on execution
if (issue.blocks_execution) score += weights.impact_on_execution;
else if (issue.degrades_execution) score += weights.impact_on_execution * 0.5;
// Data integrity
if (issue.causes_data_loss) score += weights.data_integrity_risk;
else if (issue.causes_inconsistency) score += weights.data_integrity_risk * 0.5;
// Frequency
if (issue.occurs_every_run) score += weights.frequency;
else if (issue.occurs_sometimes) score += weights.frequency * 0.5;
// Complexity (inverse - easier to fix = higher priority)
if (issue.fix_complexity === 'low') score += weights.complexity_to_fix;
else if (issue.fix_complexity === 'medium') score += weights.complexity_to_fix * 0.5;
// Map score to severity
if (score >= 70) return 'critical';
if (score >= 50) return 'high';
if (score >= 30) return 'medium';
return 'low';
}
```
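As a worked example, an issue that degrades (but does not block) execution, causes state inconsistency, occurs on every run, and is cheap to fix scores as follows:

```javascript
// Hypothetical issue, scored with the weights above:
//   degrades_execution      → 40 * 0.5 = 20
//   causes_inconsistency    → 30 * 0.5 = 15
//   occurs_every_run        → 20
//   fix_complexity: 'low'   → 10
// score = 20 + 15 + 20 + 10 = 65  →  65 >= 50  →  'high'
```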
---
## Fix Mapping
| Problem Type | Recommended Strategies | Priority Order |
|--------------|----------------------|----------------|
| Context Explosion | sliding_window, path_reference, context_summarization | 1, 2, 3 |
| Long-tail Forgetting | constraint_injection, state_constraints_field, checkpoint | 1, 2, 3 |
| Data Flow Disruption | state_centralization, schema_enforcement, field_normalization | 1, 2, 3 |
| Agent Coordination | error_wrapping, result_validation, flatten_nesting | 1, 2, 3 |
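This mapping translates directly into a lookup table, as sketched below; the `issue.type` key values are assumptions and must match whatever the diagnosis actions actually emit.

```javascript
// Fix-strategy lookup mirroring the table above, in priority order.
// The type keys are illustrative — align them with the issue objects
// produced by the diagnosis actions.
const FIX_STRATEGIES = {
  context_explosion:   ['sliding_window', 'path_reference', 'context_summarization'],
  longtail_forgetting: ['constraint_injection', 'state_constraints_field', 'checkpoint'],
  dataflow_disruption: ['state_centralization', 'schema_enforcement', 'field_normalization'],
  agent_coordination:  ['error_wrapping', 'result_validation', 'flatten_nesting']
};

function strategiesFor(issue) {
  return FIX_STRATEGIES[issue.type] || [];
}
```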
---
## Cross-Category Dependencies
Some issues may trigger others:
```
Context Explosion ──→ Long-tail Forgetting
(Large context causes important info to be pushed out)
Data Flow Disruption ──→ Agent Coordination Failure
(Inconsistent data causes agents to fail)
Agent Coordination Failure ──→ Context Explosion
(Failed retries add to context)
```
When fixing, address in this order:
1. **P0 Data Flow** - Foundation for other fixes
2. **P1 Agent Coordination** - Stability
3. **P2 Context Explosion** - Efficiency
4. **P3 Long-tail Forgetting** - Quality


@@ -0,0 +1,263 @@
# Quality Gates
Quality thresholds and verification criteria for skill tuning.
## When to Use
| Phase | Usage | Section |
|-------|-------|---------|
| action-generate-report | Calculate quality score | Scoring |
| action-verify | Check quality gates | Gate Definitions |
| action-complete | Final assessment | Pass Criteria |
---
## Quality Dimensions
### 1. Issue Severity Distribution (40%)
Measures the severity profile of identified issues.
| Metric | Weight | Calculation |
|--------|--------|-------------|
| Critical Issues | -25 each | High penalty |
| High Issues | -15 each | Significant penalty |
| Medium Issues | -5 each | Moderate penalty |
| Low Issues | -1 each | Minor penalty |
**Score Calculation**:
```javascript
function calculateSeverityScore(issues) {
const weights = { critical: 25, high: 15, medium: 5, low: 1 };
const deductions = issues.reduce((sum, issue) =>
sum + (weights[issue.severity] || 0), 0);
return Math.max(0, 100 - deductions);
}
```
### 2. Fix Effectiveness (30%)
Measures success rate of applied fixes.
| Metric | Weight | Threshold |
|--------|--------|-----------|
| Fixes Verified Pass | +30 | > 80% pass rate |
| Fixes Verified Fail | -20 | < 50% triggers review |
| Issues Resolved | +10 | Per resolved issue |
**Score Calculation**:
```javascript
function calculateFixScore(appliedFixes) {
const total = appliedFixes.length;
if (total === 0) return 100; // No fixes needed = good
const passed = appliedFixes.filter(f => f.verification_result === 'pass').length;
return Math.round((passed / total) * 100);
}
```
### 3. Coverage Completeness (20%)
Measures diagnosis coverage across all areas.
| Metric | Weight | Threshold |
|--------|--------|-----------|
| All 4 diagnoses complete | +20 | Full coverage |
| 3 diagnoses complete | +15 | Good coverage |
| 2 diagnoses complete | +10 | Partial coverage |
| < 2 diagnoses complete | +0 | Insufficient |
### 4. Iteration Efficiency (10%)
Measures how quickly issues are resolved.
| Metric | Weight | Threshold |
|--------|--------|-----------|
| Resolved in 1 iteration | +10 | Excellent |
| Resolved in 2 iterations | +7 | Good |
| Resolved in 3 iterations | +4 | Acceptable |
| > 3 iterations | +0 | Needs improvement |
---
## Gate Definitions
### Gate: PASS
**Threshold**: Quality Score >= 80 AND Critical Issues = 0 AND High Issues <= 2
**Meaning**: Skill is production-ready with minor issues.
**Actions**:
- Complete tuning session
- Generate summary report
- No further fixes required
### Gate: REVIEW
**Threshold**: Quality Score 60-79 OR High Issues 3-5
**Meaning**: Skill has issues requiring attention.
**Actions**:
- Review remaining issues
- Apply additional fixes if possible
- May require manual intervention
### Gate: FAIL
**Threshold**: Quality Score < 60 OR Critical Issues > 0 OR High Issues > 5
**Meaning**: Skill has serious issues blocking deployment.
**Actions**:
- Must fix critical issues
- Re-run diagnosis after fixes
- Consider architectural review
---
## Quality Score Calculation
```javascript
function calculateQualityScore(state) {
// Dimension 1: Severity (40%)
const severityScore = calculateSeverityScore(state.issues);
// Dimension 2: Fix Effectiveness (30%)
const fixScore = calculateFixScore(state.applied_fixes);
// Dimension 3: Coverage (20%)
const diagnosisCount = Object.values(state.diagnosis)
.filter(d => d !== null).length;
const coverageScore = [0, 0, 10, 15, 20][diagnosisCount] || 0;
// Dimension 4: Efficiency (10%)
const efficiencyScore = state.iteration_count <= 1 ? 10 :
state.iteration_count <= 2 ? 7 :
state.iteration_count <= 3 ? 4 : 0;
// Weighted total
const total = (severityScore * 0.4) +
(fixScore * 0.3) +
(coverageScore * 1.0) + // Already scaled to 20
(efficiencyScore * 1.0); // Already scaled to 10
return Math.round(total);
}
function determineQualityGate(state) {
const score = calculateQualityScore(state);
const criticalCount = state.issues.filter(i => i.severity === 'critical').length;
const highCount = state.issues.filter(i => i.severity === 'high').length;
if (criticalCount > 0) return 'fail';
if (highCount > 5) return 'fail';
if (score < 60) return 'fail';
if (highCount > 2) return 'review';
if (score < 80) return 'review';
return 'pass';
}
```
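For example, a session with one high and two medium open issues, three of three fixes verified, all four diagnoses complete, and two iterations works out as follows:

```javascript
// Hypothetical session, scored with the functions above:
//   issues: 1 high + 2 medium     → severityScore = 100 - 15 - 5 - 5 = 75
//   applied_fixes: 3 of 3 'pass'  → fixScore = 100
//   diagnosis: all 4 complete     → coverageScore = 20
//   iteration_count: 2            → efficiencyScore = 7
// total = 75*0.4 + 100*0.3 + 20 + 7 = 30 + 30 + 27 = 87
// gate: critical = 0, high = 1 (<= 2), score >= 80 → 'pass'
```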
---
## Verification Criteria
### For Each Issue Type
#### Context Explosion Issues
- [ ] Token count does not grow unbounded
- [ ] History limited to reasonable size
- [ ] No full content in prompts (paths used instead)
- [ ] Agent returns are compact
#### Long-tail Forgetting Issues
- [ ] Constraints visible in all phase prompts
- [ ] State schema includes requirements field
- [ ] Checkpoints exist at key milestones
- [ ] Output matches original constraints
#### Data Flow Issues
- [ ] Single state.json after execution
- [ ] No orphan state files
- [ ] Schema validation active
- [ ] Consistent field naming
#### Agent Coordination Issues
- [ ] All Task calls have error handling
- [ ] Agent results validated before use
- [ ] No nested agent calls
- [ ] Tool declarations match usage
---
## Iteration Control
### Max Iterations
Default: 5 iterations
**Rationale**:
- Each iteration may introduce new issues
- Diminishing returns after 3-4 iterations
- Prevents infinite loops
### Iteration Exit Criteria
```javascript
function shouldContinueIteration(state) {
// Exit if quality gate passed
if (state.quality_gate === 'pass') return false;
// Exit if max iterations reached
if (state.iteration_count >= state.max_iterations) return false;
// Exit if no improvement in last 2 iterations
if (state.iteration_count >= 2) {
const recentHistory = state.action_history.slice(-10);
const issuesResolvedRecently = recentHistory.filter(a =>
a.action === 'action-verify' && a.result === 'success'
).length;
if (issuesResolvedRecently === 0) {
console.log('No progress in recent iterations, stopping.');
return false;
}
}
// Continue if critical/high issues remain
const hasUrgentIssues = state.issues.some(i =>
i.severity === 'critical' || i.severity === 'high'
);
return hasUrgentIssues;
}
```
---
## Reporting Format
### Quality Summary Table
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Severity Distribution | {score}/100 | 40% | {weighted} |
| Fix Effectiveness | {score}/100 | 30% | {weighted} |
| Coverage Completeness | {score}/20 | 20% | {score} |
| Iteration Efficiency | {score}/10 | 10% | {score} |
| **Total** | | | **{total}/100** |
### Gate Status
```
Quality Gate: {PASS|REVIEW|FAIL}
Criteria:
- Quality Score: {score} (threshold: 60)
- Critical Issues: {count} (threshold: 0)
- High Issues: {count} (threshold: 5)
```

File diff suppressed because it is too large


@@ -0,0 +1,153 @@
# Diagnosis Report Template
Template for individual diagnosis action reports.
## Template
```markdown
# {{diagnosis_type}} Diagnosis Report
**Target Skill**: {{skill_name}}
**Diagnosis Type**: {{diagnosis_type}}
**Executed At**: {{timestamp}}
**Duration**: {{duration_ms}}ms
---
## Summary
| Metric | Value |
|--------|-------|
| Issues Found | {{issues_found}} |
| Severity | {{severity}} |
| Patterns Checked | {{patterns_checked_count}} |
| Patterns Matched | {{patterns_matched_count}} |
---
## Patterns Analyzed
{{#each patterns_checked}}
### {{pattern_name}}
- **Status**: {{status}}
- **Matches**: {{match_count}}
- **Files Affected**: {{affected_files}}
{{/each}}
---
## Issues Identified
{{#if issues.length}}
{{#each issues}}
### {{id}}: {{description}}
| Field | Value |
|-------|-------|
| Type | {{type}} |
| Severity | {{severity}} |
| Location | {{location}} |
| Root Cause | {{root_cause}} |
| Impact | {{impact}} |
**Evidence**:
{{#each evidence}}
- `{{this}}`
{{/each}}
**Suggested Fix**: {{suggested_fix}}
---
{{/each}}
{{else}}
_No issues found in this diagnosis area._
{{/if}}
---
## Recommendations
{{#if recommendations.length}}
{{#each recommendations}}
{{@index}}. {{this}}
{{/each}}
{{else}}
No specific recommendations - area appears healthy.
{{/if}}
---
## Raw Data
Full diagnosis data available at:
`{{output_file}}`
```
## Variable Reference
| Variable | Type | Source |
|----------|------|--------|
| `diagnosis_type` | string | 'context' \| 'memory' \| 'dataflow' \| 'agent' |
| `skill_name` | string | state.target_skill.name |
| `timestamp` | string | ISO timestamp |
| `duration_ms` | number | Execution time |
| `issues_found` | number | issues.length |
| `severity` | string | Calculated severity |
| `patterns_checked` | array | Patterns analyzed |
| `patterns_matched` | array | Patterns with matches |
| `issues` | array | Issue objects |
| `recommendations` | array | String recommendations |
| `output_file` | string | Path to JSON file |
## Usage
```javascript
function renderDiagnosisReport(diagnosis, diagnosisType, skillName, outputFile) {
return `# ${diagnosisType} Diagnosis Report
**Target Skill**: ${skillName}
**Diagnosis Type**: ${diagnosisType}
**Executed At**: ${new Date().toISOString()}
**Duration**: ${diagnosis.execution_time_ms}ms
---
## Summary
| Metric | Value |
|--------|-------|
| Issues Found | ${diagnosis.issues_found} |
| Severity | ${diagnosis.severity} |
| Patterns Checked | ${diagnosis.details.patterns_checked.length} |
| Patterns Matched | ${diagnosis.details.patterns_matched.length} |
---
## Issues Identified
${diagnosis.details.evidence.map((e, i) => `
### Issue ${i + 1}
- **File**: ${e.file}
- **Pattern**: ${e.pattern}
- **Severity**: ${e.severity}
- **Context**: \`${e.context}\`
`).join('\n')}
---
## Recommendations
${diagnosis.details.recommendations.map((r, i) => `${i + 1}. ${r}`).join('\n')}
---
## Raw Data
Full diagnosis data available at:
\`${outputFile}\`
`;
}
```


@@ -0,0 +1,204 @@
# Fix Proposal Template
Template for fix proposal documentation.
## Template
```markdown
# Fix Proposal: {{fix_id}}
**Strategy**: {{strategy}}
**Risk Level**: {{risk}}
**Issues Addressed**: {{issue_ids}}
---
## Description
{{description}}
## Rationale
{{rationale}}
---
## Affected Files
{{#each changes}}
### {{file}}
**Action**: {{action}}
```diff
{{diff}}
```
{{/each}}
---
## Implementation Steps
{{#each implementation_steps}}
{{@index}}. {{this}}
{{/each}}
---
## Risk Assessment
| Factor | Assessment |
|--------|------------|
| Complexity | {{complexity}} |
| Reversibility | {{#if reversible}}Yes{{else}}No{{/if}} |
| Breaking Changes | {{breaking_changes}} |
| Test Coverage | {{test_coverage}} |
**Overall Risk**: {{risk}}
---
## Verification Steps
{{#each verification_steps}}
- [ ] {{this}}
{{/each}}
---
## Rollback Plan
{{#if rollback_available}}
To rollback this fix:
```bash
{{rollback_command}}
```
{{else}}
_Rollback not available for this fix type._
{{/if}}
---
## Estimated Impact
{{estimated_impact}}
```
## Variable Reference
| Variable | Type | Source |
|----------|------|--------|
| `fix_id` | string | Generated ID (FIX-001) |
| `strategy` | string | Fix strategy name |
| `risk` | string | 'low' \| 'medium' \| 'high' |
| `issue_ids` | array | Related issue IDs |
| `description` | string | Human-readable description |
| `rationale` | string | Why this fix works |
| `changes` | array | File change objects |
| `implementation_steps` | array | Step-by-step guide |
| `verification_steps` | array | How to verify fix worked |
| `estimated_impact` | string | Expected improvement |
## Usage
```javascript
function renderFixProposal(fix) {
return `# Fix Proposal: ${fix.id}
**Strategy**: ${fix.strategy}
**Risk Level**: ${fix.risk}
**Issues Addressed**: ${fix.issue_ids.join(', ')}
---
## Description
${fix.description}
## Rationale
${fix.rationale}
---
## Affected Files
${fix.changes.map(change => `
### ${change.file}
**Action**: ${change.action}
\`\`\`diff
${change.diff || change.new_content?.slice(0, 200) || 'N/A'}
\`\`\`
`).join('\n')}
---
## Verification Steps
${fix.verification_steps.map(step => `- [ ] ${step}`).join('\n')}
---
## Estimated Impact
${fix.estimated_impact}
`;
}
```
## Fix Strategy Templates
### sliding_window
```markdown
## Description
Implement sliding window for conversation history to prevent unbounded growth.
## Changes
- Add MAX_HISTORY constant
- Modify history update logic to slice array
- Update state schema documentation
## Verification
- [ ] Run skill for 10+ iterations
- [ ] Verify history.length <= MAX_HISTORY
- [ ] Check no data loss for recent items
```
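A minimal sketch of the history update this strategy prescribes; the cap value is illustrative:

```javascript
// Sliding window over conversation history (sketch).
const MAX_HISTORY = 20; // illustrative cap — tune per skill

function appendToHistory(state, entry) {
  const history = [...(state.history || []), entry];
  // Keep only the most recent MAX_HISTORY entries, without mutating state
  return { ...state, history: history.slice(-MAX_HISTORY) };
}
```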
### constraint_injection
```markdown
## Description
Add explicit constraint section to each phase prompt.
## Changes
- Add [CONSTRAINTS] section template
- Reference state.original_requirements
- Add reminder before output section
## Verification
- [ ] Check constraints visible in all phases
- [ ] Test with specific constraint
- [ ] Verify output respects constraint
```
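A sketch of the prompt assembly this strategy describes, assuming constraints are persisted in `state.original_requirements` (per MEM-004):

```javascript
// Inject persisted constraints into every phase prompt (sketch).
function buildPhasePrompt(phaseBody, state) {
  const constraints = (state.original_requirements || [])
    .map(r => `- ${r}`)
    .join('\n');
  return [
    '[TASK]',
    phaseBody,
    '',
    '[CONSTRAINTS]',
    constraints,
    '',
    'Reminder: the output must satisfy every constraint listed above.'
  ].join('\n');
}
```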
### error_wrapping
```markdown
## Description
Wrap all Task calls in try-catch with retry logic.
## Changes
- Create safeTask wrapper function
- Replace direct Task calls
- Add error logging to state
## Verification
- [ ] Simulate agent failure
- [ ] Verify graceful error handling
- [ ] Check retry logic
```