mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-02-14 02:42:04 +08:00
Refactor orchestrator logic and enhance problem taxonomy
- Updated orchestrator decision logic to improve state management and action selection. - Introduced structured termination checks and action selection criteria. - Enhanced state update mechanism with sliding window for action history and error tracking. - Revised problem taxonomy for skill execution issues, consolidating categories and refining detection patterns. - Improved severity calculation method for issue prioritization. - Streamlined fix mapping strategies for better clarity and usability.
This commit is contained in:
@@ -21,19 +21,79 @@ const skillDir = `.claude/skills/${config.skill_name}`;
|
|||||||
|
|
||||||
### Step 2: 创建目录结构
|
### Step 2: 创建目录结构
|
||||||
|
|
||||||
```javascript
|
#### 基础目录(所有模式)
|
||||||
// 基础目录
|
|
||||||
Bash(`mkdir -p "${skillDir}/phases"`);
|
|
||||||
Bash(`mkdir -p "${skillDir}/specs"`);
|
|
||||||
Bash(`mkdir -p "${skillDir}/templates"`);
|
|
||||||
|
|
||||||
// Autonomous 模式额外目录
|
```javascript
|
||||||
|
// 基础架构
|
||||||
|
Bash(`mkdir -p "${skillDir}/{phases,specs,templates,scripts}"`);
|
||||||
|
```
|
||||||
|
|
||||||
|
#### 执行模式特定目录
|
||||||
|
|
||||||
|
```
|
||||||
|
config.execution_mode
|
||||||
|
↓
|
||||||
|
├─ "sequential"
|
||||||
|
│ ↓ Creates:
|
||||||
|
│ └─ phases/ (基础目录已包含)
|
||||||
|
│ ├─ _orchestrator.md
|
||||||
|
│ └─ workflow.json
|
||||||
|
│
|
||||||
|
└─ "autonomous" | "hybrid"
|
||||||
|
↓ Creates:
|
||||||
|
└─ phases/actions/
|
||||||
|
├─ state-schema.md
|
||||||
|
└─ *.md (动作文件)
|
||||||
|
```
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// Autonomous/Hybrid 模式额外目录
|
||||||
if (config.execution_mode === 'autonomous' || config.execution_mode === 'hybrid') {
|
if (config.execution_mode === 'autonomous' || config.execution_mode === 'hybrid') {
|
||||||
Bash(`mkdir -p "${skillDir}/phases/actions"`);
|
Bash(`mkdir -p "${skillDir}/phases/actions"`);
|
||||||
}
|
}
|
||||||
|
```
|
||||||
|
|
||||||
// scripts 目录(默认创建,用于存放确定性脚本)
|
#### Context Strategy 特定目录 (P0 增强)
|
||||||
Bash(`mkdir -p "${skillDir}/scripts"`);
|
|
||||||
|
```javascript
|
||||||
|
// ========== P0: 根据上下文策略创建目录 ==========
|
||||||
|
const contextStrategy = config.context_strategy || 'file';
|
||||||
|
|
||||||
|
if (contextStrategy === 'file') {
|
||||||
|
// 文件策略:创建上下文持久化目录
|
||||||
|
Bash(`mkdir -p "${skillDir}/.scratchpad-template/context"`);
|
||||||
|
|
||||||
|
// 创建上下文模板文件
|
||||||
|
Write(
|
||||||
|
`${skillDir}/.scratchpad-template/context/.gitkeep`,
|
||||||
|
"# Runtime context storage for file-based strategy"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
// 内存策略无需创建目录 (in-memory only)
|
||||||
|
```
|
||||||
|
|
||||||
|
**目录树视图**:
|
||||||
|
|
||||||
|
```
|
||||||
|
Sequential + File Strategy:
|
||||||
|
.claude/skills/{skill-name}/
|
||||||
|
├── phases/
|
||||||
|
│ ├── _orchestrator.md
|
||||||
|
│ ├── workflow.json
|
||||||
|
│ ├── 01-*.md
|
||||||
|
│ └── 02-*.md
|
||||||
|
├── .scratchpad-template/
|
||||||
|
│ └── context/ ← File strategy persistent storage
|
||||||
|
└── specs/
|
||||||
|
|
||||||
|
Autonomous + Memory Strategy:
|
||||||
|
.claude/skills/{skill-name}/
|
||||||
|
├── phases/
|
||||||
|
│ ├── orchestrator.md
|
||||||
|
│ ├── state-schema.md
|
||||||
|
│ └── actions/
|
||||||
|
│ └── *.md
|
||||||
|
└── specs/
|
||||||
```
|
```
|
||||||
|
|
||||||
### Step 3: 生成 SKILL.md
|
### Step 3: 生成 SKILL.md
|
||||||
|
|||||||
@@ -61,7 +61,11 @@ if (config.execution_mode === 'sequential') {
|
|||||||
const workflowDef = generateWorkflowDefinition(config, phases);
|
const workflowDef = generateWorkflowDefinition(config, phases);
|
||||||
Write(`${skillDir}/workflow.json`, JSON.stringify(workflowDef, null, 2));
|
Write(`${skillDir}/workflow.json`, JSON.stringify(workflowDef, null, 2));
|
||||||
|
|
||||||
// 生成各阶段文件
|
// ========== P0 增强: 生成 Phase 0 (强制规范研读) ==========
|
||||||
|
const phase0Content = generatePhase0Spec(config);
|
||||||
|
Write(`${skillDir}/phases/00-spec-study.md`, phase0Content);
|
||||||
|
|
||||||
|
// ========== 生成用户定义的各阶段文件 ==========
|
||||||
for (let i = 0; i < phases.length; i++) {
|
for (let i = 0; i < phases.length; i++) {
|
||||||
const phase = phases[i];
|
const phase = phases[i];
|
||||||
const prevPhase = i > 0 ? phases[i-1] : null;
|
const prevPhase = i > 0 ? phases[i-1] : null;
|
||||||
@@ -72,7 +76,7 @@ if (config.execution_mode === 'sequential') {
|
|||||||
phaseId: phase.id,
|
phaseId: phase.id,
|
||||||
phaseName: phase.name,
|
phaseName: phase.name,
|
||||||
phaseDescription: phase.description || `Execute ${phase.name}`,
|
phaseDescription: phase.description || `Execute ${phase.name}`,
|
||||||
input: prevPhase ? prevPhase.output : "user input",
|
input: prevPhase ? prevPhase.output : "phase 0 output", // Phase 0 为首个输入源
|
||||||
output: phase.output,
|
output: phase.output,
|
||||||
nextPhase: nextPhase ? nextPhase.id : null,
|
nextPhase: nextPhase ? nextPhase.id : null,
|
||||||
config: config,
|
config: config,
|
||||||
@@ -85,32 +89,55 @@ if (config.execution_mode === 'sequential') {
|
|||||||
|
|
||||||
// ========== P0 增强: 声明式工作流定义 ==========
|
// ========== P0 增强: 声明式工作流定义 ==========
|
||||||
function generateWorkflowDefinition(config, phases) {
|
function generateWorkflowDefinition(config, phases) {
|
||||||
|
// ========== P0: 添加强制 Phase 0 ==========
|
||||||
|
const phase0 = {
|
||||||
|
id: '00-spec-study',
|
||||||
|
name: 'Specification Study',
|
||||||
|
order: 0,
|
||||||
|
input: null,
|
||||||
|
output: 'spec-study-complete.flag',
|
||||||
|
description: '⚠️ MANDATORY: Read all specification documents before execution',
|
||||||
|
parallel: false,
|
||||||
|
condition: null,
|
||||||
|
agent: {
|
||||||
|
type: 'universal-executor',
|
||||||
|
run_in_background: false
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
return {
|
return {
|
||||||
skill_name: config.skill_name,
|
skill_name: config.skill_name,
|
||||||
version: "1.0.0",
|
version: "1.0.0",
|
||||||
execution_mode: "sequential",
|
execution_mode: "sequential",
|
||||||
context_strategy: config.context_strategy || "file",
|
context_strategy: config.context_strategy || "file",
|
||||||
|
|
||||||
// 声明式阶段列表 (类似 software-manual 的 agents_to_run)
|
// ========== P0: Phase 0 置于首位 ==========
|
||||||
phases_to_run: phases.map(p => p.id),
|
phases_to_run: ['00-spec-study', ...phases.map(p => p.id)],
|
||||||
|
|
||||||
// 阶段配置
|
// ========== P0: Phase 0 + 用户定义阶段 ==========
|
||||||
phases: phases.map((p, i) => ({
|
phases: [
|
||||||
id: p.id,
|
phase0,
|
||||||
name: p.name,
|
...phases.map((p, i) => ({
|
||||||
order: i + 1,
|
id: p.id,
|
||||||
input: i > 0 ? phases[i-1].output : null,
|
name: p.name,
|
||||||
output: p.output,
|
order: i + 1,
|
||||||
// 可选的并行配置
|
input: i === 0 ? phase0.output : phases[i-1].output, // 第一个阶段依赖 Phase 0
|
||||||
parallel: p.parallel || false,
|
output: p.output,
|
||||||
// 可选的条件执行
|
parallel: p.parallel || false,
|
||||||
condition: p.condition || null,
|
condition: p.condition || null,
|
||||||
// Agent 配置
|
// Agent 配置 (支持 LLM 集成)
|
||||||
agent: p.agent || {
|
agent: p.agent || (config.llm_integration?.enabled ? {
|
||||||
type: "universal-executor",
|
type: "llm",
|
||||||
run_in_background: false
|
tool: config.llm_integration.default_tool,
|
||||||
}
|
mode: config.llm_integration.mode || "analysis",
|
||||||
})),
|
fallback_chain: config.llm_integration.fallback_chain || [],
|
||||||
|
run_in_background: false
|
||||||
|
} : {
|
||||||
|
type: "universal-executor",
|
||||||
|
run_in_background: false
|
||||||
|
})
|
||||||
|
}))
|
||||||
|
],
|
||||||
|
|
||||||
// 终止条件
|
// 终止条件
|
||||||
termination: {
|
termination: {
|
||||||
@@ -233,10 +260,30 @@ async function executePhase(phaseId, phaseConfig, workDir) {
|
|||||||
|
|
||||||
## 阶段执行计划
|
## 阶段执行计划
|
||||||
|
|
||||||
|
**执行流程**:
|
||||||
|
|
||||||
|
\`\`\`
|
||||||
|
START
|
||||||
|
↓
|
||||||
|
Phase 0: Specification Study
|
||||||
|
↓ Output: spec-study-complete.flag
|
||||||
|
↓
|
||||||
|
Phase 1: ${phases[0]?.name || 'First Phase'}
|
||||||
|
↓ Output: ${phases[0]?.output || 'phase-1.json'}
|
||||||
|
${phases.slice(1).map((p, i) => ` ↓
|
||||||
|
Phase ${i+2}: ${p.name}
|
||||||
|
↓ Output: ${p.output}`).join('\n')}
|
||||||
|
↓
|
||||||
|
COMPLETE
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**阶段列表**:
|
||||||
|
|
||||||
| Order | Phase | Input | Output | Agent |
|
| Order | Phase | Input | Output | Agent |
|
||||||
|-------|-------|-------|--------|-------|
|
|-------|-------|-------|--------|-------|
|
||||||
|
| 0 | 00-spec-study | - | spec-study-complete.flag | universal-executor |
|
||||||
${phases.map((p, i) =>
|
${phases.map((p, i) =>
|
||||||
`| ${i+1} | ${p.id} | ${i > 0 ? phases[i-1].output : '-'} | ${p.output} | ${p.agent?.type || 'universal-executor'} |`
|
`| ${i+1} | ${p.id} | ${i === 0 ? 'spec-study-complete.flag' : phases[i-1].output} | ${p.output} | ${p.agent?.type || 'universal-executor'} |`
|
||||||
).join('\n')}
|
).join('\n')}
|
||||||
|
|
||||||
## 错误恢复
|
## 错误恢复
|
||||||
@@ -751,6 +798,146 @@ ${actions.sort((a, b) => (b.priority || 0) - (a.priority || 0)).map(a =>
|
|||||||
### Step 4: 辅助函数
|
### Step 4: 辅助函数
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
|
// ========== P0: Phase 0 生成函数 ==========
|
||||||
|
function generatePhase0Spec(config) {
|
||||||
|
const skillRoot = '.claude/skills/skill-generator';
|
||||||
|
const specsToRead = [
|
||||||
|
'../_shared/SKILL-DESIGN-SPEC.md',
|
||||||
|
`${skillRoot}/templates/*.md`
|
||||||
|
];
|
||||||
|
|
||||||
|
return `# Phase 0: Specification Study
|
||||||
|
|
||||||
|
⚠️ **MANDATORY PREREQUISITE** - 此阶段不可跳过
|
||||||
|
|
||||||
|
## Objective
|
||||||
|
|
||||||
|
在生成任何文件前,完整阅读所有规范文档,理解 Skill 设计标准。
|
||||||
|
|
||||||
|
## Why This Matters
|
||||||
|
|
||||||
|
**不研读规范 (❌)**:
|
||||||
|
\`\`\`
|
||||||
|
跳过规范
|
||||||
|
├─ ✗ 不符合标准
|
||||||
|
├─ ✗ 结构混乱
|
||||||
|
└─ ✗ 质量问题
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**研读规范 (✅)**:
|
||||||
|
\`\`\`
|
||||||
|
完整研读
|
||||||
|
├─ ✓ 标准化输出
|
||||||
|
├─ ✓ 高质量代码
|
||||||
|
└─ ✓ 易于维护
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
## Required Reading
|
||||||
|
|
||||||
|
### P0 - 核心设计规范
|
||||||
|
|
||||||
|
\`\`\`javascript
|
||||||
|
// 通用设计标准 (MUST READ)
|
||||||
|
const designSpec = Read('.claude/skills/_shared/SKILL-DESIGN-SPEC.md');
|
||||||
|
|
||||||
|
// 关键内容检查点:
|
||||||
|
const checkpoints = {
|
||||||
|
structure: '目录结构约定',
|
||||||
|
naming: '命名规范',
|
||||||
|
quality: '质量标准',
|
||||||
|
output: '输出格式要求'
|
||||||
|
};
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
### P1 - 模板文件 (生成前必读)
|
||||||
|
|
||||||
|
\`\`\`javascript
|
||||||
|
// 根据执行模式加载对应模板
|
||||||
|
const templates = {
|
||||||
|
all: [
|
||||||
|
'templates/skill-md.md' // SKILL.md 入口文件模板
|
||||||
|
],
|
||||||
|
sequential: [
|
||||||
|
'templates/sequential-phase.md'
|
||||||
|
],
|
||||||
|
autonomous: [
|
||||||
|
'templates/autonomous-orchestrator.md',
|
||||||
|
'templates/autonomous-action.md'
|
||||||
|
]
|
||||||
|
};
|
||||||
|
|
||||||
|
const mode = '${config.execution_mode}';
|
||||||
|
const requiredTemplates = [...templates.all, ...templates[mode]];
|
||||||
|
|
||||||
|
requiredTemplates.forEach(template => {
|
||||||
|
const content = Read(\`.claude/skills/skill-generator/\${template}\`);
|
||||||
|
// 理解模板结构、变量位置、生成规则
|
||||||
|
});
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
## Execution
|
||||||
|
|
||||||
|
\`\`\`javascript
|
||||||
|
// ========== 加载规范 ==========
|
||||||
|
const specs = [];
|
||||||
|
|
||||||
|
// 1. 设计规范 (P0)
|
||||||
|
specs.push({
|
||||||
|
file: '../_shared/SKILL-DESIGN-SPEC.md',
|
||||||
|
content: Read('.claude/skills/_shared/SKILL-DESIGN-SPEC.md'),
|
||||||
|
priority: 'P0'
|
||||||
|
});
|
||||||
|
|
||||||
|
// 2. 模板文件 (P1)
|
||||||
|
const templateFiles = Glob('.claude/skills/skill-generator/templates/*.md');
|
||||||
|
templateFiles.forEach(file => {
|
||||||
|
specs.push({
|
||||||
|
file: file,
|
||||||
|
content: Read(file),
|
||||||
|
priority: 'P1'
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ========== 内化规范 ==========
|
||||||
|
console.log('📖 Reading specifications...');
|
||||||
|
specs.forEach(spec => {
|
||||||
|
console.log(\` [\${spec.priority}] \${spec.file}\`);
|
||||||
|
// 理解内容(无需生成文件,仅内存处理)
|
||||||
|
});
|
||||||
|
|
||||||
|
// ========== 生成完成标记 ==========
|
||||||
|
const result = {
|
||||||
|
status: 'completed',
|
||||||
|
specs_loaded: specs.length,
|
||||||
|
timestamp: new Date().toISOString()
|
||||||
|
};
|
||||||
|
|
||||||
|
Write(\`\${workDir}/spec-study-complete.flag\`, JSON.stringify(result, null, 2));
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
## Output
|
||||||
|
|
||||||
|
- **标记文件**: \`spec-study-complete.flag\` (证明已完成阅读)
|
||||||
|
- **副作用**: 内化规范知识,后续阶段遵循标准
|
||||||
|
|
||||||
|
## Success Criteria
|
||||||
|
|
||||||
|
✅ **通过标准**:
|
||||||
|
- [ ] 已阅读 SKILL-DESIGN-SPEC.md
|
||||||
|
- [ ] 已阅读执行模式对应的模板文件
|
||||||
|
- [ ] 理解目录结构约定
|
||||||
|
- [ ] 理解命名规范
|
||||||
|
- [ ] 理解质量标准
|
||||||
|
|
||||||
|
## Next Phase
|
||||||
|
|
||||||
|
→ [Phase 1: Requirements Discovery](01-requirements-discovery.md)
|
||||||
|
|
||||||
|
**关键**: 只有完成规范研读后,Phase 1 才能正确收集需求并生成符合标准的配置。
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== 其他辅助函数 ==========
|
||||||
function toPascalCase(str) {
|
function toPascalCase(str) {
|
||||||
return str.split('-').map(s => s.charAt(0).toUpperCase() + s.slice(1)).join('');
|
return str.split('-').map(s => s.charAt(0).toUpperCase() + s.slice(1)).join('');
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -6,375 +6,162 @@ allowed-tools: Task, AskUserQuestion, Read, Write, Bash, Glob, Grep, mcp__ace-to
|
|||||||
|
|
||||||
# Skill Tuning
|
# Skill Tuning
|
||||||
|
|
||||||
Universal skill diagnosis and optimization tool that identifies and resolves skill execution problems through iterative multi-agent analysis.
|
Autonomous diagnosis and optimization for skill execution issues.
|
||||||
|
|
||||||
## Architecture Overview
|
## Architecture
|
||||||
|
|
||||||
```
|
```
|
||||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────┐
|
||||||
│ Skill Tuning Architecture (Autonomous Mode + Gemini CLI) │
|
│ Phase 0: Read Specs (mandatory) │
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
│ → problem-taxonomy.md, tuning-strategies.md │
|
||||||
│ │
|
└─────────────────────────────────────────────────────┘
|
||||||
│ ⚠️ Phase 0: Specification → 阅读规范 + 理解目标 skill 结构 (强制前置) │
|
↓
|
||||||
│ Study │
|
┌─────────────────────────────────────────────────────┐
|
||||||
│ ↓ │
|
│ Orchestrator (state-driven) │
|
||||||
│ ┌───────────────────────────────────────────────────────────────────────┐ │
|
│ Read state → Select action → Execute → Update → ✓ │
|
||||||
│ │ Orchestrator (状态驱动决策) │ │
|
└─────────────────────────────────────────────────────┘
|
||||||
│ │ 读取诊断状态 → 选择下一步动作 → 执行 → 更新状态 → 循环直到完成 │ │
|
↓ ↓
|
||||||
│ └───────────────────────────────────────────────────────────────────────┘ │
|
┌──────────────────────┐ ┌──────────────────┐
|
||||||
│ │ │
|
│ Diagnosis Phase │ │ Gemini CLI │
|
||||||
│ ┌────────────┬───────────┼───────────┬────────────┬────────────┐ │
|
│ • Context │ │ Deep analysis │
|
||||||
│ ↓ ↓ ↓ ↓ ↓ ↓ │
|
│ • Memory │ │ (on-demand) │
|
||||||
│ ┌──────┐ ┌──────────┐ ┌─────────┐ ┌────────┐ ┌────────┐ ┌─────────┐ │
|
│ • DataFlow │ │ │
|
||||||
│ │ Init │→ │ Analyze │→ │Diagnose │ │Diagnose│ │Diagnose│ │ Gemini │ │
|
│ • Agent │ │ Complex issues │
|
||||||
│ │ │ │Requiremts│ │ Context │ │ Memory │ │DataFlow│ │Analysis │ │
|
│ • Docs │ │ Architecture │
|
||||||
│ └──────┘ └──────────┘ └─────────┘ └────────┘ └────────┘ └─────────┘ │
|
│ • Token Usage │ │ Performance │
|
||||||
│ │ │ │ │ │ │
|
└──────────────────────┘ └──────────────────┘
|
||||||
│ │ └───────────┴───────────┴────────────┘ │
|
↓
|
||||||
│ ↓ │
|
┌───────────────────┐
|
||||||
│ ┌───────────────────────────────────────────────────────────────────────┐ │
|
│ Fix & Verify │
|
||||||
│ │ Requirement Analysis (NEW) │ │
|
│ Apply → Re-test │
|
||||||
│ │ • Phase 1: 维度拆解 (Gemini CLI) - 单一描述 → 多个关注维度 │ │
|
└───────────────────┘
|
||||||
│ │ • Phase 2: Spec 匹配 - 每个维度 → taxonomy + strategy │ │
|
|
||||||
│ │ • Phase 3: 覆盖度评估 - 以"有修复策略"为满足标准 │ │
|
|
||||||
│ │ • Phase 4: 歧义检测 - 识别多义性描述,必要时请求澄清 │ │
|
|
||||||
│ └───────────────────────────────────────────────────────────────────────┘ │
|
|
||||||
│ ↓ │
|
|
||||||
│ ┌──────────────────┐ │
|
|
||||||
│ │ Apply Fixes + │ │
|
|
||||||
│ │ Verify Results │ │
|
|
||||||
│ └──────────────────┘ │
|
|
||||||
│ │
|
|
||||||
│ ┌───────────────────────────────────────────────────────────────────────┐ │
|
|
||||||
│ │ Gemini CLI Integration │ │
|
|
||||||
│ │ 根据用户需求动态调用 gemini cli 进行深度分析: │ │
|
|
||||||
│ │ • 需求维度拆解 (requirement decomposition) │ │
|
|
||||||
│ │ • 复杂问题分析 (prompt engineering, architecture review) │ │
|
|
||||||
│ │ • 代码模式识别 (pattern matching, anti-pattern detection) │ │
|
|
||||||
│ │ • 修复策略生成 (fix generation, refactoring suggestions) │ │
|
|
||||||
│ └───────────────────────────────────────────────────────────────────────┘ │
|
|
||||||
│ │
|
|
||||||
└─────────────────────────────────────────────────────────────────────────────┘
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Problem Domain
|
## Core Issues Detected
|
||||||
|
|
||||||
Based on comprehensive analysis, skill-tuning addresses **core skill issues** and **general optimization areas**:
|
| Priority | Problem | Root Cause | Fix Strategy |
|
||||||
|
|----------|---------|-----------|--------------|
|
||||||
### Core Skill Issues (自动检测)
|
| **P0** | Authoring Violation | Intermediate files, state bloat, file relay | eliminate_intermediate, minimize_state |
|
||||||
|
|
||||||
| Priority | Problem | Root Cause | Solution Strategy |
|
|
||||||
|----------|---------|------------|-------------------|
|
|
||||||
| **P0** | Authoring Principles Violation | 中间文件存储, State膨胀, 文件中转 | eliminate_intermediate_files, minimize_state, context_passing |
|
|
||||||
| **P1** | Data Flow Disruption | Scattered state, inconsistent formats | state_centralization, schema_enforcement |
|
| **P1** | Data Flow Disruption | Scattered state, inconsistent formats | state_centralization, schema_enforcement |
|
||||||
| **P2** | Agent Coordination | Fragile call chains, merge complexity | error_wrapping, result_validation |
|
| **P2** | Agent Coordination | Fragile chains, no error handling | error_wrapping, result_validation |
|
||||||
| **P3** | Context Explosion | Token accumulation, multi-turn bloat | sliding_window, context_summarization |
|
| **P3** | Context Explosion | Unbounded history, full content passing | sliding_window, path_reference |
|
||||||
| **P4** | Long-tail Forgetting | Early constraint loss | constraint_injection, checkpoint_restore |
|
| **P4** | Long-tail Forgetting | Early constraint loss | constraint_injection, checkpoint_restore |
|
||||||
| **P5** | Token Consumption | Verbose prompts, excessive state, redundant I/O | prompt_compression, lazy_loading, output_minimization |
|
| **P5** | Token Consumption | Verbose prompts, state bloat | prompt_compression, lazy_loading |
|
||||||
|
|
||||||
### General Optimization Areas (按需分析 via Gemini CLI)
|
## Problem Categories (Detailed Specs)
|
||||||
|
|
||||||
| Category | Issues | Gemini Analysis Scope |
|
See [specs/problem-taxonomy.md](specs/problem-taxonomy.md) for:
|
||||||
|----------|--------|----------------------|
|
- Detection patterns (regex/checks)
|
||||||
| **Prompt Engineering** | 模糊指令, 输出格式不一致, 幻觉风险 | 提示词优化, 结构化输出设计 |
|
- Severity calculations
|
||||||
| **Architecture** | 阶段划分不合理, 依赖混乱, 扩展性差 | 架构审查, 模块化建议 |
|
- Impact assessments
|
||||||
| **Performance** | 执行慢, Token消耗高, 重复计算 | 性能分析, 缓存策略 |
|
|
||||||
| **Error Handling** | 错误恢复不当, 无降级策略, 日志不足 | 容错设计, 可观测性增强 |
|
|
||||||
| **Output Quality** | 输出不稳定, 格式漂移, 质量波动 | 质量门控, 验证机制 |
|
|
||||||
| **User Experience** | 交互不流畅, 反馈不清晰, 进度不可见 | UX优化, 进度追踪 |
|
|
||||||
|
|
||||||
## Key Design Principles
|
## Tuning Strategies (Detailed Specs)
|
||||||
|
|
||||||
1. **Problem-First Diagnosis**: Systematic identification before any fix attempt
|
See [specs/tuning-strategies.md](specs/tuning-strategies.md) for:
|
||||||
2. **Data-Driven Analysis**: Record execution traces, token counts, state snapshots
|
- 10+ strategies per category
|
||||||
3. **Iterative Refinement**: Multiple tuning rounds until quality gates pass
|
- Implementation patterns
|
||||||
4. **Non-Destructive**: All changes are reversible with backup checkpoints
|
- Verification methods
|
||||||
5. **Agent Coordination**: Use specialized sub-agents for each diagnosis type
|
|
||||||
6. **Gemini CLI On-Demand**: Deep analysis via CLI for complex/custom issues
|
|
||||||
|
|
||||||
---
|
## Workflow
|
||||||
|
|
||||||
## Gemini CLI Integration
|
| Step | Action | Orchestrator Decision | Output |
|
||||||
|
|------|--------|----------------------|--------|
|
||||||
|
| 1 | `action-init` | status='pending' | Backup, session created |
|
||||||
|
| 2 | `action-analyze-requirements` | After init | Required dimensions + coverage |
|
||||||
|
| 3 | Diagnosis (6 types) | Focus areas | state.diagnosis.{type} |
|
||||||
|
| 4 | `action-gemini-analysis` | Critical issues OR user request | Deep findings |
|
||||||
|
| 5 | `action-generate-report` | All diagnosis complete | state.final_report |
|
||||||
|
| 6 | `action-propose-fixes` | Issues found | state.proposed_fixes[] |
|
||||||
|
| 7 | `action-apply-fix` | Pending fixes | Applied + verified |
|
||||||
|
| 8 | `action-complete` | Quality gates pass | session.status='completed' |
|
||||||
|
|
||||||
根据用户需求动态调用 Gemini CLI 进行深度分析。
|
## Action Reference
|
||||||
|
|
||||||
### Trigger Conditions
|
| Category | Actions | Purpose |
|
||||||
|
|----------|---------|---------|
|
||||||
|
| **Setup** | action-init | Initialize backup, session state |
|
||||||
|
| **Analysis** | action-analyze-requirements | Decompose user request via Gemini CLI |
|
||||||
|
| **Diagnosis** | action-diagnose-{context,memory,dataflow,agent,docs,token_consumption} | Detect category-specific issues |
|
||||||
|
| **Deep Analysis** | action-gemini-analysis | Gemini CLI: complex/critical issues |
|
||||||
|
| **Reporting** | action-generate-report | Consolidate findings → final_report |
|
||||||
|
| **Fixing** | action-propose-fixes, action-apply-fix | Generate + apply fixes |
|
||||||
|
| **Verify** | action-verify | Re-run diagnosis, check gates |
|
||||||
|
| **Exit** | action-complete, action-abort | Finalize or rollback |
|
||||||
|
|
||||||
| Condition | Action | CLI Mode |
|
Full action details: [phases/actions/](phases/actions/)
|
||||||
|-----------|--------|----------|
|
|
||||||
| 用户描述复杂问题 | 调用 Gemini 分析问题根因 | `analysis` |
|
|
||||||
| 自动诊断发现 critical 问题 | 请求深度分析确认 | `analysis` |
|
|
||||||
| 用户请求架构审查 | 执行架构分析 | `analysis` |
|
|
||||||
| 需要生成修复代码 | 生成修复提案 | `write` |
|
|
||||||
| 标准策略不适用 | 请求定制化策略 | `analysis` |
|
|
||||||
|
|
||||||
### CLI Command Template
|
## State Management
|
||||||
|
|
||||||
|
**Single source of truth**: `.workflow/.scratchpad/skill-tuning-{ts}/state.json`
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"status": "pending|running|completed|failed",
|
||||||
|
"target_skill": { "name": "...", "path": "..." },
|
||||||
|
"diagnosis": {
|
||||||
|
"context": {...},
|
||||||
|
"memory": {...},
|
||||||
|
"dataflow": {...},
|
||||||
|
"agent": {...},
|
||||||
|
"docs": {...},
|
||||||
|
"token_consumption": {...}
|
||||||
|
},
|
||||||
|
"issues": [{"id":"...", "severity":"...", "category":"...", "strategy":"..."}],
|
||||||
|
"proposed_fixes": [...],
|
||||||
|
"applied_fixes": [...],
|
||||||
|
"quality_gate": "pass|fail",
|
||||||
|
"final_report": "..."
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
See [phases/state-schema.md](phases/state-schema.md) for complete schema.
|
||||||
|
|
||||||
|
## Orchestrator Logic
|
||||||
|
|
||||||
|
See [phases/orchestrator.md](phases/orchestrator.md) for:
|
||||||
|
- Decision logic (termination checks → action selection)
|
||||||
|
- State transitions
|
||||||
|
- Error recovery
|
||||||
|
|
||||||
|
## Key Principles
|
||||||
|
|
||||||
|
1. **Problem-First**: Diagnosis before any fix
|
||||||
|
2. **Data-Driven**: Record traces, token counts, snapshots
|
||||||
|
3. **Iterative**: Multiple rounds until quality gates pass
|
||||||
|
4. **Reversible**: All changes with backup checkpoints
|
||||||
|
5. **Non-Invasive**: Minimal changes, maximum clarity
|
||||||
|
|
||||||
|
## Usage Examples
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ccw cli -p "
|
# Basic skill diagnosis
|
||||||
PURPOSE: ${purpose}
|
/skill-tuning "Fix memory leaks in my skill"
|
||||||
TASK: ${task_steps}
|
|
||||||
MODE: ${mode}
|
# Deep analysis with Gemini
|
||||||
CONTEXT: @${skill_path}/**/*
|
/skill-tuning "Architecture issues in async workflow"
|
||||||
EXPECTED: ${expected_output}
|
|
||||||
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/${mode}-protocol.md) | ${constraints}
|
# Focus on specific areas
|
||||||
" --tool gemini --mode ${mode} --cd ${skill_path}
|
/skill-tuning "Optimize token consumption and fix agent coordination"
|
||||||
|
|
||||||
|
# Custom issue
|
||||||
|
/skill-tuning "My skill produces inconsistent outputs"
|
||||||
```
|
```
|
||||||
|
|
||||||
### Analysis Types
|
## Output
|
||||||
|
|
||||||
#### 1. Problem Root Cause Analysis
|
After completion, review:
|
||||||
|
- `.workflow/.scratchpad/skill-tuning-{ts}/state.json` - Full state with final_report
|
||||||
```bash
|
- `state.final_report` - Markdown summary (in state.json)
|
||||||
ccw cli -p "
|
- `state.applied_fixes` - List of applied fixes with verification results
|
||||||
PURPOSE: Identify root cause of skill execution issue: ${user_issue_description}
|
|
||||||
TASK: • Analyze skill structure and phase flow • Identify anti-patterns • Trace data flow issues
|
|
||||||
MODE: analysis
|
|
||||||
CONTEXT: @**/*.md
|
|
||||||
EXPECTED: JSON with { root_causes: [], patterns_found: [], recommendations: [] }
|
|
||||||
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Focus on execution flow
|
|
||||||
" --tool gemini --mode analysis
|
|
||||||
```
|
|
||||||
|
|
||||||
#### 2. Architecture Review
|
|
||||||
|
|
||||||
```bash
|
|
||||||
ccw cli -p "
|
|
||||||
PURPOSE: Review skill architecture for scalability and maintainability
|
|
||||||
TASK: • Evaluate phase decomposition • Check state management patterns • Assess agent coordination
|
|
||||||
MODE: analysis
|
|
||||||
CONTEXT: @**/*.md
|
|
||||||
EXPECTED: Architecture assessment with improvement recommendations
|
|
||||||
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Focus on modularity
|
|
||||||
" --tool gemini --mode analysis
|
|
||||||
```
|
|
||||||
|
|
||||||
#### 3. Fix Strategy Generation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
ccw cli -p "
|
|
||||||
PURPOSE: Generate fix strategy for issue: ${issue_id} - ${issue_description}
|
|
||||||
TASK: • Analyze issue context • Design fix approach • Generate implementation plan
|
|
||||||
MODE: analysis
|
|
||||||
CONTEXT: @**/*.md
|
|
||||||
EXPECTED: JSON with { strategy: string, changes: [], verification_steps: [] }
|
|
||||||
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) | Minimal invasive changes
|
|
||||||
" --tool gemini --mode analysis
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Mandatory Prerequisites
|
|
||||||
|
|
||||||
> **CRITICAL**: Read these documents before executing any action.
|
|
||||||
|
|
||||||
### Core Specs (Required)
|
|
||||||
|
|
||||||
| Document | Purpose | Priority |
|
|
||||||
|----------|---------|----------|
|
|
||||||
| [specs/skill-authoring-principles.md](specs/skill-authoring-principles.md) | **首要准则:简洁高效、去除存储、上下文流转** | **P0** |
|
|
||||||
| [specs/problem-taxonomy.md](specs/problem-taxonomy.md) | Problem classification and detection patterns | **P0** |
|
|
||||||
| [specs/tuning-strategies.md](specs/tuning-strategies.md) | Fix strategies for each problem type | **P0** |
|
|
||||||
| [specs/dimension-mapping.md](specs/dimension-mapping.md) | Dimension to Spec mapping rules | **P0** |
|
|
||||||
| [specs/quality-gates.md](specs/quality-gates.md) | Quality thresholds and verification criteria | P1 |
|
|
||||||
|
|
||||||
### Templates (Reference)
|
|
||||||
|
|
||||||
| Document | Purpose |
|
|
||||||
|----------|---------|
|
|
||||||
| [templates/diagnosis-report.md](templates/diagnosis-report.md) | Diagnosis report structure |
|
|
||||||
| [templates/fix-proposal.md](templates/fix-proposal.md) | Fix proposal format |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Execution Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
||||||
│ Phase 0: Specification Study (强制前置 - 禁止跳过) │
|
|
||||||
│ → Read: specs/problem-taxonomy.md (问题分类) │
|
|
||||||
│ → Read: specs/tuning-strategies.md (调优策略) │
|
|
||||||
│ → Read: specs/dimension-mapping.md (维度映射规则) │
|
|
||||||
│ → Read: Target skill's SKILL.md and phases/*.md │
|
|
||||||
│ → Output: 内化规范,理解目标 skill 结构 │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-init: Initialize Tuning Session │
|
|
||||||
│ → Create work directory: .workflow/.scratchpad/skill-tuning-{timestamp} │
|
|
||||||
│ → Initialize state.json with target skill info │
|
|
||||||
│ → Create backup of target skill files │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-analyze-requirements: Requirement Analysis │
|
|
||||||
│ → Phase 1: 维度拆解 (Gemini CLI) - 单一描述 → 多个关注维度 │
|
|
||||||
│ → Phase 2: Spec 匹配 - 每个维度 → taxonomy + strategy │
|
|
||||||
│ → Phase 3: 覆盖度评估 - 以"有修复策略"为满足标准 │
|
|
||||||
│ → Phase 4: 歧义检测 - 识别多义性描述,必要时请求澄清 │
|
|
||||||
│ → Output: state.json (requirement_analysis field) │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-diagnose-*: Diagnosis Actions (context/memory/dataflow/agent/docs/ │
|
|
||||||
│ token_consumption) │
|
|
||||||
│ → Execute pattern-based detection for each category │
|
|
||||||
│ → Output: state.json (diagnosis.{category} field) │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-generate-report: Consolidated Report │
|
|
||||||
│ → Generate markdown summary from state.diagnosis │
|
|
||||||
│ → Prioritize issues by severity │
|
|
||||||
│ → Output: state.json (final_report field) │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-propose-fixes: Fix Proposal Generation │
|
|
||||||
│ → Generate fix strategies for each issue │
|
|
||||||
│ → Create implementation plan │
|
|
||||||
│ → Output: state.json (proposed_fixes field) │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-apply-fix: Apply Selected Fix │
|
|
||||||
│ → User selects fix to apply │
|
|
||||||
│ → Execute fix with backup │
|
|
||||||
│ → Update state with fix result │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-verify: Verification │
|
|
||||||
│ → Re-run affected diagnosis │
|
|
||||||
│ → Check quality gates │
|
|
||||||
│ → Update iteration count │
|
|
||||||
├─────────────────────────────────────────────────────────────────────────────┤
|
|
||||||
│ action-complete: Finalization │
|
|
||||||
│ → Set status='completed' │
|
|
||||||
│ → Final report already in state.json (final_report field) │
|
|
||||||
│ → Output: state.json (final) │
|
|
||||||
└─────────────────────────────────────────────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
## Directory Setup
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const timestamp = new Date().toISOString().slice(0,19).replace(/[-:T]/g, '');
|
|
||||||
const workDir = `.workflow/.scratchpad/skill-tuning-${timestamp}`;
|
|
||||||
|
|
||||||
// Simplified: Only backups dir needed, diagnosis results go into state.json
|
|
||||||
Bash(`mkdir -p "${workDir}/backups"`);
|
|
||||||
```
|
|
||||||
|
|
||||||
## Output Structure
|
|
||||||
|
|
||||||
```
|
|
||||||
.workflow/.scratchpad/skill-tuning-{timestamp}/
|
|
||||||
├── state.json # Single source of truth (all results consolidated)
|
|
||||||
│ ├── diagnosis.* # All diagnosis results embedded
|
|
||||||
│ ├── issues[] # Found issues
|
|
||||||
│ ├── proposed_fixes[] # Fix proposals
|
|
||||||
│ └── final_report # Markdown summary (on completion)
|
|
||||||
└── backups/
|
|
||||||
└── {skill-name}-backup/ # Original skill files backup
|
|
||||||
```
|
|
||||||
|
|
||||||
> **Token Optimization**: All outputs consolidated into state.json. No separate diagnosis files or report files.
|
|
||||||
|
|
||||||
## State Schema
|
|
||||||
|
|
||||||
详细状态结构定义请参阅 [phases/state-schema.md](phases/state-schema.md)。
|
|
||||||
|
|
||||||
核心状态字段:
|
|
||||||
- `status`: 工作流状态 (pending/running/completed/failed)
|
|
||||||
- `target_skill`: 目标 skill 信息
|
|
||||||
- `diagnosis`: 各维度诊断结果
|
|
||||||
- `issues`: 发现的问题列表
|
|
||||||
- `proposed_fixes`: 建议的修复方案
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Action Reference Guide
|
|
||||||
|
|
||||||
Navigation and entry points for each action in the autonomous workflow:
|
|
||||||
|
|
||||||
### Core Orchestration
|
|
||||||
|
|
||||||
**Document**: 🔗 [phases/orchestrator.md](phases/orchestrator.md)
|
|
||||||
|
|
||||||
| Attribute | Value |
|
|
||||||
|-----------|-------|
|
|
||||||
| **Purpose** | Drive tuning workflow via state-driven action selection |
|
|
||||||
| **Decision Logic** | Termination checks → Action preconditions → Selection |
|
|
||||||
| **Related** | [phases/state-schema.md](phases/state-schema.md) |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Initialization & Requirements
|
|
||||||
|
|
||||||
| Action | Document | Purpose | Preconditions |
|
|
||||||
|--------|----------|---------|---------------|
|
|
||||||
| **action-init** | [action-init.md](phases/actions/action-init.md) | Initialize session, backup target skill | `state.status === 'pending'` |
|
|
||||||
| **action-analyze-requirements** | [action-analyze-requirements.md](phases/actions/action-analyze-requirements.md) | Decompose user request into dimensions via Gemini CLI | After init, before diagnosis |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Diagnosis Actions
|
|
||||||
|
|
||||||
| Action | Document | Purpose | Detects |
|
|
||||||
|--------|----------|---------|---------|
|
|
||||||
| **action-diagnose-context** | [action-diagnose-context.md](phases/actions/action-diagnose-context.md) | Context explosion analysis | Token accumulation, multi-turn bloat |
|
|
||||||
| **action-diagnose-memory** | [action-diagnose-memory.md](phases/actions/action-diagnose-memory.md) | Long-tail forgetting analysis | Early constraint loss |
|
|
||||||
| **action-diagnose-dataflow** | [action-diagnose-dataflow.md](phases/actions/action-diagnose-dataflow.md) | Data flow analysis | State inconsistency, format drift |
|
|
||||||
| **action-diagnose-agent** | [action-diagnose-agent.md](phases/actions/action-diagnose-agent.md) | Agent coordination analysis | Call chain failures, merge issues |
|
|
||||||
| **action-diagnose-docs** | [action-diagnose-docs.md](phases/actions/action-diagnose-docs.md) | Documentation structure analysis | Missing specs, unclear flow |
|
|
||||||
| **action-diagnose-token-consumption** | [action-diagnose-token-consumption.md](phases/actions/action-diagnose-token-consumption.md) | Token consumption analysis | Verbose prompts, redundant I/O |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Analysis & Reporting
|
|
||||||
|
|
||||||
| Action | Document | Purpose | Output |
|
|
||||||
|--------|----------|---------|--------|
|
|
||||||
| **action-gemini-analysis** | [action-gemini-analysis.md](phases/actions/action-gemini-analysis.md) | Deep analysis via Gemini CLI | Custom issue diagnosis |
|
|
||||||
| **action-generate-report** | [action-generate-report.md](phases/actions/action-generate-report.md) | Consolidate diagnosis results | `state.final_report` |
|
|
||||||
| **action-propose-fixes** | [action-propose-fixes.md](phases/actions/action-propose-fixes.md) | Generate fix strategies | `state.proposed_fixes[]` |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Fix & Verification
|
|
||||||
|
|
||||||
| Action | Document | Purpose | Preconditions |
|
|
||||||
|--------|----------|---------|---------------|
|
|
||||||
| **action-apply-fix** | [action-apply-fix.md](phases/actions/action-apply-fix.md) | Apply selected fix with backup | User selected fix |
|
|
||||||
| **action-verify** | [action-verify.md](phases/actions/action-verify.md) | Re-run diagnosis, check quality gates | After fix applied |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Termination
|
|
||||||
|
|
||||||
| Action | Document | Purpose | Trigger |
|
|
||||||
|--------|----------|---------|---------|
|
|
||||||
| **action-complete** | [action-complete.md](phases/actions/action-complete.md) | Finalize session with report | All quality gates pass |
|
|
||||||
| **action-abort** | [action-abort.md](phases/actions/action-abort.md) | Abort session, restore backup | Error limit exceeded |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Template Reference
|
|
||||||
|
|
||||||
| Template | Purpose | When Used |
|
|
||||||
|----------|---------|-----------|
|
|
||||||
| [templates/diagnosis-report.md](templates/diagnosis-report.md) | Diagnosis report structure | action-generate-report |
|
|
||||||
| [templates/fix-proposal.md](templates/fix-proposal.md) | Fix proposal format | action-propose-fixes |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Reference Documents
|
## Reference Documents
|
||||||
|
|
||||||
| Document | Purpose |
|
| Document | Purpose |
|
||||||
|----------|---------|
|
|----------|---------|
|
||||||
| [phases/orchestrator.md](phases/orchestrator.md) | Orchestrator decision logic |
|
| [specs/problem-taxonomy.md](specs/problem-taxonomy.md) | Classification + detection patterns |
|
||||||
|
| [specs/tuning-strategies.md](specs/tuning-strategies.md) | Fix implementation guide |
|
||||||
|
| [specs/dimension-mapping.md](specs/dimension-mapping.md) | Dimension ↔ Spec mapping |
|
||||||
|
| [specs/quality-gates.md](specs/quality-gates.md) | Quality verification criteria |
|
||||||
|
| [phases/orchestrator.md](phases/orchestrator.md) | Workflow orchestration |
|
||||||
| [phases/state-schema.md](phases/state-schema.md) | State structure definition |
|
| [phases/state-schema.md](phases/state-schema.md) | State structure definition |
|
||||||
| [phases/actions/action-init.md](phases/actions/action-init.md) | Initialize tuning session |
|
| [phases/actions/](phases/actions/) | Individual action implementations |
|
||||||
| [phases/actions/action-analyze-requirements.md](phases/actions/action-analyze-requirements.md) | Requirement analysis (NEW) |
|
|
||||||
| [phases/actions/action-diagnose-context.md](phases/actions/action-diagnose-context.md) | Context explosion diagnosis |
|
|
||||||
| [phases/actions/action-diagnose-memory.md](phases/actions/action-diagnose-memory.md) | Long-tail forgetting diagnosis |
|
|
||||||
| [phases/actions/action-diagnose-dataflow.md](phases/actions/action-diagnose-dataflow.md) | Data flow diagnosis |
|
|
||||||
| [phases/actions/action-diagnose-agent.md](phases/actions/action-diagnose-agent.md) | Agent coordination diagnosis |
|
|
||||||
| [phases/actions/action-diagnose-docs.md](phases/actions/action-diagnose-docs.md) | Documentation structure diagnosis |
|
|
||||||
| [phases/actions/action-diagnose-token-consumption.md](phases/actions/action-diagnose-token-consumption.md) | Token consumption diagnosis |
|
|
||||||
| [phases/actions/action-generate-report.md](phases/actions/action-generate-report.md) | Report generation |
|
|
||||||
| [phases/actions/action-propose-fixes.md](phases/actions/action-propose-fixes.md) | Fix proposal |
|
|
||||||
| [phases/actions/action-apply-fix.md](phases/actions/action-apply-fix.md) | Fix application |
|
|
||||||
| [phases/actions/action-verify.md](phases/actions/action-verify.md) | Verification |
|
|
||||||
| [phases/actions/action-complete.md](phases/actions/action-complete.md) | Finalization |
|
|
||||||
| [specs/problem-taxonomy.md](specs/problem-taxonomy.md) | Problem classification |
|
|
||||||
| [specs/tuning-strategies.md](specs/tuning-strategies.md) | Fix strategies |
|
|
||||||
| [specs/dimension-mapping.md](specs/dimension-mapping.md) | Dimension to Spec mapping (NEW) |
|
|
||||||
| [specs/quality-gates.md](specs/quality-gates.md) | Quality criteria |
|
|
||||||
|
|||||||
@@ -1,28 +1,57 @@
|
|||||||
# Orchestrator
|
# Orchestrator
|
||||||
|
|
||||||
Autonomous orchestrator for skill-tuning workflow. Reads current state and selects the next action based on diagnosis progress and quality gates.
|
State-driven orchestrator for autonomous skill-tuning workflow.
|
||||||
|
|
||||||
## Role
|
## Role
|
||||||
|
|
||||||
Drive the tuning workflow by:
|
Read state → Select action → Execute → Update → Repeat until termination.
|
||||||
1. Reading current session state
|
|
||||||
2. Selecting the appropriate next action
|
## Decision Logic
|
||||||
3. Executing the action via sub-agent
|
|
||||||
4. Updating state with results
|
### Termination Checks (priority order)
|
||||||
5. Repeating until termination conditions met
|
|
||||||
|
| Condition | Action |
|
||||||
|
|-----------|--------|
|
||||||
|
| `status === 'user_exit'` | null (exit) |
|
||||||
|
| `status === 'completed'` | null (exit) |
|
||||||
|
| `error_count >= max_errors` | action-abort |
|
||||||
|
| `iteration_count >= max_iterations` | action-complete |
|
||||||
|
| `quality_gate === 'pass'` | action-complete |
|
||||||
|
|
||||||
|
### Action Selection
|
||||||
|
|
||||||
|
| Priority | Condition | Action |
|
||||||
|
|----------|-----------|--------|
|
||||||
|
| 1 | `status === 'pending'` | action-init |
|
||||||
|
| 2 | Init done, req analysis missing | action-analyze-requirements |
|
||||||
|
| 3 | Req needs clarification | null (wait) |
|
||||||
|
| 4 | Req coverage unsatisfied | action-gemini-analysis |
|
||||||
|
| 5 | Gemini requested/critical issues | action-gemini-analysis |
|
||||||
|
| 6 | Gemini running | null (wait) |
|
||||||
|
| 7 | Diagnosis pending (in order) | action-diagnose-{type} |
|
||||||
|
| 8 | All diagnosis done, no report | action-generate-report |
|
||||||
|
| 9 | Report done, issues exist | action-propose-fixes |
|
||||||
|
| 10 | Pending fixes exist | action-apply-fix |
|
||||||
|
| 11 | Fixes need verification | action-verify |
|
||||||
|
| 12 | New iteration needed | action-diagnose-context (restart) |
|
||||||
|
| 13 | Default | action-complete |
|
||||||
|
|
||||||
|
**Diagnosis Order**: context → memory → dataflow → agent → docs → token_consumption
|
||||||
|
|
||||||
|
**Gemini Triggers**:
|
||||||
|
- `gemini_analysis_requested === true`
|
||||||
|
- Critical issues detected
|
||||||
|
- Focus areas include: architecture, prompt, performance, custom
|
||||||
|
- Second iteration with unresolved issues
|
||||||
|
|
||||||
## State Management
|
## State Management
|
||||||
|
|
||||||
### Read State
|
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
|
// Read
|
||||||
const state = JSON.parse(Read(`${workDir}/state.json`));
|
const state = JSON.parse(Read(`${workDir}/state.json`));
|
||||||
```
|
|
||||||
|
|
||||||
### Update State
|
// Update (with sliding window for history)
|
||||||
|
function updateState(workDir, updates) {
|
||||||
```javascript
|
|
||||||
function updateState(updates) {
|
|
||||||
const state = JSON.parse(Read(`${workDir}/state.json`));
|
const state = JSON.parse(Read(`${workDir}/state.json`));
|
||||||
const newState = {
|
const newState = {
|
||||||
...state,
|
...state,
|
||||||
@@ -34,344 +63,127 @@ function updateState(updates) {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## Decision Logic
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
/**
 * Pick the next workflow action from the current session state.
 * Returns an action id string, or null when the orchestrator should
 * stop (terminal status) or pause (waiting on the user / Gemini CLI).
 */
function selectNextAction(state) {
  // --- Termination checks (highest priority) ---
  if (state.status === 'user_exit' || state.status === 'completed') {
    return null;
  }
  if (state.error_count >= state.max_errors) {
    return 'action-abort';
  }
  if (state.iteration_count >= state.max_iterations) {
    return 'action-complete';
  }

  // --- Action selection ---

  // 1. Session not initialized yet.
  if (state.status === 'pending') {
    return 'action-init';
  }

  const doneActions = state.completed_actions;

  // 1.5. Requirement analysis runs right after init, before any diagnosis.
  if (state.status === 'running' &&
      doneActions.includes('action-init') &&
      !doneActions.includes('action-analyze-requirements')) {
    return 'action-analyze-requirements';
  }

  // 1.6. Ambiguous requirements: pause until the user clarifies.
  if (state.requirement_analysis?.status === 'needs_clarification') {
    return null;
  }

  // 1.7. Insufficient requirement coverage: escalate to Gemini deep analysis.
  if (state.requirement_analysis?.coverage?.status === 'unsatisfied' &&
      !doneActions.includes('action-gemini-analysis')) {
    return 'action-gemini-analysis';
  }

  // 2. Gemini analysis explicitly requested or otherwise warranted.
  if (shouldTriggerGeminiAnalysis(state)) {
    return 'action-gemini-analysis';
  }

  // 3. Gemini analysis in flight: wait (orchestrator re-triggers when CLI completes).
  if (state.gemini_analysis?.status === 'running') {
    return null;
  }

  // 4. Run outstanding diagnoses in canonical order.
  const diagnosisOrder = ['context', 'memory', 'dataflow', 'agent', 'docs', 'token_consumption'];
  for (const diagType of diagnosisOrder) {
    if (state.diagnosis[diagType] !== null) continue;
    // Empty focus_areas means "diagnose everything".
    if (state.focus_areas.length === 0 || state.focus_areas.includes(diagType)) {
      return `action-diagnose-${diagType}`;
    }
    // docs diagnosis is also covered by the 'all' focus area.
    if (diagType === 'docs' && state.focus_areas.includes('all')) {
      return 'action-diagnose-docs';
    }
  }

  // 5. Every in-scope diagnosis finished: produce the consolidated report.
  const allDiagnosisComplete = diagnosisOrder.every(
    (d) => state.diagnosis[d] !== null || !state.focus_areas.includes(d)
  );
  if (allDiagnosisComplete && !doneActions.includes('action-generate-report')) {
    return 'action-generate-report';
  }

  // 6. Report exists and issues were found, but no fixes proposed yet.
  if (doneActions.includes('action-generate-report') &&
      state.proposed_fixes.length === 0 &&
      state.issues.length > 0) {
    return 'action-propose-fixes';
  }

  // 7. Proposed fixes queued for application.
  if (state.proposed_fixes.length > 0 && state.pending_fixes.length > 0) {
    return 'action-apply-fix';
  }

  // 8. Applied fixes still awaiting verification.
  if (state.applied_fixes.length > 0 &&
      state.applied_fixes.some((f) => f.verification_result === 'pending')) {
    return 'action-verify';
  }

  // 9. Quality gate satisfied: finish.
  if (state.quality_gate === 'pass') {
    return 'action-complete';
  }

  // 10. Serious issues remain and the iteration budget allows another pass:
  //     restart the diagnosis cycle from the top.
  if (state.iteration_count < state.max_iterations &&
      state.quality_gate !== 'pass' &&
      state.issues.some((i) => i.severity === 'critical' || i.severity === 'high')) {
    return 'action-diagnose-context';
  }

  // 11. Nothing left to do.
  return 'action-complete';
}

/**
 * Decide whether a Gemini CLI deep-analysis pass should run.
 * Triggers on: explicit user request, unanalyzed critical issues,
 * Gemini-specific focus areas, or a later iteration with unresolved issues.
 * Never re-triggers once a Gemini analysis has completed.
 */
function shouldTriggerGeminiAnalysis(state) {
  // A completed Gemini analysis is never repeated.
  if (state.gemini_analysis?.status === 'completed') {
    return false;
  }

  // Explicit user request.
  if (state.gemini_analysis_requested === true) {
    return true;
  }

  const alreadyRan = state.completed_actions.includes('action-gemini-analysis');

  // Critical issues found and no deep analysis performed yet.
  if (!alreadyRan && state.issues.some((i) => i.severity === 'critical')) {
    return true;
  }

  // Focus areas that require Gemini-level analysis.
  const geminiAreas = new Set(['architecture', 'prompt', 'performance', 'custom']);
  if (state.focus_areas.some((area) => geminiAreas.has(area))) {
    return true;
  }

  // Standard diagnosis finished but issues persist into a later iteration.
  const standardDiagnosisDone = ['context', 'memory', 'dataflow', 'agent', 'docs']
    .every((d) => state.diagnosis[d] !== null);
  if (standardDiagnosisDone &&
      state.issues.length > 0 &&
      state.iteration_count > 0 &&
      !alreadyRan) {
    return true;
  }

  return false;
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Execution Loop
|
## Execution Loop
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
async function runOrchestrator(workDir) {
|
async function runOrchestrator(workDir) {
|
||||||
console.log('=== Skill Tuning Orchestrator Started ===');
|
|
||||||
|
|
||||||
let iteration = 0;
|
let iteration = 0;
|
||||||
const MAX_LOOP_ITERATIONS = 50; // Safety limit
|
const MAX_LOOP = 50;
|
||||||
|
|
||||||
while (iteration < MAX_LOOP_ITERATIONS) {
|
while (iteration++ < MAX_LOOP) {
|
||||||
iteration++;
|
// 1. Read state
|
||||||
|
|
||||||
// 1. Read current state
|
|
||||||
const state = JSON.parse(Read(`${workDir}/state.json`));
|
const state = JSON.parse(Read(`${workDir}/state.json`));
|
||||||
console.log(`[Loop ${iteration}] Status: ${state.status}, Action: ${state.current_action}`);
|
|
||||||
|
|
||||||
// 2. Select next action
|
// 2. Select action
|
||||||
const actionId = selectNextAction(state);
|
const actionId = selectNextAction(state);
|
||||||
|
if (!actionId) break;
|
||||||
|
|
||||||
if (!actionId) {
|
// 3. Update: mark current action (sliding window)
|
||||||
console.log('No action selected, terminating orchestrator.');
|
updateState(workDir, {
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
console.log(`[Loop ${iteration}] Executing: ${actionId}`);
|
|
||||||
|
|
||||||
// 3. Update state: current action
|
|
||||||
// FIX CTX-001: sliding window for action_history (keep last 10)
|
|
||||||
updateState({
|
|
||||||
current_action: actionId,
|
current_action: actionId,
|
||||||
action_history: [...state.action_history, {
|
action_history: [...state.action_history, {
|
||||||
action: actionId,
|
action: actionId,
|
||||||
started_at: new Date().toISOString(),
|
started_at: new Date().toISOString()
|
||||||
completed_at: null,
|
}].slice(-10) // Keep last 10
|
||||||
result: null,
|
|
||||||
output_files: []
|
|
||||||
}].slice(-10) // Sliding window: prevent unbounded growth
|
|
||||||
});
|
});
|
||||||
|
|
||||||
// 4. Execute action
|
// 4. Execute action
|
||||||
try {
|
try {
|
||||||
const actionPrompt = Read(`phases/actions/${actionId}.md`);
|
const actionPrompt = Read(`phases/actions/${actionId}.md`);
|
||||||
// FIX CTX-003: Pass state path + key fields only instead of full state
|
|
||||||
|
// Pass state path + key fields (not full state)
|
||||||
const stateKeyInfo = {
|
const stateKeyInfo = {
|
||||||
status: state.status,
|
status: state.status,
|
||||||
iteration_count: state.iteration_count,
|
iteration_count: state.iteration_count,
|
||||||
issues_by_severity: state.issues_by_severity,
|
|
||||||
quality_gate: state.quality_gate,
|
quality_gate: state.quality_gate,
|
||||||
current_action: state.current_action,
|
|
||||||
completed_actions: state.completed_actions,
|
|
||||||
user_issue_description: state.user_issue_description,
|
|
||||||
target_skill: { name: state.target_skill.name, path: state.target_skill.path }
|
target_skill: { name: state.target_skill.name, path: state.target_skill.path }
|
||||||
};
|
};
|
||||||
const stateKeyJson = JSON.stringify(stateKeyInfo, null, 2);
|
|
||||||
|
|
||||||
const result = await Task({
|
const result = await Task({
|
||||||
subagent_type: 'universal-executor',
|
subagent_type: 'universal-executor',
|
||||||
run_in_background: false,
|
run_in_background: false,
|
||||||
prompt: `
|
prompt: `
|
||||||
[CONTEXT]
|
[CONTEXT]
|
||||||
You are executing action "${actionId}" for skill-tuning workflow.
|
Action: ${actionId}
|
||||||
Work directory: ${workDir}
|
Work directory: ${workDir}
|
||||||
|
|
||||||
[STATE KEY INFO]
|
[STATE KEY INFO]
|
||||||
${stateKeyJson}
|
${JSON.stringify(stateKeyInfo, null, 2)}
|
||||||
|
|
||||||
[FULL STATE PATH]
|
[FULL STATE PATH]
|
||||||
${workDir}/state.json
|
${workDir}/state.json
|
||||||
(Read full state from this file if you need additional fields)
|
(Read full state from this file if needed)
|
||||||
|
|
||||||
[ACTION INSTRUCTIONS]
|
[ACTION INSTRUCTIONS]
|
||||||
${actionPrompt}
|
${actionPrompt}
|
||||||
|
|
||||||
[OUTPUT REQUIREMENT]
|
[OUTPUT]
|
||||||
After completing the action:
|
Return JSON: { stateUpdates: {}, outputFiles: [], summary: "..." }
|
||||||
1. Write any output files to the work directory
|
|
||||||
2. Return a JSON object with:
|
|
||||||
- stateUpdates: object with state fields to update
|
|
||||||
- outputFiles: array of files created
|
|
||||||
- summary: brief description of what was done
|
|
||||||
`
|
`
|
||||||
});
|
});
|
||||||
|
|
||||||
// 5. Parse result and update state
|
// 5. Parse result
|
||||||
let actionResult;
|
let actionResult = result;
|
||||||
try {
|
try { actionResult = JSON.parse(result); } catch {}
|
||||||
actionResult = JSON.parse(result);
|
|
||||||
} catch (e) {
|
|
||||||
actionResult = {
|
|
||||||
stateUpdates: {},
|
|
||||||
outputFiles: [],
|
|
||||||
summary: result
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
// 6. Update state: action complete
|
// 6. Update: mark complete
|
||||||
const updatedHistory = [...state.action_history];
|
updateState(workDir, {
|
||||||
updatedHistory[updatedHistory.length - 1] = {
|
|
||||||
...updatedHistory[updatedHistory.length - 1],
|
|
||||||
completed_at: new Date().toISOString(),
|
|
||||||
result: 'success',
|
|
||||||
output_files: actionResult.outputFiles || []
|
|
||||||
};
|
|
||||||
|
|
||||||
updateState({
|
|
||||||
current_action: null,
|
current_action: null,
|
||||||
completed_actions: [...state.completed_actions, actionId],
|
completed_actions: [...state.completed_actions, actionId],
|
||||||
action_history: updatedHistory,
|
|
||||||
...actionResult.stateUpdates
|
...actionResult.stateUpdates
|
||||||
});
|
});
|
||||||
|
|
||||||
console.log(`[Loop ${iteration}] Completed: ${actionId}`);
|
|
||||||
|
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
console.log(`[Loop ${iteration}] Error in ${actionId}: ${error.message}`);
|
// Error handling (sliding window for errors)
|
||||||
|
updateState(workDir, {
|
||||||
// Error handling
|
|
||||||
// FIX CTX-002: sliding window for errors (keep last 5)
|
|
||||||
updateState({
|
|
||||||
current_action: null,
|
current_action: null,
|
||||||
errors: [...state.errors, {
|
errors: [...state.errors, {
|
||||||
action: actionId,
|
action: actionId,
|
||||||
message: error.message,
|
message: error.message,
|
||||||
timestamp: new Date().toISOString(),
|
timestamp: new Date().toISOString()
|
||||||
recoverable: true
|
}].slice(-5), // Keep last 5
|
||||||
}].slice(-5), // Sliding window: prevent unbounded growth
|
|
||||||
error_count: state.error_count + 1
|
error_count: state.error_count + 1
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
console.log('=== Skill Tuning Orchestrator Finished ===');
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## Action Catalog
|
## Action Preconditions
|
||||||
|
|
||||||
| Action | Purpose | Preconditions | Effects |
|
| Action | Precondition |
|
||||||
|--------|---------|---------------|---------|
|
|--------|-------------|
|
||||||
| [action-init](actions/action-init.md) | Initialize tuning session | status === 'pending' | Creates work dirs, backup, sets status='running' |
|
| action-init | status='pending' |
|
||||||
| [action-analyze-requirements](actions/action-analyze-requirements.md) | Analyze user requirements | init completed | Sets requirement_analysis, optimizes focus_areas |
|
| action-analyze-requirements | Init complete, not done |
|
||||||
| [action-diagnose-context](actions/action-diagnose-context.md) | Analyze context explosion | status === 'running' | Sets diagnosis.context |
|
| action-diagnose-* | status='running', focus area includes type |
|
||||||
| [action-diagnose-memory](actions/action-diagnose-memory.md) | Analyze long-tail forgetting | status === 'running' | Sets diagnosis.memory |
|
| action-gemini-analysis | Requested OR critical issues OR high complexity |
|
||||||
| [action-diagnose-dataflow](actions/action-diagnose-dataflow.md) | Analyze data flow issues | status === 'running' | Sets diagnosis.dataflow |
|
| action-generate-report | All diagnosis complete |
|
||||||
| [action-diagnose-agent](actions/action-diagnose-agent.md) | Analyze agent coordination | status === 'running' | Sets diagnosis.agent |
|
| action-propose-fixes | Report generated, issues > 0 |
|
||||||
| [action-diagnose-docs](actions/action-diagnose-docs.md) | Analyze documentation structure | status === 'running', focus includes 'docs' | Sets diagnosis.docs |
|
| action-apply-fix | pending_fixes > 0 |
|
||||||
| [action-gemini-analysis](actions/action-gemini-analysis.md) | Deep analysis via Gemini CLI | User request OR critical issues | Sets gemini_analysis, adds issues |
|
| action-verify | applied_fixes with pending verification |
|
||||||
| [action-generate-report](actions/action-generate-report.md) | Generate consolidated report | All diagnoses complete | Creates tuning-report.md |
|
| action-complete | Quality gates pass OR max iterations |
|
||||||
| [action-propose-fixes](actions/action-propose-fixes.md) | Generate fix proposals | Report generated, issues > 0 | Sets proposed_fixes |
|
| action-abort | error_count >= max_errors |
|
||||||
| [action-apply-fix](actions/action-apply-fix.md) | Apply selected fix | pending_fixes > 0 | Updates applied_fixes |
|
|
||||||
| [action-verify](actions/action-verify.md) | Verify applied fixes | applied_fixes with pending verification | Updates verification_result |
|
|
||||||
| [action-complete](actions/action-complete.md) | Finalize session | quality_gate='pass' OR max_iterations | Sets status='completed' |
|
|
||||||
| [action-abort](actions/action-abort.md) | Abort on errors | error_count >= max_errors | Sets status='failed' |
|
|
||||||
|
|
||||||
## Termination Conditions
|
## User Interaction Points
|
||||||
|
|
||||||
- `status === 'completed'`: Normal completion
|
1. **action-init**: Confirm target skill, describe issue
|
||||||
- `status === 'user_exit'`: User requested exit
|
2. **action-propose-fixes**: Select which fixes to apply
|
||||||
- `status === 'failed'`: Unrecoverable error
|
3. **action-verify**: Review verification, decide to continue or stop
|
||||||
- `requirement_analysis.status === 'needs_clarification'`: Waiting for user clarification (暂停,非终止)
|
4. **action-complete**: Review final summary
|
||||||
- `error_count >= max_errors`: Too many errors (default: 3)
|
|
||||||
- `iteration_count >= max_iterations`: Max iterations reached (default: 5)
|
|
||||||
- `quality_gate === 'pass'`: All quality criteria met
|
|
||||||
|
|
||||||
## Error Recovery
|
## Error Recovery
|
||||||
|
|
||||||
| Error Type | Recovery Strategy |
|
| Error Type | Strategy |
|
||||||
|------------|-------------------|
|
|------------|----------|
|
||||||
| Action execution failed | Retry up to 3 times, then skip |
|
| Action execution failed | Retry up to 3 times, then skip |
|
||||||
| State parse error | Restore from backup |
|
| State parse error | Restore from backup |
|
||||||
| File write error | Retry with alternative path |
|
| File write error | Retry with alternative path |
|
||||||
| User abort | Save state and exit gracefully |
|
| User abort | Save state and exit gracefully |
|
||||||
|
|
||||||
## User Interaction Points
|
## Termination Conditions
|
||||||
|
|
||||||
The orchestrator pauses for user input at these points:
|
- Normal: `status === 'completed'`, `quality_gate === 'pass'`
|
||||||
|
- User: `status === 'user_exit'`
|
||||||
1. **action-init**: Confirm target skill and describe issue
|
- Error: `status === 'failed'`, `error_count >= max_errors`
|
||||||
2. **action-propose-fixes**: Select which fixes to apply
|
- Iteration limit: `iteration_count >= max_iterations`
|
||||||
3. **action-verify**: Review verification results, decide to continue or stop
|
- Clarification wait: `requirement_analysis.status === 'needs_clarification'` (pause, not terminate)
|
||||||
4. **action-complete**: Review final summary
|
|
||||||
|
|||||||
@@ -2,276 +2,174 @@
|
|||||||
|
|
||||||
Classification of skill execution issues with detection patterns and severity criteria.
|
Classification of skill execution issues with detection patterns and severity criteria.
|
||||||
|
|
||||||
## When to Use
|
## Quick Reference
|
||||||
|
|
||||||
| Phase | Usage | Section |
|
| Category | Priority | Detection | Fix Strategy |
|
||||||
|-------|-------|---------|
|
|----------|----------|-----------|--------------|
|
||||||
| All Diagnosis Actions | Issue classification | All sections |
|
| Authoring Violation | P0 | Intermediate files, state bloat, file relay | eliminate_intermediate, minimize_state |
|
||||||
| action-propose-fixes | Strategy selection | Fix Mapping |
|
| Data Flow Disruption | P1 | Scattered state, inconsistent formats | state_centralization, schema_enforcement |
|
||||||
| action-generate-report | Severity assessment | Severity Criteria |
|
| Agent Coordination | P2 | Fragile chains, no error handling | error_wrapping, result_validation |
|
||||||
|
| Context Explosion | P3 | Unbounded history, full content passing | sliding_window, path_reference |
|
||||||
|
| Long-tail Forgetting | P4 | Early constraint loss | constraint_injection, checkpoint_restore |
|
||||||
|
| Token Consumption | P5 | Verbose prompts, redundant I/O | prompt_compression, lazy_loading |
|
||||||
|
| Doc Redundancy | P6 | Repeated definitions | consolidate_to_ssot |
|
||||||
|
| Doc Conflict | P7 | Inconsistent definitions | reconcile_definitions |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Problem Categories
|
## 0. Authoring Principles Violation (P0)
|
||||||
|
|
||||||
### 0. Authoring Principles Violation (P0)
|
**Definition**: Violates skill authoring principles (simplicity, no intermediate files, context passing).
|
||||||
|
|
||||||
**Definition**: 违反 skill 撰写首要准则(简洁高效、去除存储、上下文流转)。
|
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- 不必要的中间文件存储
|
|
||||||
- State schema 过度膨胀
|
|
||||||
- 文件中转代替上下文传递
|
|
||||||
- 重复数据存储
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| APV-001 | `/Write\([^)]*temp-|intermediate-/` | 中间文件写入 |
|
| APV-001 | `/Write\([^)]*temp-\|intermediate-/` | Intermediate file writes |
|
||||||
| APV-002 | `/Write\([^)]+\)[\s\S]{0,50}Read\([^)]+\)/` | 写后立即读(文件中转) |
|
| APV-002 | `/Write\([^)]+\)[\s\S]{0,50}Read\([^)]+\)/` | Write-then-read relay |
|
||||||
| APV-003 | State schema > 15 fields | State 字段过多 |
|
| APV-003 | State schema > 15 fields | Excessive state fields |
|
||||||
| APV-004 | `/_history\s*[.=].*push|concat/` | 无限增长数组 |
|
| APV-004 | `/_history\s*[.=].*push\|concat/` | Unbounded array growth |
|
||||||
| APV-005 | `/debug_|_cache|_temp/` in state | 调试/缓存字段残留 |
|
| APV-005 | `/debug_\|_cache\|_temp/` in state | Debug/cache field residue |
|
||||||
| APV-006 | Same data in multiple state fields | 重复存储 |
|
| APV-006 | Same data in multiple fields | Duplicate storage |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (>5 intermediate files), High (>20 state fields), Medium (debug fields), Low (naming issues)
|
||||||
- **Critical**: 中间文件 > 5 个,严重违反原则
|
|
||||||
- **High**: State 字段 > 20 个,或存在文件中转
|
|
||||||
- **Medium**: 存在调试字段或轻微冗余
|
|
||||||
- **Low**: 轻微的命名不规范
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 1. Context Explosion (P2)
|
## 1. Context Explosion (P3)
|
||||||
|
|
||||||
**Definition**: Excessive token accumulation causing prompt size to grow unbounded.
|
**Definition**: Unbounded token accumulation causing prompt size growth.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- Unbounded conversation history
|
|
||||||
- Full content passing instead of references
|
|
||||||
- Missing summarization mechanisms
|
|
||||||
- Agent returning full output instead of path+summary
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| CTX-001 | `/history\s*[.=].*push\|concat/` | History array growth |
|
| CTX-001 | `/history\s*[.=].*push\|concat/` | History array growth |
|
||||||
| CTX-002 | `/JSON\.stringify\s*\(\s*state\s*\)/` | Full state serialization |
|
| CTX-002 | `/JSON\.stringify\s*\(\s*state\s*\)/` | Full state serialization |
|
||||||
| CTX-003 | `/Read\([^)]+\)\s*[\+,]/` | Multiple file content concatenation |
|
| CTX-003 | `/Read\([^)]+\)\s*[\+,]/` | Multiple file content concatenation |
|
||||||
| CTX-004 | `/return\s*\{[^}]*content:/` | Agent returning full content |
|
| CTX-004 | `/return\s*\{[^}]*content:/` | Agent returning full content |
|
||||||
| CTX-005 | File length > 5000 chars without summarize | Long prompt without compression |
|
| CTX-005 | File > 5000 chars without summarization | Long prompts |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (>128K tokens), High (>50K per iteration), Medium (10%+ growth), Low (manageable)
|
||||||
- **Critical**: Context exceeds model limit (128K tokens)
|
|
||||||
- **High**: Context > 50K tokens per iteration
|
|
||||||
- **Medium**: Context grows 10%+ per iteration
|
|
||||||
- **Low**: Potential for growth but currently manageable
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 2. Long-tail Forgetting (P3)
|
## 2. Long-tail Forgetting (P4)
|
||||||
|
|
||||||
**Definition**: Loss of early instructions, constraints, or goals in long execution chains.
|
**Definition**: Loss of early instructions/constraints in long chains.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- No explicit constraint propagation
|
|
||||||
- Reliance on implicit context
|
|
||||||
- Missing checkpoint/restore mechanisms
|
|
||||||
- State schema without requirements field
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| MEM-001 | Later phases missing constraint reference | Constraint not carried forward |
|
| MEM-001 | Later phases missing constraint reference | Constraint not forwarded |
|
||||||
| MEM-002 | `/\[TASK\][^[]*(?!\[CONSTRAINTS\])/` | Task without constraints section |
|
| MEM-002 | `/\[TASK\][^[]*(?!\[CONSTRAINTS\])/` | Task without constraints section |
|
||||||
| MEM-003 | Key phases without checkpoint | Missing state preservation |
|
| MEM-003 | Key phases without checkpoint | Missing state preservation |
|
||||||
| MEM-004 | State schema lacks `original_requirements` | No constraint persistence |
|
| MEM-004 | State lacks `original_requirements` | No constraint persistence |
|
||||||
| MEM-005 | No verification phase | Output not checked against intent |
|
| MEM-005 | No verification phase | Output not checked against intent |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (goal lost), High (constraints ignored), Medium (some missing), Low (minor drift)
|
||||||
- **Critical**: Original goal completely lost
|
|
||||||
- **High**: Key constraints ignored in output
|
|
||||||
- **Medium**: Some requirements missing
|
|
||||||
- **Low**: Minor goal drift
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 3. Data Flow Disruption (P0)
|
## 3. Data Flow Disruption (P1)
|
||||||
|
|
||||||
**Definition**: Inconsistent state management causing data loss or corruption.
|
**Definition**: Inconsistent state management causing data loss/corruption.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- Multiple state storage locations
|
|
||||||
- Inconsistent field naming
|
|
||||||
- Missing schema validation
|
|
||||||
- Format transformation without normalization
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| DF-001 | Multiple state file writes | Scattered state storage |
|
| DF-001 | Multiple state file writes | Scattered state storage |
|
||||||
| DF-002 | Same concept, different names | Field naming inconsistency |
|
| DF-002 | Same concept, different names | Field naming inconsistency |
|
||||||
| DF-003 | JSON.parse without validation | Missing schema validation |
|
| DF-003 | JSON.parse without validation | Missing schema validation |
|
||||||
| DF-004 | Files written but never read | Orphaned outputs |
|
| DF-004 | Files written but never read | Orphaned outputs |
|
||||||
| DF-005 | Autonomous skill without state-schema | Undefined state structure |
|
| DF-005 | Autonomous skill without state-schema | Undefined state structure |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (data loss), High (state inconsistency), Medium (potential inconsistency), Low (naming)
|
||||||
- **Critical**: Data loss or corruption
|
|
||||||
- **High**: State inconsistency between phases
|
|
||||||
- **Medium**: Potential for inconsistency
|
|
||||||
- **Low**: Minor naming inconsistencies
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 4. Agent Coordination Failure (P1)
|
## 4. Agent Coordination Failure (P2)
|
||||||
|
|
||||||
**Definition**: Fragile agent call patterns causing cascading failures.
|
**Definition**: Fragile agent call patterns causing cascading failures.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- Missing error handling in Task calls
|
|
||||||
- No result validation
|
|
||||||
- Inconsistent agent configurations
|
|
||||||
- Deeply nested agent calls
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| AGT-001 | Task without try-catch | Missing error handling |
|
| AGT-001 | Task without try-catch | Missing error handling |
|
||||||
| AGT-002 | Result used without validation | No return value check |
|
| AGT-002 | Result used without validation | No return value check |
|
||||||
| AGT-003 | > 3 different agent types | Agent type proliferation |
|
| AGT-003 | >3 different agent types | Agent type proliferation |
|
||||||
| AGT-004 | Nested Task in prompt | Agent calling agent |
|
| AGT-004 | Nested Task in prompt | Agent calling agent |
|
||||||
| AGT-005 | Task used but not in allowed-tools | Tool declaration mismatch |
|
| AGT-005 | Task used but not in allowed-tools | Tool declaration mismatch |
|
||||||
| AGT-006 | Multiple return formats | Inconsistent agent output |
|
| AGT-006 | Multiple return formats | Inconsistent agent output |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (crash on failure), High (unpredictable behavior), Medium (occasional issues), Low (minor)
|
||||||
- **Critical**: Workflow crash on agent failure
|
|
||||||
- **High**: Unpredictable agent behavior
|
|
||||||
- **Medium**: Occasional coordination issues
|
|
||||||
- **Low**: Minor inconsistencies
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 5. Documentation Redundancy (P5)
|
## 5. Documentation Redundancy (P6)
|
||||||
|
|
||||||
**Definition**: 同一定义(如 State Schema、映射表、类型定义)在多个文件中重复出现,导致维护困难和不一致风险。
|
**Definition**: Same definition (State Schema, mappings, types) repeated across files.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- 缺乏单一真相来源 (SSOT)
|
|
||||||
- 复制粘贴代替引用
|
|
||||||
- 硬编码配置代替集中管理
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| DOC-RED-001 | 跨文件语义比较 | 找到 State Schema 等核心概念的重复定义 |
|
| DOC-RED-001 | Cross-file semantic comparison | State Schema duplication |
|
||||||
| DOC-RED-002 | 代码块 vs 规范表对比 | action 文件中硬编码与 spec 文档的重复 |
|
| DOC-RED-002 | Code block vs spec comparison | Hardcoded config duplication |
|
||||||
| DOC-RED-003 | `/interface\s+(\w+)/` 同名扫描 | 多处定义的 interface/type |
|
| DOC-RED-003 | `/interface\s+(\w+)/` same-name scan | Interface/type duplication |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: High (core definitions), Medium (type definitions), Low (example code)
|
||||||
- **High**: 核心定义(State Schema, 映射表)重复
|
|
||||||
- **Medium**: 类型定义重复
|
|
||||||
- **Low**: 示例代码重复
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 6. Token Consumption (P6)
|
## 6. Token Consumption (P5)
|
||||||
|
|
||||||
**Definition**: Excessive token usage from verbose prompts, large state objects, or inefficient I/O patterns.
|
**Definition**: Excessive token usage from verbose prompts, large state, inefficient I/O.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- Long static prompts without compression
|
|
||||||
- State schema with too many fields
|
|
||||||
- Full content embedding instead of path references
|
|
||||||
- Arrays growing unbounded without sliding windows
|
|
||||||
- Write-then-read file relay patterns
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| TKN-001 | File size > 4KB | Verbose prompt files |
|
| TKN-001 | File size > 4KB | Verbose prompt files |
|
||||||
| TKN-002 | State fields > 15 | Excessive state schema |
|
| TKN-002 | State fields > 15 | Excessive state schema |
|
||||||
| TKN-003 | `/Read\([^)]+\)\s*[\+,]/` | Full content passing |
|
| TKN-003 | `/Read\([^)]+\)\s*[\+,]/` | Full content passing |
|
||||||
| TKN-004 | `/.push\|concat(?!.*\.slice)/` | Unbounded array growth |
|
| TKN-004 | `/.push\|concat(?!.*\.slice)/` | Unbounded array growth |
|
||||||
| TKN-005 | `/Write\([^)]+\)[\s\S]{0,100}Read\([^)]+\)/` | Write-then-read pattern |
|
| TKN-005 | `/Write\([^)]+\)[\s\S]{0,100}Read\([^)]+\)/` | Write-then-read pattern |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: High (multiple TKN-003/004), Medium (verbose files), Low (minor optimization)
|
||||||
- **High**: Multiple TKN-003/TKN-004 issues causing significant token waste
|
|
||||||
- **Medium**: Several verbose files or state bloat
|
|
||||||
- **Low**: Minor optimization opportunities
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 7. Documentation Conflict (P7)
|
## 7. Documentation Conflict (P7)
|
||||||
|
|
||||||
**Definition**: 同一概念在不同文件中定义不一致,导致行为不可预测和文档误导。
|
**Definition**: Same concept defined inconsistently across files.
|
||||||
|
|
||||||
**Root Causes**:
|
|
||||||
- 定义更新后未同步其他位置
|
|
||||||
- 实现与文档漂移
|
|
||||||
- 缺乏一致性校验
|
|
||||||
|
|
||||||
**Detection Patterns**:
|
**Detection Patterns**:
|
||||||
|
|
||||||
| Pattern ID | Regex/Check | Description |
|
| Pattern ID | Check | Description |
|
||||||
|------------|-------------|-------------|
|
|------------|-------|-------------|
|
||||||
| DOC-CON-001 | 键值一致性校验 | 同一键(如优先级)在不同文件中值不同 |
|
| DOC-CON-001 | Key-value consistency check | Same key, different values |
|
||||||
| DOC-CON-002 | 实现 vs 文档对比 | 硬编码配置与文档对应项不一致 |
|
| DOC-CON-002 | Implementation vs docs comparison | Hardcoded vs documented mismatch |
|
||||||
|
|
||||||
**Impact Levels**:
|
**Impact**: Critical (priority/category conflicts), High (strategy mapping inconsistency), Medium (example mismatch)
|
||||||
- **Critical**: 优先级/类别定义冲突
|
|
||||||
- **High**: 策略映射不一致
|
|
||||||
- **Medium**: 示例与实际不符
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Severity Criteria
|
## Severity Calculation
|
||||||
|
|
||||||
### Global Severity Matrix
|
|
||||||
|
|
||||||
| Severity | Definition | Action Required |
|
|
||||||
|----------|------------|-----------------|
|
|
||||||
| **Critical** | Blocks execution or causes data loss | Immediate fix required |
|
|
||||||
| **High** | Significantly impacts reliability | Should fix before deployment |
|
|
||||||
| **Medium** | Affects quality or maintainability | Fix in next iteration |
|
|
||||||
| **Low** | Minor improvement opportunity | Optional fix |
|
|
||||||
|
|
||||||
### Severity Calculation
|
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
function calculateIssueSeverity(issue) {
|
function calculateSeverity(issue) {
|
||||||
const weights = {
|
const weights = { execution: 40, data_integrity: 30, frequency: 20, complexity: 10 };
|
||||||
impact_on_execution: 40, // Does it block workflow?
|
|
||||||
data_integrity_risk: 30, // Can it cause data loss?
|
|
||||||
frequency: 20, // How often does it occur?
|
|
||||||
complexity_to_fix: 10 // How hard to fix?
|
|
||||||
};
|
|
||||||
|
|
||||||
let score = 0;
|
let score = 0;
|
||||||
|
|
||||||
// Impact on execution
|
if (issue.blocks_execution) score += weights.execution;
|
||||||
if (issue.blocks_execution) score += weights.impact_on_execution;
|
if (issue.causes_data_loss) score += weights.data_integrity;
|
||||||
else if (issue.degrades_execution) score += weights.impact_on_execution * 0.5;
|
|
||||||
|
|
||||||
// Data integrity
|
|
||||||
if (issue.causes_data_loss) score += weights.data_integrity_risk;
|
|
||||||
else if (issue.causes_inconsistency) score += weights.data_integrity_risk * 0.5;
|
|
||||||
|
|
||||||
// Frequency
|
|
||||||
if (issue.occurs_every_run) score += weights.frequency;
|
if (issue.occurs_every_run) score += weights.frequency;
|
||||||
else if (issue.occurs_sometimes) score += weights.frequency * 0.5;
|
if (issue.fix_complexity === 'low') score += weights.complexity;
|
||||||
|
|
||||||
// Complexity (inverse - easier to fix = higher priority)
|
|
||||||
if (issue.fix_complexity === 'low') score += weights.complexity_to_fix;
|
|
||||||
else if (issue.fix_complexity === 'medium') score += weights.complexity_to_fix * 0.5;
|
|
||||||
|
|
||||||
// Map score to severity
|
|
||||||
if (score >= 70) return 'critical';
|
if (score >= 70) return 'critical';
|
||||||
if (score >= 50) return 'high';
|
if (score >= 50) return 'high';
|
||||||
if (score >= 30) return 'medium';
|
if (score >= 30) return 'medium';
|
||||||
@@ -283,36 +181,30 @@ function calculateIssueSeverity(issue) {
|
|||||||
|
|
||||||
## Fix Mapping
|
## Fix Mapping
|
||||||
|
|
||||||
| Problem Type | Recommended Strategies | Priority Order |
|
| Problem | Strategies (priority order) |
|
||||||
|--------------|----------------------|----------------|
|
|---------|---------------------------|
|
||||||
| **Authoring Principles Violation** | eliminate_intermediate_files, minimize_state, context_passing | 1, 2, 3 |
|
| Authoring Violation | eliminate_intermediate_files, minimize_state, context_passing |
|
||||||
| Context Explosion | sliding_window, path_reference, context_summarization | 1, 2, 3 |
|
| Context Explosion | sliding_window, path_reference, context_summarization |
|
||||||
| Long-tail Forgetting | constraint_injection, state_constraints_field, checkpoint | 1, 2, 3 |
|
| Long-tail Forgetting | constraint_injection, state_constraints_field, checkpoint |
|
||||||
| Data Flow Disruption | state_centralization, schema_enforcement, field_normalization | 1, 2, 3 |
|
| Data Flow Disruption | state_centralization, schema_enforcement, field_normalization |
|
||||||
| Agent Coordination | error_wrapping, result_validation, flatten_nesting | 1, 2, 3 |
|
| Agent Coordination | error_wrapping, result_validation, flatten_nesting |
|
||||||
| **Token Consumption** | prompt_compression, lazy_loading, output_minimization, state_field_reduction | 1, 2, 3, 4 |
|
| Token Consumption | prompt_compression, lazy_loading, output_minimization, state_field_reduction |
|
||||||
| **Documentation Redundancy** | consolidate_to_ssot, centralize_mapping_config | 1, 2 |
|
| Doc Redundancy | consolidate_to_ssot, centralize_mapping_config |
|
||||||
| **Documentation Conflict** | reconcile_conflicting_definitions | 1 |
|
| Doc Conflict | reconcile_conflicting_definitions |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Cross-Category Dependencies
|
## Cross-Category Dependencies
|
||||||
|
|
||||||
Some issues may trigger others:
|
|
||||||
|
|
||||||
```
|
```
|
||||||
Context Explosion ──→ Long-tail Forgetting
|
Context Explosion → Long-tail Forgetting
|
||||||
(Large context causes important info to be pushed out)
|
(Large context pushes important info out)
|
||||||
|
|
||||||
Data Flow Disruption ──→ Agent Coordination Failure
|
Data Flow Disruption → Agent Coordination Failure
|
||||||
(Inconsistent data causes agents to fail)
|
(Inconsistent data causes agent failures)
|
||||||
|
|
||||||
Agent Coordination Failure ──→ Context Explosion
|
Agent Coordination Failure → Context Explosion
|
||||||
(Failed retries add to context)
|
(Failed retries add to context)
|
||||||
```
|
```
|
||||||
|
|
||||||
When fixing, address in this order:
|
**Fix Order**: P1 Data Flow → P2 Agent → P3 Context → P4 Memory
|
||||||
1. **P0 Data Flow** - Foundation for other fixes
|
|
||||||
2. **P1 Agent Coordination** - Stability
|
|
||||||
3. **P2 Context Explosion** - Efficiency
|
|
||||||
4. **P3 Long-tail Forgetting** - Quality
|
|
||||||
|
|||||||
Reference in New Issue
Block a user