mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-21 19:08:17 +08:00
feat: Implement workflow tuning command and remove synthesis prompt template
- Added a new command for workflow tuning that extracts commands from reference documents or natural language, executes them via the ccw CLI, and analyzes the artifacts using Gemini. - The command includes detailed phases for setup, execution, analysis, synthesis, and reporting, ensuring a structured approach to workflow optimization. - Removed the synthesis prompt template as it is no longer needed with the new command implementation.
This commit is contained in:
811
.claude/commands/workflow-tune.md
Normal file
811
.claude/commands/workflow-tune.md
Normal file
@@ -0,0 +1,811 @@
|
|||||||
|
---
|
||||||
|
name: workflow-tune
|
||||||
|
description: Workflow tuning - extract commands from reference docs or natural language, execute each via ccw cli --tool claude --mode write, then analyze artifacts via gemini. For testing how commands execute in Claude.
|
||||||
|
argument-hint: "<file-path> <intent> | \"step1 | step2 | step3\" | \"skill-a,skill-b\" | --file workflow.json [--depth quick|standard|deep] [-y|--yes] [--auto-fix]"
|
||||||
|
allowed-tools: Agent(*), AskUserQuestion(*), TaskCreate(*), TaskUpdate(*), TaskList(*), Read(*), Write(*), Edit(*), Bash(*), Glob(*), Grep(*)
|
||||||
|
---
|
||||||
|
|
||||||
|
# Workflow Tune
|
||||||
|
|
||||||
|
测试 Claude command/skill 的执行效果并优化。提取可执行命令,逐步通过 `ccw cli --tool claude` 执行,分析产物质量,生成优化建议。
|
||||||
|
|
||||||
|
## Tool Assignment
|
||||||
|
|
||||||
|
| Phase | Tool | Mode | Rule |
|
||||||
|
|-------|------|------|------|
|
||||||
|
| Execute | `claude` | `write` | `universal-rigorous-style` |
|
||||||
|
| Analyze | `gemini` | `analysis` | `analysis-review-code-quality` |
|
||||||
|
| Synthesize | `gemini` | `analysis` | `analysis-review-architecture` |
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
Input → Parse → GenTestTask → Confirm → Setup → [resolveCmd → readMeta → assemblePrompt → Execute → STOP → Analyze → STOP]×N → Synthesize → STOP → Report
|
||||||
|
↑ ↑
|
||||||
|
Claude 直接生成测试任务 prompt 中注入 test_task
|
||||||
|
(无需 CLI 调用) 作为命令的执行输入
|
||||||
|
```
|
||||||
|
|
||||||
|
## Input Formats
|
||||||
|
|
||||||
|
```
|
||||||
|
1. --file workflow.json → JSON definition
|
||||||
|
2. "cmd1 | cmd2 | cmd3" → pipe-separated commands
|
||||||
|
3. "skill-a,skill-b,skill-c" → comma-separated skills
|
||||||
|
4. natural language → semantic decomposition
|
||||||
|
4a: <file-path> <intent> → extract commands from reference doc via LLM
|
||||||
|
4b: <pure intent text> → intent-verb matching → ccw cli command assembly
|
||||||
|
```
|
||||||
|
|
||||||
|
**ANTI-PATTERN**: Steps like `{ command: "分析 Phase 管线" }` are WRONG — descriptions, not commands. Correct: `{ command: "/workflow-lite-plan analyze auth module" }` or `{ command: "ccw cli -p '...' --tool claude --mode write" }`
|
||||||
|
|
||||||
|
## Utility: Shell Escaping
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
/**
 * Quote an arbitrary string for safe use as a single POSIX-shell argument.
 * The value is wrapped in single quotes, and every embedded single quote is
 * rewritten as the sequence '\'' (close quote, escaped quote, reopen quote),
 * which is the standard sh-compatible escaping trick.
 *
 * @param {string} str - Raw text destined for a shell command line.
 * @returns {string} Single-quoted, shell-safe representation of str.
 */
function escapeForShell(str) {
  const escaped = str.replace(/'/g, `'\\''`);
  return `'${escaped}'`;
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Phase 1: Setup
|
||||||
|
|
||||||
|
### Step 1.1: Parse Input + Preference Collection
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const args = $ARGUMENTS.trim();
|
||||||
|
const autoYes = /\b(-y|--yes)\b/.test(args);
|
||||||
|
|
||||||
|
// Preference collection (skip if -y)
|
||||||
|
if (autoYes) {
|
||||||
|
workflowPreferences = { autoYes: true, analysisDepth: 'standard', autoFix: false };
|
||||||
|
} else {
|
||||||
|
const prefResponse = AskUserQuestion({
|
||||||
|
questions: [
|
||||||
|
{ question: "选择调优配置:", header: "Tune Config", multiSelect: false,
|
||||||
|
options: [
|
||||||
|
{ label: "Quick (轻量分析)", description: "每步简要检查" },
|
||||||
|
{ label: "Standard (标准分析) (Recommended)", description: "每步详细分析" },
|
||||||
|
{ label: "Deep (深度分析)", description: "深度审查含架构建议" }
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{ question: "是否自动应用优化建议?", header: "Auto Fix", multiSelect: false,
|
||||||
|
options: [
|
||||||
|
{ label: "No (仅报告) (Recommended)", description: "只分析不修改" },
|
||||||
|
{ label: "Yes (自动应用)", description: "自动应用高优先级建议" }
|
||||||
|
]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
});
|
||||||
|
const depthMap = { "Quick": "quick", "Standard": "standard", "Deep": "deep" };
|
||||||
|
const selectedDepth = Object.keys(depthMap).find(k => prefResponse["Tune Config"].startsWith(k)) || "Standard";
|
||||||
|
workflowPreferences = {
|
||||||
|
autoYes: false,
|
||||||
|
analysisDepth: depthMap[selectedDepth],
|
||||||
|
autoFix: prefResponse["Auto Fix"].startsWith("Yes")
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// Parse --depth override
|
||||||
|
const depthMatch = args.match(/--depth\s+(quick|standard|deep)/);
|
||||||
|
if (depthMatch) workflowPreferences.analysisDepth = depthMatch[1];
|
||||||
|
|
||||||
|
// ── Format Detection ──
|
||||||
|
let steps = [], workflowName = 'unnamed-workflow', inputFormat = '';
|
||||||
|
let projectScenario = ''; // ★ 统一虚构项目场景,所有步骤共享(在 Step 1.1a 生成)
|
||||||
|
|
||||||
|
const fileMatch = args.match(/--file\s+"?([^\s"]+)"?/);
|
||||||
|
if (fileMatch) {
|
||||||
|
const wfDef = JSON.parse(Read(fileMatch[1]));
|
||||||
|
workflowName = wfDef.name || 'unnamed-workflow';
|
||||||
|
projectScenario = wfDef.project_scenario || wfDef.description || '';
|
||||||
|
steps = wfDef.steps;
|
||||||
|
inputFormat = 'json';
|
||||||
|
}
|
||||||
|
else if (args.includes('|')) {
|
||||||
|
const rawSteps = args.split(/(?:--context|--depth|-y|--yes|--auto-fix)\s+("[^"]*"|\S+)/)[0];
|
||||||
|
steps = rawSteps.split('|').map((cmd, i) => ({
|
||||||
|
name: `step-${i + 1}`,
|
||||||
|
command: cmd.trim(),
|
||||||
|
expected_artifacts: [], success_criteria: ''
|
||||||
|
}));
|
||||||
|
inputFormat = 'pipe';
|
||||||
|
}
|
||||||
|
else if (/^[\w-]+(,[\w-]+)+/.test(args.split(/\s/)[0])) {
|
||||||
|
const skillNames = args.match(/^([^\s]+)/)[1].split(',');
|
||||||
|
steps = skillNames.map(name => ({
|
||||||
|
name, command: `/${name}`,
|
||||||
|
expected_artifacts: [], success_criteria: ''
|
||||||
|
}));
|
||||||
|
inputFormat = 'skills';
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
inputFormat = 'natural-language';
|
||||||
|
let naturalLanguageInput = args.replace(/--\w+\s+"[^"]*"/g, '').replace(/--\w+\s+\S+/g, '').replace(/-y|--yes/g, '').trim();
|
||||||
|
const filePathPattern = /(?:[A-Za-z]:[\\\/][^\s,;]+|\/[^\s,;]+\.(?:md|txt|json|yaml|yml|toml)|\.\/?[^\s,;]+\.(?:md|txt|json|yaml|yml|toml))/g;
|
||||||
|
const detectedPaths = naturalLanguageInput.match(filePathPattern) || [];
|
||||||
|
let referenceDocContent = null, referenceDocPath = null;
|
||||||
|
if (detectedPaths.length > 0) {
|
||||||
|
referenceDocPath = detectedPaths[0];
|
||||||
|
try {
|
||||||
|
referenceDocContent = Read(referenceDocPath);
|
||||||
|
naturalLanguageInput = naturalLanguageInput.replace(referenceDocPath, '').trim();
|
||||||
|
} catch (e) { referenceDocContent = null; }
|
||||||
|
}
|
||||||
|
// → Mode 4a/4b in Step 1.1b
|
||||||
|
}
|
||||||
|
|
||||||
|
// workflowContext 已移除 — 统一使用 projectScenario(在 Step 1.1a 生成)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 1.1a: Generate Test Task (测试任务直接生成)
|
||||||
|
|
||||||
|
> **核心概念**: 所有步骤共享一个**统一虚构项目场景**(如"在线书店网站"),每个命令根据自身能力获得该场景下的一个子任务。由当前 Claude 直接生成,不需要额外 CLI 调用。所有执行在独立沙箱目录中进行,不影响真实项目。
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// ★ 测试任务直接生成 — 无需 CLI 调用
|
||||||
|
// 来源优先级:
|
||||||
|
// 1. JSON 定义中的 step.test_task 字段 (已有则跳过)
|
||||||
|
// 2. 当前 Claude 直接生成
|
||||||
|
|
||||||
|
const stepsNeedTask = steps.filter(s => !s.test_task);
|
||||||
|
|
||||||
|
if (stepsNeedTask.length > 0) {
|
||||||
|
// ── Step A: 生成统一项目场景 ──
|
||||||
|
// 根据命令链的整体复杂度,选一个虚构项目作为测试场景
|
||||||
|
// 场景必须:完全虚构、与当前工作空间无关、足够支撑所有步骤
|
||||||
|
//
|
||||||
|
// 场景池示例(根据步骤数量和类型选择合适规模):
|
||||||
|
// 1-2 步: 小型项目 — "命令行 TODO 工具" "Markdown 转 HTML 工具" "天气查询 CLI"
|
||||||
|
// 3-4 步: 中型项目 — "在线书店网站" "团队任务看板" "博客系统"
|
||||||
|
// 5+ 步: 大型项目 — "多租户 SaaS 平台" "电商系统" "在线教育平台"
|
||||||
|
|
||||||
|
projectScenario = /* Claude 从上述池中选择或自创一个场景 */;
|
||||||
|
// 例如: "在线书店网站 — 支持用户注册登录、书籍搜索浏览、购物车、订单管理、评论系统"
|
||||||
|
|
||||||
|
// ── Step B: 为每步生成子任务 ──
|
||||||
|
for (const step of stepsNeedTask) {
|
||||||
|
const cmdFile = resolveCommandFile(step.command);
|
||||||
|
const cmdMeta = readCommandMeta(cmdFile);
|
||||||
|
const cmdDesc = (cmdMeta?.description || step.command).toLowerCase();
|
||||||
|
|
||||||
|
// 根据命令类型分配场景下的子任务
|
||||||
|
// 每个子任务必须按以下模板生成:
|
||||||
|
//
|
||||||
|
// ┌─────────────────────────────────────────────────┐
|
||||||
|
// │ 项目: {projectScenario} │
|
||||||
|
// │ 任务: {具体子任务描述} │
|
||||||
|
// │ 功能点: │
|
||||||
|
// │ 1. {功能点1 — 具体到接口/组件/模块} │
|
||||||
|
// │ 2. {功能点2} │
|
||||||
|
// │ 3. {功能点3} │
|
||||||
|
// │ 技术约束: {语言/框架/架构要求} │
|
||||||
|
// │ 验收标准: │
|
||||||
|
// │ 1. {可验证的标准1} │
|
||||||
|
// │ 2. {可验证的标准2} │
|
||||||
|
// └─────────────────────────────────────────────────┘
|
||||||
|
//
|
||||||
|
// 命令类型 → 子任务映射:
|
||||||
|
// plan/design → 架构设计任务: "为{场景}设计技术架构,包含模块划分、数据模型、API 设计"
|
||||||
|
// implement → 功能实现任务: "实现{场景}的{某模块},包含{具体功能点}"
|
||||||
|
// analyze/review→ 代码分析任务: "先在沙箱创建{场景}的{某模块}示例代码,然后分析其质量"
|
||||||
|
// test → 测试任务: "为{场景}的{某模块}编写测试,覆盖{具体场景}"
|
||||||
|
// fix/debug → 修复任务: "先在沙箱创建含已知 bug 的代码,然后诊断修复"
|
||||||
|
// refactor → 重构任务: "先在沙箱创建可工作但需重构的代码,然后重构"
|
||||||
|
|
||||||
|
step.test_task = /* 按上述模板生成,必须包含:项目、任务、功能点、技术约束、验收标准 */;
|
||||||
|
step.acceptance_criteria = /* 从 test_task 中提取 2-4 条可验证标准 */;
|
||||||
|
step.complexity_level = /plan|design|architect/i.test(cmdDesc) ? 'high'
|
||||||
|
: /test|lint|format/i.test(cmdDesc) ? 'low' : 'medium';
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**模拟示例** — 输入 `workflow-lite-plan,workflow-lite-execute`:
|
||||||
|
|
||||||
|
```
|
||||||
|
场景: 在线书店网站 — 支持用户注册登录、书籍搜索、购物车、订单管理
|
||||||
|
|
||||||
|
Step 1 (workflow-lite-plan → plan 类, high):
|
||||||
|
项目: 在线书店网站
|
||||||
|
任务: 为在线书店设计技术架构和实现计划
|
||||||
|
功能点:
|
||||||
|
1. 用户模块 — 注册、登录、个人信息管理
|
||||||
|
2. 书籍模块 — 搜索、分类浏览、详情页
|
||||||
|
3. 交易模块 — 购物车、下单、支付状态
|
||||||
|
4. 数据模型 — User, Book, Order, CartItem 表结构设计
|
||||||
|
技术约束: TypeScript + Express + SQLite, REST API
|
||||||
|
验收标准:
|
||||||
|
1. 输出包含模块划分和依赖关系
|
||||||
|
2. 包含数据模型定义
|
||||||
|
3. 包含 API 路由清单
|
||||||
|
4. 包含实现步骤分解
|
||||||
|
|
||||||
|
Step 2 (workflow-lite-execute → implement 类, medium):
|
||||||
|
项目: 在线书店网站
|
||||||
|
任务: 根据 Step 1 的计划,实现书籍搜索和浏览模块
|
||||||
|
功能点:
|
||||||
|
1. GET /api/books — 分页列表,支持按标题/作者搜索
|
||||||
|
2. GET /api/books/:id — 书籍详情
|
||||||
|
3. GET /api/categories — 分类列表
|
||||||
|
4. Book 数据模型 + seed 数据
|
||||||
|
技术约束: TypeScript + Express + SQLite, 沿用 Step 1 架构
|
||||||
|
验收标准:
|
||||||
|
1. API 可正常调用返回 JSON
|
||||||
|
2. 搜索支持模糊匹配
|
||||||
|
3. 包含至少 5 条 seed 数据
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 1.1b: Semantic Decomposition (Format 4 only)
|
||||||
|
|
||||||
|
#### Mode 4a: Reference Document → LLM Extraction
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
if (inputFormat === 'natural-language' && referenceDocContent) {
|
||||||
|
const extractPrompt = `PURPOSE: Extract ACTUAL EXECUTABLE COMMANDS from the reference document. The user wants to TEST these commands by running them.
|
||||||
|
|
||||||
|
USER INTENT: ${naturalLanguageInput}
|
||||||
|
REFERENCE DOCUMENT: ${referenceDocPath}
|
||||||
|
|
||||||
|
DOCUMENT CONTENT:
|
||||||
|
${referenceDocContent}
|
||||||
|
|
||||||
|
CRITICAL RULES:
|
||||||
|
- "command" field MUST be a real executable: slash command (/skill-name args), ccw cli call, or shell command
|
||||||
|
- CORRECT: { "command": "/workflow-lite-plan analyze auth module" }
|
||||||
|
- CORRECT: { "command": "ccw cli -p 'review code' --tool claude --mode write" }
|
||||||
|
- WRONG: { "command": "分析 Phase 管线" } ← DESCRIPTION, not command
|
||||||
|
- Default mode to "write"
|
||||||
|
|
||||||
|
EXPECTED OUTPUT (strict JSON):
|
||||||
|
{
|
||||||
|
"workflow_name": "<name>",
|
||||||
|
"project_scenario": "<虚构项目场景>",
|
||||||
|
"steps": [{ "name": "", "command": "<executable>", "expected_artifacts": [], "success_criteria": "" }]
|
||||||
|
}`;
|
||||||
|
|
||||||
|
Bash({
|
||||||
|
command: `ccw cli -p ${escapeForShell(extractPrompt)} --tool claude --mode write --rule universal-rigorous-style`,
|
||||||
|
run_in_background: true, timeout: 300000
|
||||||
|
});
|
||||||
|
// ■ STOP — wait for hook callback, parse JSON → steps[]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Mode 4b: Pure Intent → Command Assembly
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
if (inputFormat === 'natural-language' && !referenceDocContent) {
|
||||||
|
// Intent → rule mapping for ccw cli command generation
|
||||||
|
const intentMap = [
|
||||||
|
{ pattern: /分析|analyze|审查|inspect|scan/i, name: 'analyze', rule: 'analysis-analyze-code-patterns' },
|
||||||
|
{ pattern: /评审|review|code.?review/i, name: 'review', rule: 'analysis-review-code-quality' },
|
||||||
|
{ pattern: /诊断|debug|排查|diagnose/i, name: 'diagnose', rule: 'analysis-diagnose-bug-root-cause' },
|
||||||
|
{ pattern: /安全|security|漏洞/i, name: 'security-audit', rule: 'analysis-assess-security-risks' },
|
||||||
|
{ pattern: /性能|performance|perf/i, name: 'perf-analysis', rule: 'analysis-analyze-performance' },
|
||||||
|
{ pattern: /架构|architecture/i, name: 'arch-review', rule: 'analysis-review-architecture' },
|
||||||
|
{ pattern: /修复|fix|repair|解决/i, name: 'fix', rule: 'development-debug-runtime-issues' },
|
||||||
|
{ pattern: /实现|implement|开发|create|新增/i, name: 'implement', rule: 'development-implement-feature' },
|
||||||
|
{ pattern: /重构|refactor/i, name: 'refactor', rule: 'development-refactor-codebase' },
|
||||||
|
{ pattern: /测试|test/i, name: 'test', rule: 'development-generate-tests' },
|
||||||
|
{ pattern: /规划|plan|设计|design/i, name: 'plan', rule: 'planning-plan-architecture-design' },
|
||||||
|
];
|
||||||
|
|
||||||
|
const segments = naturalLanguageInput
|
||||||
|
.split(/[,,;;、]|(?:然后|接着|之后|最后|再|并|and then|then|finally|next)\s*/i)
|
||||||
|
.map(s => s.trim()).filter(Boolean);
|
||||||
|
|
||||||
|
// ★ 将意图文本转化为完整的 ccw cli 命令
|
||||||
|
steps = segments.map((segment, i) => {
|
||||||
|
const matched = intentMap.find(m => m.pattern.test(segment));
|
||||||
|
const rule = matched?.rule || 'universal-rigorous-style';
|
||||||
|
// 组装真正可执行的命令
|
||||||
|
const command = `ccw cli -p ${escapeForShell('PURPOSE: ' + segment + '\\nTASK: Execute based on intent\\nCONTEXT: @**/*')} --tool claude --mode write --rule ${rule}`;
|
||||||
|
return {
|
||||||
|
name: matched?.name || `step-${i + 1}`,
|
||||||
|
command,
|
||||||
|
original_intent: segment, // 保留原始意图用于分析
|
||||||
|
expected_artifacts: [], success_criteria: ''
|
||||||
|
};
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 1.1c: Execution Plan Confirmation
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
/**
 * Render the execution-plan document shown to the user before the workflow
 * runs: a metadata header plus one markdown table row per step.
 *
 * @param {Array<{name: string, command: string, test_task?: string}>} steps -
 *   Parsed workflow steps; only name/command/test_task are read here.
 * @param {string} workflowName - Display name of the workflow under test.
 * @param {string} projectScenario - Shared fictional project scenario text.
 * @param {string} analysisDepth - One of 'quick' | 'standard' | 'deep'.
 * @returns {string} Markdown document (header + step table).
 */
function generateCommandDoc(steps, workflowName, projectScenario, analysisDepth) {
  // BUG FIX: raw '|' or newlines inside a command or test task would break
  // the markdown table layout; neutralize them before rendering a cell.
  const sanitizeCell = (text) => text.replace(/\|/g, '\\|').replace(/\r?\n/g, ' ');
  // Truncate long values to keep the table readable ('...' counts toward max).
  const truncate = (text, max) => (text.length > max ? text.substring(0, max - 3) + '...' : text);

  const stepTable = steps.map((s, i) => {
    const cmdPreview = sanitizeCell(truncate(s.command, 60));
    const taskPreview = sanitizeCell(truncate(s.test_task || '-', 40));
    return `| ${i + 1} | ${s.name} | \`${cmdPreview}\` | ${taskPreview} |`;
  }).join('\n');

  return `# Workflow Tune — Execution Plan\n\n**Workflow**: ${workflowName}\n**Test Project**: ${projectScenario}\n**Steps**: ${steps.length}\n**Depth**: ${analysisDepth}\n\n| # | Name | Command | Test Task |\n|---|------|---------|-----------|\n${stepTable}`;
}
|
||||||
|
|
||||||
|
const commandDoc = generateCommandDoc(steps, workflowName, projectScenario, workflowPreferences.analysisDepth);
|
||||||
|
|
||||||
|
if (!workflowPreferences.autoYes) {
|
||||||
|
const confirmation = AskUserQuestion({
|
||||||
|
questions: [{
|
||||||
|
question: commandDoc + "\n\n确认执行以上 Workflow 调优计划?", header: "Confirm Execution", multiSelect: false,
|
||||||
|
options: [
|
||||||
|
{ label: "Execute (确认执行)", description: "按计划开始执行" },
|
||||||
|
{ label: "Cancel (取消)", description: "取消" }
|
||||||
|
]
|
||||||
|
}]
|
||||||
|
});
|
||||||
|
if (confirmation["Confirm Execution"].startsWith("Cancel")) return;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 1.2: (Merged into Step 1.1a)
|
||||||
|
|
||||||
|
> Test requirements (acceptance_criteria) are now generated together with test_task in Step 1.1a, avoiding an extra CLI call.
|
||||||
|
|
||||||
|
### Step 1.3: Create Workspace + Sandbox Project
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const ts = Date.now();
|
||||||
|
const workDir = `.workflow/.scratchpad/workflow-tune-${ts}`;
|
||||||
|
|
||||||
|
// ★ 创建独立沙箱项目目录 — 所有命令执行在此目录中,不影响真实项目
|
||||||
|
const sandboxDir = `${workDir}/sandbox`;
|
||||||
|
Bash(`mkdir -p "${workDir}/steps" "${sandboxDir}"`);
|
||||||
|
// 初始化沙箱为独立 git 仓库(部分命令依赖 git 环境)
|
||||||
|
Bash(`cd "${sandboxDir}" && git init && echo "# Sandbox Project" > README.md && git add . && git commit -m "init sandbox"`);
|
||||||
|
|
||||||
|
for (let i = 0; i < steps.length; i++) Bash(`mkdir -p "${workDir}/steps/step-${i + 1}/artifacts"`);
|
||||||
|
|
||||||
|
Write(`${workDir}/command-doc.md`, commandDoc);
|
||||||
|
|
||||||
|
const initialState = {
|
||||||
|
status: 'running', started_at: new Date().toISOString(),
|
||||||
|
workflow_name: workflowName, project_scenario: projectScenario,
|
||||||
|
analysis_depth: workflowPreferences.analysisDepth, auto_fix: workflowPreferences.autoFix,
|
||||||
|
sandbox_dir: sandboxDir, // ★ 独立沙箱项目目录
|
||||||
|
current_step: 0, // ★ State machine cursor
|
||||||
|
current_phase: 'execute', // 'execute' | 'analyze'
|
||||||
|
steps: steps.map((s, i) => ({
|
||||||
|
...s, index: i, status: 'pending',
|
||||||
|
test_task: s.test_task || '', // ★ 每步的测试任务
|
||||||
|
execution: null, analysis: null,
|
||||||
|
test_requirements: s.test_requirements || null
|
||||||
|
})),
|
||||||
|
gemini_session_id: null, // ★ Updated after each gemini callback
|
||||||
|
work_dir: workDir,
|
||||||
|
errors: [], error_count: 0, max_errors: 3
|
||||||
|
};
|
||||||
|
|
||||||
|
Write(`${workDir}/workflow-state.json`, JSON.stringify(initialState, null, 2));
|
||||||
|
Write(`${workDir}/process-log.md`, `# Process Log\n\n**Workflow**: ${workflowName}\n**Test Project**: ${projectScenario}\n**Steps**: ${steps.length}\n**Started**: ${new Date().toISOString()}\n\n---\n\n`);
|
||||||
|
```
|
||||||
|
|
||||||
|
## Phase 2: Execute Step
|
||||||
|
|
||||||
|
### resolveCommandFile — Slash command → file path
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
/**
 * Map a slash command (e.g. "/workflow-lite-plan" or "/ns:cmd args") to the
 * markdown file that defines it. Command definitions are probed first, then
 * skill definitions, in both the project-level (.claude) and user-level
 * (~/.claude) directories.
 *
 * @param {string} command - Raw command string; a leading '/' is optional and
 *   anything after the first whitespace (the arguments) is ignored.
 * @returns {string|null} Path of the first readable definition file, or null
 *   when no candidate location exists.
 */
function resolveCommandFile(command) {
  const cmdMatch = command.match(/^\/?([^\s]+)/);
  if (!cmdMatch) return null;
  const cmdName = cmdMatch[1];
  // Namespaced commands use ':' in the name but '/' on disk.
  const cmdPath = cmdName.replace(/:/g, '/');

  const searchRoots = ['.claude', '~/.claude'];

  // Probe each candidate with a 1-line Read; the first readable file wins.
  // (Previously this try-Read loop was duplicated for commands and skills.)
  const firstReadable = (candidates) => {
    for (const candidate of candidates) {
      try { Read(candidate, { limit: 1 }); return candidate; } catch {}
    }
    return null;
  };

  // 1) Command definitions: commands/<path>.md or commands/<path>/index.md
  for (const root of searchRoots) {
    const hit = firstReadable([
      `${root}/commands/${cmdPath}.md`,
      `${root}/commands/${cmdPath}/index.md`,
    ]);
    if (hit) return hit;
  }

  // 2) Skill definitions: skills/<name>/SKILL.md (namespace flattened to '-')
  for (const root of searchRoots) {
    const hit = firstReadable([
      `${root}/skills/${cmdName}/SKILL.md`,
      `${root}/skills/${cmdPath.replace(/\//g, '-')}/SKILL.md`,
    ]);
    if (hit) return hit;
  }

  return null;
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### readCommandMeta — Read YAML frontmatter + body summary
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
/**
 * Read a command/skill markdown file and extract its YAML frontmatter fields
 * plus a short body summary (the first 30 lines after the frontmatter).
 *
 * @param {string|null} filePath - Path returned by resolveCommandFile; null
 *   short-circuits to null so callers can chain the two without a guard.
 * @returns {{filePath: string, name: string, description: string,
 *   argumentHint: string, allowedTools: string, bodySummary: string}|null}
 *   Parsed metadata (missing fields stay ''), or null when filePath is falsy.
 */
function readCommandMeta(filePath) {
  if (!filePath) return null;

  const content = Read(filePath);
  const meta = { filePath, name: '', description: '', argumentHint: '', allowedTools: '', bodySummary: '' };

  // Frontmatter = leading '---' block; fields parsed line-by-line so a
  // malformed YAML file still yields whatever fields do match.
  const yamlMatch = content.match(/^---\n([\s\S]*?)\n---/);
  if (yamlMatch) {
    const yaml = yamlMatch[1];
    const nameMatch = yaml.match(/^name:\s*(.+)$/m);
    const descMatch = yaml.match(/^description:\s*(.+)$/m);
    // argument-hint values are often quoted; strip optional surrounding quotes.
    const hintMatch = yaml.match(/^argument-hint:\s*"?(.+?)"?\s*$/m);
    const toolsMatch = yaml.match(/^allowed-tools:\s*(.+)$/m);

    if (nameMatch) meta.name = nameMatch[1].trim();
    if (descMatch) meta.description = descMatch[1].trim();
    if (hintMatch) meta.argumentHint = hintMatch[1].trim();
    if (toolsMatch) meta.allowedTools = toolsMatch[1].trim();
  }

  // BUG FIX: the body starts immediately after the matched frontmatter block.
  // The old indexOf('---', indexOf('---') + 3) scan returned an empty summary
  // for files without frontmatter, and mistook a stray '---' (horizontal
  // rule) mid-document for the frontmatter delimiter.
  const body = (yamlMatch ? content.slice(yamlMatch[0].length) : content).trim();
  if (body) meta.bodySummary = body.split('\n').slice(0, 30).join('\n');

  return meta;
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### assembleStepPrompt — Build execution prompt from command metadata
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
/**
 * Assemble the execution prompt for a single workflow step, combining the
 * resolved command definition, the generated test task, neighbor-step
 * context, and sandbox/output locations into one payload for
 * `ccw cli -p ... --tool claude --mode write`.
 *
 * @param {object} step - Step record; reads .command, .name, .test_task,
 *   .acceptance_criteria (optional array of strings).
 * @param {number} stepIdx - Zero-based index of this step in state.steps.
 * @param {object} state - Workflow state; reads .steps, .project_scenario,
 *   .sandbox_dir, .work_dir.
 * @returns {string} The fully-rendered prompt text.
 */
function assembleStepPrompt(step, stepIdx, state) {
  // ── 1. Resolve command file + metadata ──
  // Only slash commands can map to a definition file; shell / ccw cli
  // commands fall through with cmdMeta === null and get the lean prompt.
  const isSlashCmd = step.command.startsWith('/');
  const cmdFile = isSlashCmd ? resolveCommandFile(step.command) : null;
  const cmdMeta = readCommandMeta(cmdFile);
  // Everything after the first token is forwarded as the command's arguments.
  const cmdArgs = isSlashCmd ? step.command.replace(/^\/?[^\s]+\s*/, '').trim() : '';

  // ── 2. Prior/next step context ──
  const prevStep = stepIdx > 0 ? state.steps[stepIdx - 1] : null;
  const nextStep = stepIdx < state.steps.length - 1 ? state.steps[stepIdx + 1] : null;

  const priorContext = prevStep
    ? `PRIOR STEP: "${prevStep.name}" — ${prevStep.command}\n Status: ${prevStep.status} | Artifacts: ${prevStep.execution?.artifact_count || 0}`
    : 'PRIOR STEP: None (first step)';

  const nextContext = nextStep
    ? `NEXT STEP: "${nextStep.name}" — ${nextStep.command}\n Ensure output is consumable by next step`
    : 'NEXT STEP: None (last step)';

  // ── 3. Acceptance criteria (from test_task generation) ──
  // Empty string (not a placeholder) when absent, so the template's blank
  // line collapses naturally.
  const criteria = step.acceptance_criteria || [];
  const testReqSection = criteria.length > 0
    ? `ACCEPTANCE CRITERIA:\n${criteria.map((c, i) => ` ${i + 1}. ${c}`).join('\n')}`
    : '';

  // ── 4. Test task — the concrete scenario to drive execution ──
  const testTask = step.test_task || '';
  const testTaskSection = testTask
    ? `TEST TASK (用此任务驱动命令执行):\n ${testTask}`
    : '';

  // ── 5. Build prompt based on whether command has metadata ──
  if (cmdMeta) {
    // Slash command with resolved file — rich context prompt
    return `PURPOSE: Execute workflow step "${step.name}" (${stepIdx + 1}/${state.steps.length}).

COMMAND DEFINITION:
Name: ${cmdMeta.name}
Description: ${cmdMeta.description}
Argument Format: ${cmdMeta.argumentHint || 'none'}
Allowed Tools: ${cmdMeta.allowedTools || 'default'}
Source: ${cmdMeta.filePath}

COMMAND TO EXECUTE: ${step.command}
ARGUMENTS: ${cmdArgs || '(no arguments)'}

${testTaskSection}

COMMAND REFERENCE (first 30 lines):
${cmdMeta.bodySummary}

PROJECT: ${state.project_scenario}
SANDBOX PROJECT: ${state.sandbox_dir}
OUTPUT DIR: ${state.work_dir}/steps/step-${stepIdx + 1}

${priorContext}
${nextContext}
${testReqSection}

TASK: Execute the command as described in COMMAND DEFINITION, using TEST TASK as the input/scenario. Use the COMMAND REFERENCE to understand expected behavior. All work happens in the SANDBOX PROJECT directory (an isolated empty project, NOT the real workspace). Auto-confirm all prompts.
CONSTRAINTS: Stay scoped to this step only. Follow the command's own execution flow. The TEST TASK is the real work — treat it as the $ARGUMENTS input to the command. Do NOT read/modify files outside SANDBOX PROJECT.`;

  } else {
    // Shell command, ccw cli command, or unresolved command
    return `PURPOSE: Execute workflow step "${step.name}" (${stepIdx + 1}/${state.steps.length}).
COMMAND: ${step.command}
${testTaskSection}
PROJECT: ${state.project_scenario}
SANDBOX PROJECT: ${state.sandbox_dir}
OUTPUT DIR: ${state.work_dir}/steps/step-${stepIdx + 1}

${priorContext}
${nextContext}
${testReqSection}

TASK: Execute the COMMAND above with TEST TASK as the input scenario. All work happens in the SANDBOX PROJECT directory (an isolated empty project). Auto-confirm all prompts.
CONSTRAINTS: Stay scoped to this step only. The TEST TASK is the real work to execute. Do NOT read/modify files outside SANDBOX PROJECT.`;
  }
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step Execution
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const stepIdx = state.current_step;
|
||||||
|
const step = state.steps[stepIdx];
|
||||||
|
const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
|
||||||
|
|
||||||
|
// Pre-execution: snapshot sandbox directory files
|
||||||
|
const preFiles = Bash(`find "${state.sandbox_dir}" -type f 2>/dev/null | sort`).stdout.trim();
|
||||||
|
Write(`${stepDir}/pre-exec-snapshot.txt`, preFiles || '(empty)');
|
||||||
|
|
||||||
|
const startTime = Date.now();
|
||||||
|
const prompt = assembleStepPrompt(step, stepIdx, state);
|
||||||
|
|
||||||
|
// ★ All steps execute via ccw cli --tool claude --mode write
|
||||||
|
// ★ --cd 指向沙箱目录(独立项目),不影响真实工作空间
|
||||||
|
Bash({
|
||||||
|
command: `ccw cli -p ${escapeForShell(prompt)} --tool claude --mode write --rule universal-rigorous-style --cd "${state.sandbox_dir}"`,
|
||||||
|
run_in_background: true, timeout: 600000
|
||||||
|
});
|
||||||
|
// ■ STOP — wait for hook callback
|
||||||
|
```
|
||||||
|
|
||||||
|
### Post-Execute Callback Handler
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// ★ This runs after receiving the ccw cli callback
|
||||||
|
|
||||||
|
const duration = Date.now() - startTime;
|
||||||
|
|
||||||
|
// Collect artifacts by scanning sandbox (not git diff — sandbox is an independent project)
// Files newer than the pre-execution snapshot file were created/modified by this step.
const postFiles = Bash(`find "${state.sandbox_dir}" -type f -newer "${stepDir}/pre-exec-snapshot.txt" 2>/dev/null | sort`).stdout.trim();
// BUG FIX: `find -type f` emits file paths, which never end with '.git/',
// so the old `endsWith('.git/')` filter matched nothing and git-internal
// files (e.g. sandbox/.git/objects/**) leaked into the artifact list.
// Exclude anything under a .git directory instead.
const newArtifacts = postFiles ? postFiles.split('\n').filter(f => !f.includes('/.git/')) : [];

// Persist a manifest so the analyze phase (gemini) can review exactly what
// this step produced, without re-scanning the sandbox.
const artifactManifest = {
  step: step.name, step_index: stepIdx,
  success: true, duration_ms: duration,
  artifacts: newArtifacts.map(f => ({
    path: f,
    // Coarse typing by extension — only markdown/json are distinguished downstream.
    type: f.endsWith('.md') ? 'markdown' : f.endsWith('.json') ? 'json' : 'other'
  })),
  collected_at: new Date().toISOString()
};
Write(`${stepDir}/artifacts-manifest.json`, JSON.stringify(artifactManifest, null, 2));
|
||||||
|
|
||||||
|
// Update state
|
||||||
|
state.steps[stepIdx].status = 'executed';
|
||||||
|
state.steps[stepIdx].execution = {
|
||||||
|
success: true, duration_ms: duration,
|
||||||
|
artifact_count: newArtifacts.length
|
||||||
|
};
|
||||||
|
state.current_phase = 'analyze';
|
||||||
|
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
||||||
|
|
||||||
|
// → Proceed to Phase 3 for this step
|
||||||
|
```
|
||||||
|
|
||||||
|
## Phase 3: Analyze Step (per step, via gemini)
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const manifest = JSON.parse(Read(`${stepDir}/artifacts-manifest.json`));
|
||||||
|
|
||||||
|
// Build artifact content for analysis
|
||||||
|
let artifactSummary = '';
|
||||||
|
if (state.analysis_depth === 'quick') {
|
||||||
|
artifactSummary = manifest.artifacts.map(a => `- ${a.path} (${a.type})`).join('\n');
|
||||||
|
} else {
|
||||||
|
const maxLines = state.analysis_depth === 'deep' ? 300 : 150;
|
||||||
|
artifactSummary = manifest.artifacts.map(a => {
|
||||||
|
try { return `--- ${a.path} ---\n${Read(a.path, { limit: maxLines })}`; }
|
||||||
|
catch { return `--- ${a.path} --- [unreadable]`; }
|
||||||
|
}).join('\n\n');
|
||||||
|
}
|
||||||
|
|
||||||
|
const criteria = step.acceptance_criteria || [];
|
||||||
|
const testTaskDesc = step.test_task ? `TEST TASK: ${step.test_task}` : '';
|
||||||
|
const criteriaSection = criteria.length > 0
|
||||||
|
? `ACCEPTANCE CRITERIA:\n${criteria.map((c, i) => ` ${i + 1}. ${c}`).join('\n')}`
|
||||||
|
: '';
|
||||||
|
|
||||||
|
const analysisPrompt = `PURPOSE: Evaluate execution quality of step "${step.name}" (${stepIdx + 1}/${state.steps.length}).
|
||||||
|
WORKFLOW: ${state.workflow_name} — ${state.project_scenario}
|
||||||
|
COMMAND: ${step.command}
|
||||||
|
${testTaskDesc}
|
||||||
|
${criteriaSection}
|
||||||
|
EXECUTION: Duration ${step.execution.duration_ms}ms | Artifacts: ${manifest.artifacts.length}
|
||||||
|
ARTIFACTS:\n${artifactSummary}
|
||||||
|
EXPECTED OUTPUT (strict JSON):
|
||||||
|
{ "quality_score": <0-100>, "requirement_match": { "pass": <bool>, "criteria_met": [], "criteria_missed": [], "fail_signals_detected": [] }, "execution_assessment": { "success": <bool>, "completeness": "", "notes": "" }, "artifact_assessment": { "count": <n>, "quality": "", "key_outputs": [], "missing_outputs": [] }, "issues": [{ "severity": "critical|high|medium|low", "description": "", "suggestion": "" }], "optimization_opportunities": [{ "area": "", "description": "", "impact": "high|medium|low" }], "step_summary": "" }`;
|
||||||
|
|
||||||
|
let cliCommand = `ccw cli -p ${escapeForShell(analysisPrompt)} --tool gemini --mode analysis --rule analysis-review-code-quality`;
|
||||||
|
if (state.gemini_session_id) cliCommand += ` --resume ${state.gemini_session_id}`;
|
||||||
|
Bash({ command: cliCommand, run_in_background: true, timeout: 300000 });
|
||||||
|
// ■ STOP — wait for hook callback
|
||||||
|
```
|
||||||
|
|
||||||
|
### Post-Analyze Callback Handler
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// ★ Parse analysis result JSON from callback
|
||||||
|
const analysisResult = /* parsed from callback output */;
|
||||||
|
|
||||||
|
// ★ Capture gemini session ID for resume chain
|
||||||
|
// Session ID is in stderr: [CCW_EXEC_ID=gem-xxxxxx-xxxx]
|
||||||
|
state.gemini_session_id = /* captured from callback exec_id */;
|
||||||
|
|
||||||
|
Write(`${stepDir}/step-${stepIdx + 1}-analysis.json`, JSON.stringify(analysisResult, null, 2));
|
||||||
|
|
||||||
|
// Update state
|
||||||
|
state.steps[stepIdx].analysis = {
|
||||||
|
quality_score: analysisResult.quality_score,
|
||||||
|
requirement_pass: analysisResult.requirement_match?.pass,
|
||||||
|
issue_count: (analysisResult.issues || []).length
|
||||||
|
};
|
||||||
|
state.steps[stepIdx].status = 'completed';
|
||||||
|
|
||||||
|
// Append to process log
|
||||||
|
const logEntry = `## Step ${stepIdx + 1}: ${step.name}\n- Score: ${analysisResult.quality_score}/100\n- Req: ${analysisResult.requirement_match?.pass ? 'PASS' : 'FAIL'}\n- Issues: ${(analysisResult.issues || []).length}\n- Summary: ${analysisResult.step_summary}\n\n`;
|
||||||
|
Edit(`${state.work_dir}/process-log.md`, /* append logEntry */);
|
||||||
|
|
||||||
|
// ★ Advance state machine
|
||||||
|
state.current_step = stepIdx + 1;
|
||||||
|
state.current_phase = 'execute';
|
||||||
|
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
||||||
|
|
||||||
|
// ★ Decision: advance or synthesize
|
||||||
|
if (state.current_step < state.steps.length) {
|
||||||
|
// → Back to Phase 2 for next step
|
||||||
|
} else {
|
||||||
|
// → Phase 4: Synthesize
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Step Loop — State Machine

```
NOT a sync for-loop. Each step follows this state machine:

┌─────────────────────────────────────────────────────┐
│ state.current_step = N, state.current_phase = X     │
├─────────────────────────────────────────────────────┤
│ phase='execute' → Phase 2 → ccw cli claude → STOP   │
│ callback → collect artifacts → phase='analyze'      │
│ phase='analyze' → Phase 3 → ccw cli gemini → STOP   │
│ callback → save analysis → current_step++           │
│ if current_step < total → phase='execute' (loop)    │
│ else → Phase 4 (synthesize)                         │
└─────────────────────────────────────────────────────┘

Error handling:
- Execute timeout → retry once, then mark failed, advance
- Analyze failure → retry without --resume, then skip analysis
- 3+ consecutive errors → terminate, jump to Phase 5 partial report
```
|
||||||
|
|
||||||
|
## Phase 4: Synthesize (via gemini)
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const stepAnalyses = state.steps.map((step, i) => {
|
||||||
|
try { return { step: step.name, content: Read(`${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.json`) }; }
|
||||||
|
catch { return { step: step.name, content: '[Not available]' }; }
|
||||||
|
});
|
||||||
|
|
||||||
|
const scores = state.steps.map(s => s.analysis?.quality_score).filter(Boolean);
|
||||||
|
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
|
||||||
|
|
||||||
|
const synthesisPrompt = `PURPOSE: Synthesize all step analyses into holistic workflow assessment with actionable optimization plan.
|
||||||
|
WORKFLOW: ${state.workflow_name} — ${state.project_scenario}
|
||||||
|
Steps: ${state.steps.length} | Avg Quality: ${avgScore}/100
|
||||||
|
STEP ANALYSES:\n${stepAnalyses.map(a => `### ${a.step}\n${a.content}`).join('\n\n---\n\n')}
|
||||||
|
Evaluate: coherence across steps, handoff quality, redundancy, bottlenecks.
|
||||||
|
EXPECTED OUTPUT (strict JSON):
|
||||||
|
{ "workflow_score": <0-100>, "coherence": { "score": <0-100>, "assessment": "", "gaps": [] }, "bottlenecks": [{ "step": "", "issue": "", "suggestion": "" }], "per_step_improvements": [{ "step": "", "priority": "high|medium|low", "action": "" }], "workflow_improvements": [{ "area": "", "description": "", "impact": "high|medium|low" }], "summary": "" }`;
|
||||||
|
|
||||||
|
let cliCommand = `ccw cli -p ${escapeForShell(synthesisPrompt)} --tool gemini --mode analysis --rule analysis-review-architecture`;
|
||||||
|
if (state.gemini_session_id) cliCommand += ` --resume ${state.gemini_session_id}`;
|
||||||
|
Bash({ command: cliCommand, run_in_background: true, timeout: 300000 });
|
||||||
|
// ■ STOP — wait for hook callback → parse JSON, write synthesis.json, update state
|
||||||
|
```
|
||||||
|
|
||||||
|
## Phase 5: Report
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
const synthesis = JSON.parse(Read(`${state.work_dir}/synthesis.json`));
|
||||||
|
const scores = state.steps.map(s => s.analysis?.quality_score).filter(Boolean);
|
||||||
|
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
|
||||||
|
const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
|
||||||
|
|
||||||
|
const stepTable = state.steps.map((s, i) => {
|
||||||
|
const reqStr = s.analysis?.requirement_pass === true ? 'PASS' : s.analysis?.requirement_pass === false ? 'FAIL' : '-';
|
||||||
|
return `| ${i + 1} | ${s.name} | ${s.execution?.success ? 'OK' : 'FAIL'} | ${reqStr} | ${s.analysis?.quality_score || '-'} | ${s.analysis?.issue_count || 0} |`;
|
||||||
|
}).join('\n');
|
||||||
|
|
||||||
|
const improvements = (synthesis.per_step_improvements || [])
|
||||||
|
.filter(imp => imp.priority === 'high')
|
||||||
|
.map(imp => `- **${imp.step}**: ${imp.action}`)
|
||||||
|
.join('\n');
|
||||||
|
|
||||||
|
const report = `# Workflow Tune Report
|
||||||
|
|
||||||
|
| Field | Value |
|
||||||
|
|---|---|
|
||||||
|
| Workflow | ${state.workflow_name} |
|
||||||
|
| Test Project | ${state.project_scenario} |
|
||||||
|
| Workflow Score | ${synthesis.workflow_score || avgScore}/100 |
|
||||||
|
| Avg Step Score | ${avgScore}/100 |
|
||||||
|
| Total Issues | ${totalIssues} |
|
||||||
|
| Coherence | ${synthesis.coherence?.score || '-'}/100 |
|
||||||
|
|
||||||
|
## Step Results
|
||||||
|
|
||||||
|
| # | Step | Exec | Req | Quality | Issues |
|
||||||
|
|---|------|------|-----|---------|--------|
|
||||||
|
${stepTable}
|
||||||
|
|
||||||
|
## High Priority Improvements
|
||||||
|
|
||||||
|
${improvements || 'None'}
|
||||||
|
|
||||||
|
## Workflow-Level Improvements
|
||||||
|
|
||||||
|
${(synthesis.workflow_improvements || []).map(w => `- **${w.area}** (${w.impact}): ${w.description}`).join('\n') || 'None'}
|
||||||
|
|
||||||
|
## Bottlenecks
|
||||||
|
|
||||||
|
${(synthesis.bottlenecks || []).map(b => `- **${b.step}**: ${b.issue} → ${b.suggestion}`).join('\n') || 'None'}
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
${synthesis.summary || 'N/A'}
|
||||||
|
`;
|
||||||
|
|
||||||
|
Write(`${state.work_dir}/final-report.md`, report);
|
||||||
|
state.status = 'completed';
|
||||||
|
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
||||||
|
|
||||||
|
// Output report to user
|
||||||
|
```
|
||||||
|
|
||||||
|
## Resume Chain

```
Step 1 Execute → ccw cli claude --mode write --rule universal-rigorous-style --cd step-1/ → STOP → callback → artifacts
Step 1 Analyze → ccw cli gemini --mode analysis --rule analysis-review-code-quality → STOP → callback → gemini_session_id = exec_id
Step 2 Execute → ccw cli claude --mode write --rule universal-rigorous-style --cd step-2/ → STOP → callback → artifacts
Step 2 Analyze → ccw cli gemini --mode analysis --resume gemini_session_id → STOP → callback → gemini_session_id = exec_id
...
Synthesize → ccw cli gemini --mode analysis --resume gemini_session_id → STOP → callback → synthesis
Report → local generation (no CLI call)
```
|
||||||
|
|
||||||
|
## Error Handling

| Phase | Error | Recovery |
|-------|-------|----------|
| Execute | CLI timeout | Retry once, then mark step failed and advance |
| Execute | Command not found | Skip step, note in process-log |
| Analyze | CLI fails | Retry without --resume, then skip analysis |
| Synthesize | CLI fails | Generate report from step analyses only |
| Any | 3+ consecutive errors | Terminate, produce partial report |
|
||||||
|
|
||||||
|
## Core Rules

1. **STOP After Each CLI Call**: Every `ccw cli` call runs in background — STOP output immediately, wait for hook callback
2. **State Machine**: Advance via `current_step` + `current_phase`, never use sync loops for async operations
3. **Test Task Drives Execution**: 每个命令必须有 test_task(完整需求说明),作为命令的 $ARGUMENTS 输入。test_task 由当前 Claude 直接根据命令链复杂度生成,不需要额外 CLI 调用
4. **All Execution via claude**: `ccw cli --tool claude --mode write --rule universal-rigorous-style`
5. **All Analysis via gemini**: `ccw cli --tool gemini --mode analysis`, chained via `--resume`
6. **Session Capture**: After each gemini callback, capture exec_id → `gemini_session_id` for resume chain
7. **Sandbox Isolation**: 所有命令在独立沙箱目录(`sandbox/`)中执行,使用虚构测试任务,不影响真实项目
8. **Artifact Collection**: Scan sandbox filesystem (not git diff), compare pre/post snapshots
9. **Prompt Assembly**: Every step goes through `assembleStepPrompt()` — resolves command file, reads YAML metadata, injects test_task, builds rich context
10. **Auto-Confirm**: All prompts auto-confirmed, no blocking interactions during execution
|
||||||
@@ -1,495 +0,0 @@
|
|||||||
---
|
|
||||||
name: workflow-tune
|
|
||||||
description: Workflow tuning skill for multi-command/skill pipelines. Executes each step sequentially, inspects artifacts after each command, analyzes quality via ccw cli resume, builds process documentation, and generates optimization suggestions. Triggers on "workflow tune", "tune workflow", "workflow optimization".
|
|
||||||
allowed-tools: Skill, Agent, AskUserQuestion, TaskCreate, TaskUpdate, TaskList, Read, Write, Edit, Bash, Glob, Grep
|
|
||||||
---
|
|
||||||
|
|
||||||
# Workflow Tune
|
|
||||||
|
|
||||||
Tune multi-step workflows composed of commands or skills. Execute each step, inspect artifacts, analyze via ccw cli resume, build process documentation, and produce actionable optimization suggestions.
|
|
||||||
|
|
||||||
## Architecture Overview
|
|
||||||
|
|
||||||
```
|
|
||||||
┌──────────────────────────────────────────────────────────────────────┐
|
|
||||||
│ Workflow Tune Orchestrator (SKILL.md) │
|
|
||||||
│ → Parse → Decompose → Confirm → Setup → Step Loop → Synthesize │
|
|
||||||
└──────────────────────┬───────────────────────────────────────────────┘
|
|
||||||
│
|
|
||||||
┌───────────────────┼───────────────────────────────────┐
|
|
||||||
↓ ↓ ↓
|
|
||||||
┌──────────┐ ┌─────────────────────────────┐ ┌──────────────┐
|
|
||||||
│ Phase 1 │ │ Step Loop (2→3 per step) │ │ Phase 4 + 5 │
|
|
||||||
│ Setup │ │ ┌─────┐ ┌─────┐ │ │ Synthesize + │
|
|
||||||
│ │──→│ │ P2 │→ │ P3 │ │────→│ Report │
|
|
||||||
│ Parse + │ │ │Exec │ │Anal │ │ │ │
|
|
||||||
│ Decomp + │ │ └─────┘ └─────┘ │ └──────────────┘
|
|
||||||
│ Confirm │ │ ↑ │ next step │
|
|
||||||
└──────────┘ │ └───────┘ │
|
|
||||||
└─────────────────────────────┘
|
|
||||||
|
|
||||||
Phase 1 Detail:
|
|
||||||
Input → [Format 1-3: direct parse] ──→ Command Doc → Confirm → Init
|
|
||||||
→ [Format 4: natural lang] ──→ Semantic Decompose → Command Doc → Confirm → Init
|
|
||||||
```
|
|
||||||
|
|
||||||
## Key Design Principles
|
|
||||||
|
|
||||||
1. **Test-First Evaluation**: Auto-generate per-step acceptance criteria before execution — judge results against concrete requirements, not vague quality
|
|
||||||
2. **Step-by-Step Execution**: Each workflow step executes independently, artifacts inspected before proceeding
|
|
||||||
3. **Resume-Based Analysis**: Uses ccw cli `--resume` to maintain analysis context across steps
|
|
||||||
4. **Process Documentation**: Running `process-log.md` accumulates observations per step
|
|
||||||
5. **Two-Tool Pipeline**: Claude/target tool (execute) + Gemini (analyze) = complementary perspectives
|
|
||||||
6. **Pure Orchestrator**: SKILL.md coordinates only — execution detail in phase files
|
|
||||||
7. **Progressive Phase Loading**: Phase docs read only when that phase executes
|
|
||||||
|
|
||||||
## Interactive Preference Collection
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// ★ Auto mode detection
|
|
||||||
const autoYes = /\b(-y|--yes)\b/.test($ARGUMENTS)
|
|
||||||
|
|
||||||
if (autoYes) {
|
|
||||||
workflowPreferences = {
|
|
||||||
autoYes: true,
|
|
||||||
analysisDepth: 'standard',
|
|
||||||
autoFix: false
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
const prefResponse = AskUserQuestion({
|
|
||||||
questions: [
|
|
||||||
{
|
|
||||||
question: "选择 Workflow 调优配置:",
|
|
||||||
header: "Tune Config",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "Quick (轻量分析)", description: "每步简要检查,快速产出建议" },
|
|
||||||
{ label: "Standard (标准分析) (Recommended)", description: "每步详细分析,完整过程文档" },
|
|
||||||
{ label: "Deep (深度分析)", description: "每步深度审查,含性能和架构建议" }
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
question: "是否自动应用优化建议?",
|
|
||||||
header: "Auto Fix",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "No (仅生成报告) (Recommended)", description: "只分析,不修改" },
|
|
||||||
{ label: "Yes (自动应用)", description: "分析后自动应用高优先级建议" }
|
|
||||||
]
|
|
||||||
}
|
|
||||||
]
|
|
||||||
})
|
|
||||||
|
|
||||||
const depthMap = {
|
|
||||||
"Quick": "quick",
|
|
||||||
"Standard": "standard",
|
|
||||||
"Deep": "deep"
|
|
||||||
}
|
|
||||||
const selectedDepth = Object.keys(depthMap).find(k =>
|
|
||||||
prefResponse["Tune Config"].startsWith(k)
|
|
||||||
) || "Standard"
|
|
||||||
|
|
||||||
workflowPreferences = {
|
|
||||||
autoYes: false,
|
|
||||||
analysisDepth: depthMap[selectedDepth],
|
|
||||||
autoFix: prefResponse["Auto Fix"].startsWith("Yes")
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Input Processing
|
|
||||||
|
|
||||||
```
|
|
||||||
$ARGUMENTS → Parse:
|
|
||||||
├─ Workflow definition: one of:
|
|
||||||
│ ├─ Format 1: Inline steps — "step1 | step2 | step3" (pipe-separated commands)
|
|
||||||
│ ├─ Format 2: Skill names — "skill-a,skill-b,skill-c" (comma-separated)
|
|
||||||
│ ├─ Format 3: File path — "--file workflow.json" (JSON definition)
|
|
||||||
│ └─ Format 4: Natural language — free-text description, auto-decomposed into steps
|
|
||||||
├─ Test context: --context "description of what the workflow should achieve"
|
|
||||||
└─ Flags: --depth quick|standard|deep, -y/--yes, --auto-fix
|
|
||||||
```
|
|
||||||
|
|
||||||
### Format Detection Priority

```
1. --file flag present → Format 3 (JSON)
2. Contains pipe "|" → Format 1 (inline commands)
3. Matches skill-name pattern → Format 2 (comma-separated skills)
4. Everything else → Format 4 (natural language → semantic decomposition)
   4a. Contains file path? → Read file as reference doc, extract workflow steps via LLM
   4b. Pure intent text → Direct intent-verb matching (intentMap)
```
|
|
||||||
|
|
||||||
### Format 4: Semantic Decomposition (Natural Language)
|
|
||||||
|
|
||||||
When input is free-text (e.g., "分析 src 目录代码质量,做代码评审,然后修复高优先级问题"), the orchestrator:
|
|
||||||
|
|
||||||
#### 4a. Reference Document Mode (input contains file path)
|
|
||||||
|
|
||||||
When the input contains a file path (e.g., `d:\maestro2\guide\command-usage-guide.md 提取核心工作流`), the orchestrator:
|
|
||||||
|
|
||||||
1. **Detect File Path**: Extract file path from input via path pattern matching
|
|
||||||
2. **Read Reference Doc**: Read the file content as workflow reference material
|
|
||||||
3. **Extract Workflow via LLM**: Use Gemini to analyze the document + user intent, extract executable workflow steps
|
|
||||||
4. **Step Chain Generation**: LLM returns structured step chain with commands, tools, and execution order
|
|
||||||
5. **Command Doc + Confirmation**: Same as 4b below
|
|
||||||
|
|
||||||
#### 4b. Pure Intent Mode (no file path)
|
|
||||||
|
|
||||||
1. **Semantic Parse**: Identify intent verbs and targets → map to available skills/commands
|
|
||||||
2. **Step Chain Generation**: Produce ordered step chain with tool/mode selection
|
|
||||||
3. **Command Doc**: Generate formatted execution plan document
|
|
||||||
4. **User Confirmation**: Display plan, ask user to confirm/edit before execution
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// ★ 4a: If input contains file path → read file, extract workflow via LLM
|
|
||||||
// Detect paths: Windows (D:\path\file.md), Unix (/path/file.md), relative (./file.md)
|
|
||||||
// Read file content → send to Gemini with user intent → get executable step chain
|
|
||||||
// See phases/01-setup.md Step 1.1b Mode 4a
|
|
||||||
|
|
||||||
// ★ 4b: Pure intent text → regex-based intent-to-tool mapping
|
|
||||||
const intentMap = {
|
|
||||||
'分析|analyze|审查|inspect|scan': { tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
|
|
||||||
'评审|review|code review': { tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
|
|
||||||
// ... (full map in phases/01-setup.md Step 1.1b Mode 4b)
|
|
||||||
};
|
|
||||||
|
|
||||||
// Match input segments to intents, produce step chain
|
|
||||||
// See phases/01-setup.md Step 1.1b for full algorithm
|
|
||||||
```
|
|
||||||
|
|
||||||
### Workflow JSON Format (when using --file)
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"name": "my-workflow",
|
|
||||||
"description": "What this workflow achieves",
|
|
||||||
"steps": [
|
|
||||||
{
|
|
||||||
"name": "step-1-name",
|
|
||||||
"type": "skill|command|ccw-cli",
|
|
||||||
"command": "/skill-name args" | "ccw cli -p '...' --tool gemini --mode analysis",
|
|
||||||
"expected_artifacts": ["output.md", "report.json"],
|
|
||||||
"success_criteria": "description of what success looks like"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Inline Step Parsing
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Pipe-separated: each segment is a command
|
|
||||||
// "ccw cli -p 'analyze' --tool gemini --mode analysis | /review-code src/ | ccw cli -p 'fix' --tool claude --mode write"
|
|
||||||
const steps = input.split('|').map((cmd, i) => ({
|
|
||||||
name: `step-${i + 1}`,
|
|
||||||
type: cmd.trim().startsWith('/') ? 'skill' : 'command',
|
|
||||||
command: cmd.trim(),
|
|
||||||
expected_artifacts: [],
|
|
||||||
success_criteria: ''
|
|
||||||
}));
|
|
||||||
```
|
|
||||||
|
|
||||||
## Pre-Execution Confirmation
|
|
||||||
|
|
||||||
After parsing (all formats) or decomposition (Format 4), generate a **Command Document** and ask for user confirmation before executing.
|
|
||||||
|
|
||||||
### Command Document Format
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# Workflow Tune — Execution Plan
|
|
||||||
|
|
||||||
**Workflow**: {name}
|
|
||||||
**Goal**: {context}
|
|
||||||
**Steps**: {count}
|
|
||||||
**Analysis Depth**: {depth}
|
|
||||||
|
|
||||||
## Step Chain
|
|
||||||
|
|
||||||
| # | Name | Type | Command | Tool | Mode |
|
|
||||||
|---|------|------|---------|------|------|
|
|
||||||
| 1 | {name} | {type} | {command} | {tool} | {mode} |
|
|
||||||
| 2 | ... | ... | ... | ... | ... |
|
|
||||||
|
|
||||||
## Execution Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
Step 1: {name}
|
|
||||||
→ Command: {command}
|
|
||||||
→ Expected: {artifacts}
|
|
||||||
→ Feeds into: Step 2
|
|
||||||
↓
|
|
||||||
Step 2: {name}
|
|
||||||
→ Command: {command}
|
|
||||||
→ Expected: {artifacts}
|
|
||||||
→ Feeds into: Step 3
|
|
||||||
↓
|
|
||||||
...
|
|
||||||
```
|
|
||||||
|
|
||||||
## Estimated Scope
|
|
||||||
|
|
||||||
- Total CLI calls: {N} (execute) + {N} (analyze) + 1 (synthesize)
|
|
||||||
- Analysis tool: gemini (--resume chain)
|
|
||||||
- Process documentation: process-log.md (accumulated)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Confirmation Flow
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// ★ Skip confirmation only if -y/--yes flag
|
|
||||||
if (!workflowPreferences.autoYes) {
|
|
||||||
// Display command document to user
|
|
||||||
// Output the formatted plan (direct text output, NOT a file)
|
|
||||||
|
|
||||||
const confirmation = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "确认执行以上 Workflow 调优计划?",
|
|
||||||
header: "Confirm Execution",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "Execute (确认执行)", description: "按计划开始执行" },
|
|
||||||
{ label: "Edit steps (修改步骤)", description: "我想调整某些步骤" },
|
|
||||||
{ label: "Cancel (取消)", description: "取消本次调优" }
|
|
||||||
]
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
|
|
||||||
if (confirmation["Confirm Execution"].startsWith("Cancel")) {
|
|
||||||
// Abort workflow
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (confirmation["Confirm Execution"].startsWith("Edit")) {
|
|
||||||
// Ask user for modifications
|
|
||||||
const editResponse = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "请描述要修改的内容(如:删除步骤2、步骤3改用codex、在步骤1后加入安全扫描):",
|
|
||||||
header: "Edit Steps"
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
// Apply user edits to steps[] → re-display command doc → re-confirm
|
|
||||||
// (recursive confirmation loop until Execute or Cancel)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Execution Flow
|
|
||||||
|
|
||||||
> **COMPACT DIRECTIVE**: Context compression MUST check TaskUpdate phase status.
|
|
||||||
> The phase currently marked `in_progress` is the active execution phase — preserve its FULL content.
|
|
||||||
> Only compress phases marked `completed` or `pending`.
|
|
||||||
|
|
||||||
### Phase 1: Setup (one-time)
|
|
||||||
|
|
||||||
Read and execute: `Ref: phases/01-setup.md`
|
|
||||||
|
|
||||||
- Parse workflow steps from input (Format 1-3: direct parse, Format 4: semantic decomposition)
|
|
||||||
- Generate Command Document (formatted execution plan)
|
|
||||||
- **User Confirmation**: Display plan, wait for confirm/edit/cancel
|
|
||||||
- **Generate Test Requirements**: Auto-create per-step acceptance criteria via Gemini (expected outputs, content signals, quality thresholds, pass/fail criteria, handoff contracts)
|
|
||||||
- Create workspace at `.workflow/.scratchpad/workflow-tune-{ts}/`
|
|
||||||
- Initialize workflow-state.json (with test_requirements per step)
|
|
||||||
- Create process-log.md template
|
|
||||||
|
|
||||||
Output: `workDir`, `steps[]` (with test_requirements), `workflowContext`, `commandDoc`, initialized state
|
|
||||||
|
|
||||||
### Step Loop (Phase 2 + Phase 3, per step)
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Track analysis session ID for resume chain
|
|
||||||
let analysisSessionId = null;
|
|
||||||
|
|
||||||
for (let stepIdx = 0; stepIdx < state.steps.length; stepIdx++) {
|
|
||||||
const step = state.steps[stepIdx];
|
|
||||||
|
|
||||||
TaskUpdate(stepLoopTask, {
|
|
||||||
subject: `Step ${stepIdx + 1}/${state.steps.length}: ${step.name}`,
|
|
||||||
status: 'in_progress'
|
|
||||||
});
|
|
||||||
|
|
||||||
// === Phase 2: Execute Step ===
|
|
||||||
// Read: phases/02-step-execute.md
|
|
||||||
// Execute command/skill → collect artifacts
|
|
||||||
// Write step-{N}-artifacts-manifest.json
|
|
||||||
|
|
||||||
// === Phase 3: Analyze Step ===
|
|
||||||
// Read: phases/03-step-analyze.md
|
|
||||||
// Inspect artifacts → ccw cli gemini --resume analysisSessionId
|
|
||||||
// Write step-{N}-analysis.md → append to process-log.md
|
|
||||||
// Update analysisSessionId for next step's resume
|
|
||||||
|
|
||||||
// Update state
|
|
||||||
state.steps[stepIdx].status = 'completed';
|
|
||||||
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Phase 2: Execute Step (per step)
|
|
||||||
|
|
||||||
Read and execute: `Ref: phases/02-step-execute.md`
|
|
||||||
|
|
||||||
- Create step working directory
|
|
||||||
- Execute command/skill via ccw cli or Skill tool
|
|
||||||
- Collect output artifacts
|
|
||||||
- Write artifacts manifest
|
|
||||||
|
|
||||||
### Phase 3: Analyze Step (per step)
|
|
||||||
|
|
||||||
Read and execute: `Ref: phases/03-step-analyze.md`
|
|
||||||
|
|
||||||
- Inspect step artifacts (file list, content summary, quality signals)
|
|
||||||
- **Compare actual output against test requirements** (pass_criteria, content_signals, fail_signals, handoff_contract)
|
|
||||||
- Build analysis prompt with step context + test requirements + previous process log
|
|
||||||
- Execute: `ccw cli --tool gemini --mode analysis [--resume sessionId]`
|
|
||||||
- Parse analysis → write step-{N}-analysis.md (with requirement match: PASS/FAIL)
|
|
||||||
- Append findings to process-log.md
|
|
||||||
- Return analysis session ID for resume chain
|
|
||||||
|
|
||||||
### Phase 4: Synthesize (one-time)
|
|
||||||
|
|
||||||
Read and execute: `Ref: phases/04-synthesize.md`
|
|
||||||
|
|
||||||
- Read complete process-log.md + all step analyses
|
|
||||||
- Build synthesis prompt with full workflow context
|
|
||||||
- Execute: `ccw cli --tool gemini --mode analysis --resume analysisSessionId`
|
|
||||||
- Generate cross-step optimization insights
|
|
||||||
- Write synthesis.md
|
|
||||||
|
|
||||||
### Phase 5: Optimization Report (one-time)
|
|
||||||
|
|
||||||
Read and execute: `Ref: phases/05-optimize-report.md`
|
|
||||||
|
|
||||||
- Aggregate all analyses and synthesis
|
|
||||||
- Generate structured optimization report
|
|
||||||
- Optionally apply high-priority fixes (if autoFix enabled)
|
|
||||||
- Write final-report.md
|
|
||||||
- Display summary to user
|
|
||||||
|
|
||||||
**Phase Reference Documents**:
|
|
||||||
|
|
||||||
| Phase | Document | Purpose | Compact |
|
|
||||||
|-------|----------|---------|---------|
|
|
||||||
| 1 | [phases/01-setup.md](phases/01-setup.md) | Initialize workspace and state | TaskUpdate driven |
|
|
||||||
| 2 | [phases/02-step-execute.md](phases/02-step-execute.md) | Execute workflow step | TaskUpdate driven |
|
|
||||||
| 3 | [phases/03-step-analyze.md](phases/03-step-analyze.md) | Analyze step artifacts | TaskUpdate driven + resume |
|
|
||||||
| 4 | [phases/04-synthesize.md](phases/04-synthesize.md) | Cross-step synthesis | TaskUpdate driven + resume |
|
|
||||||
| 5 | [phases/05-optimize-report.md](phases/05-optimize-report.md) | Generate final report | TaskUpdate driven |
|
|
||||||
|
|
||||||
## Data Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
User Input (workflow steps / natural language + context)
|
|
||||||
↓
|
|
||||||
Phase 1: Setup
|
|
||||||
├─ [Format 1-3] Direct parse → steps[]
|
|
||||||
├─ [Format 4a] File path detected → Read doc → LLM extract → steps[]
|
|
||||||
├─ [Format 4b] Pure intent text → Regex intent matching → steps[]
|
|
||||||
↓
|
|
||||||
Command Document (formatted plan)
|
|
||||||
↓
|
|
||||||
User Confirmation (Execute / Edit / Cancel)
|
|
||||||
↓ (Execute confirmed)
|
|
||||||
↓
|
|
||||||
Generate Test Requirements (Gemini) → per-step acceptance criteria
|
|
||||||
↓ workDir, steps[] (with test_requirements), workflow-state.json, process-log.md
|
|
||||||
↓
|
|
||||||
┌─→ Phase 2: Execute Step N (ccw cli / Skill)
|
|
||||||
│ ↓ step-N/ artifacts
|
|
||||||
│ ↓
|
|
||||||
│ Phase 3: Analyze Step N (ccw cli gemini --resume)
|
|
||||||
│ ↓ step-N-analysis.md, process-log.md updated
|
|
||||||
│ ↓ analysisSessionId carried forward
|
|
||||||
│ ↓
|
|
||||||
│ [More steps?]─── YES ──→ next step (Phase 2)
|
|
||||||
│ ↓ NO
|
|
||||||
│ ↓
|
|
||||||
└───┘
|
|
||||||
↓
|
|
||||||
Phase 4: Synthesize (ccw cli gemini --resume)
|
|
||||||
↓ synthesis.md
|
|
||||||
↓
|
|
||||||
Phase 5: Report
|
|
||||||
↓ final-report.md + optional auto-fix
|
|
||||||
↓
|
|
||||||
Done
|
|
||||||
```
|
|
||||||
|
|
||||||
## TaskUpdate Pattern
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Initial state
|
|
||||||
TaskCreate({ subject: "Phase 1: Setup workspace", activeForm: "Parsing workflow" })
|
|
||||||
TaskCreate({ subject: "Step Loop", activeForm: "Executing steps" })
|
|
||||||
TaskCreate({ subject: "Phase 4-5: Synthesize & Report", activeForm: "Pending" })
|
|
||||||
|
|
||||||
// Per-step tracking
|
|
||||||
for (const step of state.steps) {
|
|
||||||
TaskCreate({
|
|
||||||
subject: `Step: ${step.name}`,
|
|
||||||
activeForm: `Pending`,
|
|
||||||
description: `${step.type}: ${step.command}`
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
// During step execution
|
|
||||||
TaskUpdate(stepTask, {
|
|
||||||
subject: `Step: ${step.name} — Executing`,
|
|
||||||
activeForm: `Running ${step.command}`
|
|
||||||
})
|
|
||||||
|
|
||||||
// After step analysis
|
|
||||||
TaskUpdate(stepTask, {
|
|
||||||
subject: `Step: ${step.name} — Analyzed`,
|
|
||||||
activeForm: `Quality: ${stepQuality} | Issues: ${issueCount}`,
|
|
||||||
status: 'completed'
|
|
||||||
})
|
|
||||||
|
|
||||||
// Final
|
|
||||||
TaskUpdate(synthesisTask, {
|
|
||||||
subject: `Synthesis & Report (${state.steps.length} steps, ${totalIssues} issues)`,
|
|
||||||
status: 'completed'
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
## Resume Chain Strategy
|
|
||||||
|
|
||||||
```
|
|
||||||
Step 1 Execute → artifacts
|
|
||||||
Step 1 Analyze → ccw cli gemini --mode analysis → sessionId_1
|
|
||||||
Step 2 Execute → artifacts
|
|
||||||
Step 2 Analyze → ccw cli gemini --mode analysis --resume sessionId_1 → sessionId_2
|
|
||||||
...
|
|
||||||
Step N Analyze → sessionId_N
|
|
||||||
Synthesize → ccw cli gemini --mode analysis --resume sessionId_N → final
|
|
||||||
```
|
|
||||||
|
|
||||||
Each analysis step resumes the previous session, maintaining full context of:
|
|
||||||
- All prior step observations
|
|
||||||
- Accumulated quality patterns
|
|
||||||
- Cross-step dependency insights
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Phase | Error | Recovery |
|
|
||||||
|-------|-------|----------|
|
|
||||||
| 2: Execute | CLI timeout/crash | Retry once, then record failure and continue to next step |
|
|
||||||
| 2: Execute | Skill not found | Skip step, note in process-log |
|
|
||||||
| 3: Analyze | CLI fails | Retry without --resume, start fresh session |
|
|
||||||
| 3: Analyze | Resume session not found | Start fresh analysis session |
|
|
||||||
| 4: Synthesize | CLI fails | Generate report from individual step analyses only |
|
|
||||||
| Any | 3+ consecutive errors | Terminate with partial report |
|
|
||||||
|
|
||||||
**Error Budget**: Each step gets 1 retry. 3 consecutive failures triggers early termination.
|
|
||||||
|
|
||||||
## Core Rules
|
|
||||||
|
|
||||||
1. **Start Immediately**: First action is preference collection → Phase 1 setup
|
|
||||||
2. **Progressive Loading**: Read phase doc ONLY when that phase is about to execute
|
|
||||||
3. **Inspect Before Proceed**: Always check step artifacts before moving to next step
|
|
||||||
4. **Background CLI**: ccw cli runs in background, wait for hook callback before proceeding
|
|
||||||
5. **Resume Chain**: Maintain analysis session continuity via --resume
|
|
||||||
6. **Process Documentation**: Every step observation goes into process-log.md
|
|
||||||
7. **Single State Source**: `workflow-state.json` is the only source of truth
|
|
||||||
8. **DO NOT STOP**: Continuous execution until all steps processed
|
|
||||||
@@ -1,670 +0,0 @@
|
|||||||
# Phase 1: Setup
|
|
||||||
|
|
||||||
Initialize workspace, parse workflow definition, semantic decomposition for natural language, generate command document, user confirmation, create state and process log.
|
|
||||||
|
|
||||||
## Objective
|
|
||||||
|
|
||||||
- Parse workflow steps from user input (Format 1-3: direct parse, Format 4: semantic decomposition)
|
|
||||||
- Generate Command Document (formatted execution plan)
|
|
||||||
- User confirmation: Execute / Edit steps / Cancel
|
|
||||||
- Validate step commands/skill paths
|
|
||||||
- Create isolated workspace directory
|
|
||||||
- Initialize workflow-state.json and process-log.md
|
|
||||||
|
|
||||||
## Execution
|
|
||||||
|
|
||||||
### Step 1.1: Parse Input
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const args = $ARGUMENTS.trim();
|
|
||||||
|
|
||||||
// Detect input format
|
|
||||||
let steps = [];
|
|
||||||
let workflowName = 'unnamed-workflow';
|
|
||||||
let workflowContext = '';
|
|
||||||
|
|
||||||
// Format 1: JSON file (--file path)
|
|
||||||
const fileMatch = args.match(/--file\s+"?([^\s"]+)"?/);
|
|
||||||
if (fileMatch) {
|
|
||||||
const wfDef = JSON.parse(Read(fileMatch[1]));
|
|
||||||
workflowName = wfDef.name || 'unnamed-workflow';
|
|
||||||
workflowContext = wfDef.description || '';
|
|
||||||
steps = wfDef.steps;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Format 2: Pipe-separated commands ("cmd1 | cmd2 | cmd3")
|
|
||||||
else if (args.includes('|')) {
|
|
||||||
const rawSteps = args.split(/(?:--context|--depth|-y|--yes|--auto-fix)\s+("[^"]*"|\S+)/)[0];
|
|
||||||
steps = rawSteps.split('|').map((cmd, i) => ({
|
|
||||||
name: `step-${i + 1}`,
|
|
||||||
type: cmd.trim().startsWith('/') ? 'skill'
|
|
||||||
: cmd.trim().startsWith('ccw cli') ? 'ccw-cli'
|
|
||||||
: 'command',
|
|
||||||
command: cmd.trim(),
|
|
||||||
expected_artifacts: [],
|
|
||||||
success_criteria: ''
|
|
||||||
}));
|
|
||||||
}
|
|
||||||
|
|
||||||
// Format 3: Comma-separated skill names (matches pattern: word,word or word-word,word-word)
|
|
||||||
else if (/^[\w-]+(,[\w-]+)+/.test(args.split(/\s/)[0])) {
|
|
||||||
const skillPart = args.match(/^([^\s]+)/);
|
|
||||||
const skillNames = skillPart ? skillPart[1].split(',') : [];
|
|
||||||
steps = skillNames.map((name, i) => {
|
|
||||||
const skillPath = name.startsWith('.claude/') ? name : `.claude/skills/${name}`;
|
|
||||||
return {
|
|
||||||
name: name.replace('.claude/skills/', ''),
|
|
||||||
type: 'skill',
|
|
||||||
command: `/${name.replace('.claude/skills/', '')}`,
|
|
||||||
skill_path: skillPath,
|
|
||||||
expected_artifacts: [],
|
|
||||||
success_criteria: ''
|
|
||||||
};
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
// Format 4: Natural language → semantic decomposition
|
|
||||||
else {
|
|
||||||
inputFormat = 'natural-language';
|
|
||||||
naturalLanguageInput = args.replace(/--\w+\s+"[^"]*"/g, '').replace(/--\w+\s+\S+/g, '').replace(/-y|--yes/g, '').trim();
|
|
||||||
|
|
||||||
// ★ 4a: Detect file paths in input (Windows absolute, Unix absolute, or relative paths)
|
|
||||||
const filePathPattern = /(?:[A-Za-z]:[\\\/][^\s,;,;、]+|\/[^\s,;,;、]+\.(?:md|txt|json|yaml|yml|toml)|\.\/?[^\s,;,;、]+\.(?:md|txt|json|yaml|yml|toml))/g;
|
|
||||||
const detectedPaths = naturalLanguageInput.match(filePathPattern) || [];
|
|
||||||
referenceDocContent = null;
|
|
||||||
referenceDocPath = null;
|
|
||||||
|
|
||||||
if (detectedPaths.length > 0) {
|
|
||||||
// Read first detected file as reference document
|
|
||||||
referenceDocPath = detectedPaths[0];
|
|
||||||
try {
|
|
||||||
referenceDocContent = Read(referenceDocPath);
|
|
||||||
// Remove file path from natural language input to get pure user intent
|
|
||||||
naturalLanguageInput = naturalLanguageInput.replace(referenceDocPath, '').trim();
|
|
||||||
} catch (e) {
|
|
||||||
// File not readable — fall through to 4b (pure intent mode)
|
|
||||||
referenceDocContent = null;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Steps will be populated in Step 1.1b
|
|
||||||
steps = [];
|
|
||||||
}
|
|
||||||
|
|
||||||
// Parse --context
|
|
||||||
const contextMatch = args.match(/--context\s+"([^"]+)"/);
|
|
||||||
workflowContext = contextMatch ? contextMatch[1] : workflowContext;
|
|
||||||
|
|
||||||
// Parse --depth
|
|
||||||
const depthMatch = args.match(/--depth\s+(quick|standard|deep)/);
|
|
||||||
if (depthMatch) {
|
|
||||||
workflowPreferences.analysisDepth = depthMatch[1];
|
|
||||||
}
|
|
||||||
|
|
||||||
// If no context provided, ask user
|
|
||||||
if (!workflowContext) {
|
|
||||||
const response = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "请描述这个 workflow 的目标和预期效果:",
|
|
||||||
header: "Workflow Context",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "General quality check", description: "通用质量检查,评估步骤间衔接" },
|
|
||||||
{ label: "Custom description", description: "自定义描述 workflow 目标" }
|
|
||||||
]
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
workflowContext = response["Workflow Context"];
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.1b: Semantic Decomposition (Format 4 only)
|
|
||||||
|
|
||||||
> Skip this step if `inputFormat !== 'natural-language'`.
|
|
||||||
|
|
||||||
Two sub-modes based on whether a reference document was detected in Step 1.1:
|
|
||||||
|
|
||||||
#### Mode 4a: Reference Document → LLM Extraction
|
|
||||||
|
|
||||||
When `referenceDocContent` is available, use Gemini to extract executable workflow steps from the document, guided by the user's intent text.
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
if (inputFormat === 'natural-language' && referenceDocContent) {
|
|
||||||
// ★ 4a: Extract workflow steps from reference document via LLM
|
|
||||||
const extractPrompt = `PURPOSE: Extract an executable workflow step chain from the reference document below, guided by the user's intent.
|
|
||||||
|
|
||||||
USER INTENT: ${naturalLanguageInput}
|
|
||||||
REFERENCE DOCUMENT PATH: ${referenceDocPath}
|
|
||||||
|
|
||||||
REFERENCE DOCUMENT CONTENT:
|
|
||||||
${referenceDocContent}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
1. Read the document and identify the workflow/process it describes (commands, steps, phases, procedures)
|
|
||||||
2. Filter by user intent — only extract the steps the user wants to test/tune
|
|
||||||
3. For each step, determine:
|
|
||||||
- The actual command to execute (shell command, CLI invocation, or skill name)
|
|
||||||
- The execution order and dependencies
|
|
||||||
- What tool to use (gemini/claude/codex/qwen) and mode (default: write)
|
|
||||||
4. Generate a step chain that can be directly executed
|
|
||||||
|
|
||||||
IMPORTANT:
|
|
||||||
- Extract REAL executable commands from the document, not analysis tasks about the document
|
|
||||||
- The user wants to RUN these workflow steps, not analyze the document itself
|
|
||||||
- If the document describes CLI commands like "maestro init", "maestro plan", etc., those are the steps to extract
|
|
||||||
- Preserve the original command syntax from the document
|
|
||||||
- Map each command to appropriate tool/mode for ccw cli execution, OR mark as 'command' type for direct shell execution
|
|
||||||
- Default mode to "write" — almost all steps produce output artifacts (files, reports, configs), even analysis steps need write permission to save results
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"workflow_name": "<descriptive name>",
|
|
||||||
"workflow_context": "<what this workflow achieves>",
|
|
||||||
"steps": [
|
|
||||||
{
|
|
||||||
"name": "<step name>",
|
|
||||||
"type": "command|ccw-cli|skill",
|
|
||||||
"command": "<the actual command to execute>",
|
|
||||||
"tool": "<gemini|claude|codex|qwen or null for shell commands>",
|
|
||||||
"mode": "<write (default) | analysis (read-only, rare) | null for shell commands>",
|
|
||||||
"rule": "<rule template or null>",
|
|
||||||
"original_text": "<source text from document>",
|
|
||||||
"expected_artifacts": ["<expected output files>"],
|
|
||||||
"success_criteria": "<what success looks like>"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Output ONLY valid JSON. Extract executable steps, NOT document analysis tasks.`;
|
|
||||||
|
|
||||||
Bash({
|
|
||||||
command: `ccw cli -p "${escapeForShell(extractPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
|
|
||||||
run_in_background: true,
|
|
||||||
timeout: 300000
|
|
||||||
});
|
|
||||||
|
|
||||||
// STOP — wait for hook callback
|
|
||||||
// After callback: parse JSON response into steps[]
|
|
||||||
|
|
||||||
const extractOutput = /* CLI output from callback */;
|
|
||||||
const extractJsonMatch = extractOutput.match(/\{[\s\S]*\}/);
|
|
||||||
|
|
||||||
if (extractJsonMatch) {
|
|
||||||
try {
|
|
||||||
const extracted = JSON.parse(extractJsonMatch[0]);
|
|
||||||
workflowName = extracted.workflow_name || 'doc-workflow';
|
|
||||||
workflowContext = extracted.workflow_context || naturalLanguageInput;
|
|
||||||
steps = (extracted.steps || []).map((s, i) => ({
|
|
||||||
name: s.name || `step-${i + 1}`,
|
|
||||||
type: s.type || 'command',
|
|
||||||
command: s.command,
|
|
||||||
tool: s.tool || null,
|
|
||||||
mode: s.mode || null,
|
|
||||||
rule: s.rule || null,
|
|
||||||
original_text: s.original_text || '',
|
|
||||||
expected_artifacts: s.expected_artifacts || [],
|
|
||||||
success_criteria: s.success_criteria || ''
|
|
||||||
}));
|
|
||||||
} catch (e) {
|
|
||||||
// JSON parse failed — fall through to 4b intent matching
|
|
||||||
referenceDocContent = null;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if (steps.length === 0) {
|
|
||||||
// Extraction produced no steps — fall through to 4b
|
|
||||||
referenceDocContent = null;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Mode 4b: Pure Intent → Regex Matching
|
|
||||||
|
|
||||||
When no reference document, or when 4a extraction failed, decompose by intent-verb matching.
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
if (inputFormat === 'natural-language' && !referenceDocContent) {
|
|
||||||
// Intent-to-tool mapping (regex patterns → tool config)
|
|
||||||
const intentMap = [
|
|
||||||
{ pattern: /分析|analyze|审查|inspect|scan/i, name: 'analyze', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
|
|
||||||
{ pattern: /评审|review|code.?review/i, name: 'review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
|
|
||||||
{ pattern: /诊断|debug|排查|diagnose/i, name: 'diagnose', tool: 'gemini', mode: 'analysis', rule: 'analysis-diagnose-bug-root-cause' },
|
|
||||||
{ pattern: /安全|security|漏洞|vulnerability/i, name: 'security-audit', tool: 'gemini', mode: 'analysis', rule: 'analysis-assess-security-risks' },
|
|
||||||
{ pattern: /性能|performance|perf/i, name: 'perf-analysis', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-performance' },
|
|
||||||
{ pattern: /架构|architecture/i, name: 'arch-review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-architecture' },
|
|
||||||
{ pattern: /修复|fix|repair|解决/i, name: 'fix', tool: 'claude', mode: 'write', rule: 'development-debug-runtime-issues' },
|
|
||||||
{ pattern: /实现|implement|开发|create|新增/i, name: 'implement', tool: 'claude', mode: 'write', rule: 'development-implement-feature' },
|
|
||||||
{ pattern: /重构|refactor/i, name: 'refactor', tool: 'claude', mode: 'write', rule: 'development-refactor-codebase' },
|
|
||||||
{ pattern: /测试|test|generate.?test/i, name: 'test', tool: 'claude', mode: 'write', rule: 'development-generate-tests' },
|
|
||||||
{ pattern: /规划|plan|设计|design/i, name: 'plan', tool: 'gemini', mode: 'analysis', rule: 'planning-plan-architecture-design' },
|
|
||||||
{ pattern: /拆解|breakdown|分解/i, name: 'breakdown', tool: 'gemini', mode: 'analysis', rule: 'planning-breakdown-task-steps' },
|
|
||||||
];
|
|
||||||
|
|
||||||
// Segment input by Chinese/English delimiters: 、,,;然后/接着/最后/之后 etc.
|
|
||||||
const segments = naturalLanguageInput
|
|
||||||
.split(/[,,;;、]|(?:然后|接着|之后|最后|再|并|and then|then|finally|next)\s*/i)
|
|
||||||
.map(s => s.trim())
|
|
||||||
.filter(Boolean);
|
|
||||||
|
|
||||||
// Match each segment to an intent (with ambiguity resolution)
|
|
||||||
steps = segments.map((segment, i) => {
|
|
||||||
const allMatches = intentMap.filter(m => m.pattern.test(segment));
|
|
||||||
let matched = allMatches[0] || null;
|
|
||||||
|
|
||||||
// ★ Ambiguity resolution: if multiple intents match, ask user
|
|
||||||
if (allMatches.length > 1) {
|
|
||||||
const disambig = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: `"${segment}" 匹配到多个意图,请选择最符合的:`,
|
|
||||||
header: `Disambiguate Step ${i + 1}`,
|
|
||||||
multiSelect: false,
|
|
||||||
options: allMatches.map(m => ({
|
|
||||||
label: m.name,
|
|
||||||
description: `Tool: ${m.tool}, Mode: ${m.mode}, Rule: ${m.rule}`
|
|
||||||
}))
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
const chosen = disambig[`Disambiguate Step ${i + 1}`];
|
|
||||||
matched = allMatches.find(m => m.name === chosen) || allMatches[0];
|
|
||||||
}
|
|
||||||
|
|
||||||
if (matched) {
|
|
||||||
// Extract target scope from segment (e.g., "分析 src 目录" → scope = "src")
|
|
||||||
const scopeMatch = segment.match(/(?:目录|文件|模块|directory|file|module)?\s*[::]?\s*(\S+)/);
|
|
||||||
const scope = scopeMatch ? scopeMatch[1].replace(/[的地得]$/, '') : '**/*';
|
|
||||||
|
|
||||||
return {
|
|
||||||
name: `${matched.name}`,
|
|
||||||
type: 'ccw-cli',
|
|
||||||
command: `ccw cli -p "${segment}" --tool ${matched.tool} --mode ${matched.mode} --rule ${matched.rule}`,
|
|
||||||
tool: matched.tool,
|
|
||||||
mode: matched.mode,
|
|
||||||
rule: matched.rule,
|
|
||||||
original_text: segment,
|
|
||||||
expected_artifacts: [],
|
|
||||||
success_criteria: ''
|
|
||||||
};
|
|
||||||
} else {
|
|
||||||
// Unmatched segment → generic analysis step
|
|
||||||
return {
|
|
||||||
name: `step-${i + 1}`,
|
|
||||||
type: 'ccw-cli',
|
|
||||||
command: `ccw cli -p "${segment}" --tool gemini --mode analysis`,
|
|
||||||
tool: 'gemini',
|
|
||||||
mode: 'analysis',
|
|
||||||
rule: 'universal-rigorous-style',
|
|
||||||
original_text: segment,
|
|
||||||
expected_artifacts: [],
|
|
||||||
success_criteria: ''
|
|
||||||
};
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
// Deduplicate: if same intent name appears twice, suffix with index
|
|
||||||
const nameCount = {};
|
|
||||||
steps.forEach(s => {
|
|
||||||
nameCount[s.name] = (nameCount[s.name] || 0) + 1;
|
|
||||||
if (nameCount[s.name] > 1) {
|
|
||||||
s.name = `${s.name}-${nameCount[s.name]}`;
|
|
||||||
}
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
// Common: set workflow context and name for Format 4
|
|
||||||
if (inputFormat === 'natural-language') {
|
|
||||||
if (!workflowContext) {
|
|
||||||
workflowContext = naturalLanguageInput;
|
|
||||||
}
|
|
||||||
if (!workflowName || workflowName === 'unnamed-workflow') {
|
|
||||||
workflowName = referenceDocPath ? 'doc-workflow' : 'nl-workflow';
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.1c: Generate Command Document
|
|
||||||
|
|
||||||
Generate a formatted execution plan for user review. This runs for ALL input formats, not just Format 4.
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
function generateCommandDoc(steps, workflowName, workflowContext, analysisDepth) {
|
|
||||||
const stepTable = steps.map((s, i) => {
|
|
||||||
const tool = s.tool || (s.type === 'skill' ? '-' : 'claude');
|
|
||||||
const mode = s.mode || (s.type === 'skill' ? '-' : 'write');
|
|
||||||
const cmdPreview = s.command.length > 60 ? s.command.substring(0, 57) + '...' : s.command;
|
|
||||||
return `| ${i + 1} | ${s.name} | ${s.type} | \`${cmdPreview}\` | ${tool} | ${mode} |`;
|
|
||||||
}).join('\n');
|
|
||||||
|
|
||||||
const flowDiagram = steps.map((s, i) => {
|
|
||||||
const arrow = i < steps.length - 1 ? '\n ↓' : '';
|
|
||||||
const feedsInto = i < steps.length - 1 ? `Feeds into: Step ${i + 2} (${steps[i + 1].name})` : 'Final step';
|
|
||||||
const originalText = s.original_text ? `\n Source: "${s.original_text}"` : '';
|
|
||||||
return `Step ${i + 1}: ${s.name}
|
|
||||||
Command: ${s.command}
|
|
||||||
Type: ${s.type} | Tool: ${s.tool || '-'} | Mode: ${s.mode || '-'}${originalText}
|
|
||||||
${feedsInto}${arrow}`;
|
|
||||||
}).join('\n');
|
|
||||||
|
|
||||||
const totalCli = steps.filter(s => s.type === 'ccw-cli').length;
|
|
||||||
const totalSkill = steps.filter(s => s.type === 'skill').length;
|
|
||||||
const totalCmd = steps.filter(s => s.type === 'command').length;
|
|
||||||
|
|
||||||
return `# Workflow Tune — Execution Plan
|
|
||||||
|
|
||||||
**Workflow**: ${workflowName}
|
|
||||||
**Goal**: ${workflowContext}
|
|
||||||
**Steps**: ${steps.length}
|
|
||||||
**Analysis Depth**: ${analysisDepth}
|
|
||||||
|
|
||||||
## Step Chain
|
|
||||||
|
|
||||||
| # | Name | Type | Command | Tool | Mode |
|
|
||||||
|---|------|------|---------|------|------|
|
|
||||||
${stepTable}
|
|
||||||
|
|
||||||
## Execution Flow
|
|
||||||
|
|
||||||
\`\`\`
|
|
||||||
${flowDiagram}
|
|
||||||
\`\`\`
|
|
||||||
|
|
||||||
## Estimated Scope
|
|
||||||
|
|
||||||
- CLI execute calls: ${totalCli}
|
|
||||||
- Skill invocations: ${totalSkill}
|
|
||||||
- Shell commands: ${totalCmd}
|
|
||||||
- Analysis calls (gemini --resume chain): ${steps.length} (per-step) + 1 (synthesis)
|
|
||||||
- Process documentation: process-log.md (accumulated)
|
|
||||||
- Final output: final-report.md with optimization recommendations
|
|
||||||
`;
|
|
||||||
}
|
|
||||||
|
|
||||||
const commandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
|
|
||||||
|
|
||||||
// Output command document to user (direct text output)
|
|
||||||
// The orchestrator displays this as formatted text before confirmation
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.1d: Pre-Execution Confirmation
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// ★ Skip confirmation if -y/--yes auto mode
|
|
||||||
if (!workflowPreferences.autoYes) {
|
|
||||||
// Display commandDoc to user as formatted text output
|
|
||||||
// Then ask for confirmation
|
|
||||||
|
|
||||||
const confirmation = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "确认执行以上 Workflow 调优计划?",
|
|
||||||
header: "Confirm Execution",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "Execute (确认执行)", description: "按计划开始执行所有步骤" },
|
|
||||||
{ label: "Edit steps (修改步骤)", description: "调整步骤顺序、增删步骤、更换工具" },
|
|
||||||
{ label: "Cancel (取消)", description: "取消本次调优" }
|
|
||||||
]
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
|
|
||||||
const choice = confirmation["Confirm Execution"];
|
|
||||||
|
|
||||||
if (choice.startsWith("Cancel")) {
|
|
||||||
// Abort: no workspace created, no state written
|
|
||||||
// Output: "Workflow tune cancelled."
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (choice.startsWith("Edit")) {
|
|
||||||
// Enter edit loop: ask user what to change, apply, re-display, re-confirm
|
|
||||||
let editing = true;
|
|
||||||
while (editing) {
|
|
||||||
const editResponse = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "请描述要修改的内容:\n" +
|
|
||||||
" - 删除步骤: '删除步骤2' 或 'remove step 2'\n" +
|
|
||||||
" - 添加步骤: '在步骤1后加入安全扫描' 或 'add security scan after step 1'\n" +
|
|
||||||
" - 修改工具: '步骤3改用codex' 或 'step 3 use codex'\n" +
|
|
||||||
" - 调换顺序: '步骤2和步骤3互换' 或 'swap step 2 and 3'\n" +
|
|
||||||
" - 修改命令: '步骤1命令改为 ccw cli -p \"...\" --tool gemini'",
|
|
||||||
header: "Edit Steps"
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
|
|
||||||
const editText = editResponse["Edit Steps"];
|
|
||||||
|
|
||||||
// Apply edits to steps[] based on user instruction
|
|
||||||
// The orchestrator interprets the edit instruction and modifies steps:
|
|
||||||
//
|
|
||||||
// Delete: filter out the specified step, re-index
|
|
||||||
// Add: insert new step at specified position
|
|
||||||
// Modify tool: update the step's tool/mode/command
|
|
||||||
// Swap: exchange positions of two steps
|
|
||||||
// Modify command: replace command string
|
|
||||||
//
|
|
||||||
// After applying edits, re-generate command doc and re-display
|
|
||||||
|
|
||||||
const updatedCommandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
|
|
||||||
// Display updatedCommandDoc to user
|
|
||||||
|
|
||||||
const reconfirm = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: "修改后的计划如上,是否确认?",
|
|
||||||
header: "Confirm Execution",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "Execute (确认执行)", description: "按修改后的计划执行" },
|
|
||||||
{ label: "Edit more (继续修改)", description: "还需要调整" },
|
|
||||||
{ label: "Cancel (取消)", description: "取消本次调优" }
|
|
||||||
]
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
|
|
||||||
const reChoice = reconfirm["Confirm Execution"];
|
|
||||||
if (reChoice.startsWith("Execute")) {
|
|
||||||
editing = false;
|
|
||||||
} else if (reChoice.startsWith("Cancel")) {
|
|
||||||
return; // Abort
|
|
||||||
}
|
|
||||||
// else: continue editing loop
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// choice === "Execute" → proceed to workspace creation
|
|
||||||
}
|
|
||||||
|
|
||||||
// Save command doc for reference
|
|
||||||
// Will be written to workspace after Step 1.3
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.2: Validate Steps
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
for (const step of steps) {
|
|
||||||
if (step.type === 'skill' && step.skill_path) {
|
|
||||||
const skillFiles = Glob(`${step.skill_path}/SKILL.md`);
|
|
||||||
if (skillFiles.length === 0) {
|
|
||||||
step.validation = 'warning';
|
|
||||||
step.validation_msg = `Skill not found: ${step.skill_path}`;
|
|
||||||
} else {
|
|
||||||
step.validation = 'ok';
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
// Command-type steps: basic validation (non-empty)
|
|
||||||
step.validation = step.command && step.command.trim() ? 'ok' : 'invalid';
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const invalidSteps = steps.filter(s => s.validation === 'invalid');
|
|
||||||
if (invalidSteps.length > 0) {
|
|
||||||
throw new Error(`Invalid steps: ${invalidSteps.map(s => s.name).join(', ')}`);
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.2b: Generate Test Requirements (Acceptance Criteria)
|
|
||||||
|
|
||||||
> 调优的前提:为每一步生成跟任务匹配的验收标准。没有预期基准,就无法判断命令执行是否达标。
|
|
||||||
|
|
||||||
用 Gemini 根据 step command + workflow context + 上下游关系,自动推断每步的验收标准。
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Build step chain description for context
|
|
||||||
const stepChainDesc = steps.map((s, i) =>
|
|
||||||
`Step ${i + 1}: ${s.name} (${s.type}) — ${s.command}`
|
|
||||||
).join('\n');
|
|
||||||
|
|
||||||
const reqGenPrompt = `PURPOSE: Generate concrete acceptance criteria (test requirements) for each step in a workflow pipeline. These criteria will be used to objectively judge whether each step's execution succeeded or failed.
|
|
||||||
|
|
||||||
WORKFLOW:
|
|
||||||
Name: ${workflowName}
|
|
||||||
Goal: ${workflowContext}
|
|
||||||
|
|
||||||
STEP CHAIN:
|
|
||||||
${stepChainDesc}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
For each step, generate:
|
|
||||||
1. **expected_outputs** — what files/artifacts should be produced (specific filenames or patterns)
|
|
||||||
2. **content_signals** — what content patterns indicate success (keywords, structures, data shapes)
|
|
||||||
3. **quality_thresholds** — minimum quality bar (e.g., "no empty files", "JSON must be parseable", "must contain at least N items")
|
|
||||||
4. **pass_criteria** — 1-2 sentence description of what "pass" looks like for this step
|
|
||||||
5. **fail_signals** — what patterns indicate failure (error messages, empty output, wrong format)
|
|
||||||
6. **handoff_contract** — what this step must provide for the next step to work (data format, required fields)
|
|
||||||
|
|
||||||
CONTEXT RULES:
|
|
||||||
- Infer from the command what the step is supposed to do
|
|
||||||
- Consider workflow goal when judging what "good enough" means
|
|
||||||
- Each step's handoff_contract should match what the next step needs as input
|
|
||||||
- Be specific: "report.md with ## Summary section" not "a report file"
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"step_requirements": [
|
|
||||||
{
|
|
||||||
"step_index": 0,
|
|
||||||
"step_name": "<name>",
|
|
||||||
"expected_outputs": ["<file or pattern>"],
|
|
||||||
"content_signals": ["<keyword or pattern that indicates success>"],
|
|
||||||
"quality_thresholds": ["<minimum bar>"],
|
|
||||||
"pass_criteria": "<what pass looks like>",
|
|
||||||
"fail_signals": ["<pattern that indicates failure>"],
|
|
||||||
"handoff_contract": "<what next step needs from this step>"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Be specific to each command, output ONLY JSON`;
|
|
||||||
|
|
||||||
Bash({
|
|
||||||
command: `ccw cli -p "${escapeForShell(reqGenPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
|
|
||||||
run_in_background: true,
|
|
||||||
timeout: 300000
|
|
||||||
});
|
|
||||||
|
|
||||||
// STOP — wait for hook callback
|
|
||||||
// After callback: parse JSON, attach requirements to each step
|
|
||||||
|
|
||||||
const reqOutput = /* CLI output from callback */;
|
|
||||||
const reqJsonMatch = reqOutput.match(/\{[\s\S]*\}/);
|
|
||||||
|
|
||||||
if (reqJsonMatch) {
|
|
||||||
try {
|
|
||||||
const reqData = JSON.parse(reqJsonMatch[0]);
|
|
||||||
(reqData.step_requirements || []).forEach(req => {
|
|
||||||
const idx = req.step_index;
|
|
||||||
if (idx >= 0 && idx < steps.length) {
|
|
||||||
steps[idx].test_requirements = {
|
|
||||||
expected_outputs: req.expected_outputs || [],
|
|
||||||
content_signals: req.content_signals || [],
|
|
||||||
quality_thresholds: req.quality_thresholds || [],
|
|
||||||
pass_criteria: req.pass_criteria || '',
|
|
||||||
fail_signals: req.fail_signals || [],
|
|
||||||
handoff_contract: req.handoff_contract || ''
|
|
||||||
};
|
|
||||||
}
|
|
||||||
});
|
|
||||||
} catch (e) {
|
|
||||||
// Fallback: proceed without generated requirements
|
|
||||||
// Steps will use any manually provided success_criteria
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Capture session ID for resume chain start
|
|
||||||
const reqSessionMatch = reqOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
|
|
||||||
const reqSessionId = reqSessionMatch ? reqSessionMatch[1] : null;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.3: Create Workspace
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const ts = Date.now();
|
|
||||||
const workDir = `.workflow/.scratchpad/workflow-tune-${ts}`;
|
|
||||||
|
|
||||||
Bash(`mkdir -p "${workDir}/steps"`);
|
|
||||||
|
|
||||||
// Create per-step directories
|
|
||||||
for (let i = 0; i < steps.length; i++) {
|
|
||||||
Bash(`mkdir -p "${workDir}/steps/step-${i + 1}/artifacts"`);
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.3b: Save Command Document
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Save confirmed command doc to workspace for reference
|
|
||||||
Write(`${workDir}/command-doc.md`, commandDoc);
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.4: Initialize State
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const initialState = {
|
|
||||||
status: 'running',
|
|
||||||
started_at: new Date().toISOString(),
|
|
||||||
updated_at: new Date().toISOString(),
|
|
||||||
workflow_name: workflowName,
|
|
||||||
workflow_context: workflowContext,
|
|
||||||
analysis_depth: workflowPreferences.analysisDepth,
|
|
||||||
auto_fix: workflowPreferences.autoFix,
|
|
||||||
steps: steps.map((s, i) => ({
|
|
||||||
...s,
|
|
||||||
index: i,
|
|
||||||
status: 'pending',
|
|
||||||
execution: null,
|
|
||||||
analysis: null,
|
|
||||||
test_requirements: s.test_requirements || null // from Step 1.2b
|
|
||||||
})),
|
|
||||||
analysis_session_id: reqSessionId || null, // resume chain starts from requirements generation
|
|
||||||
process_log_entries: [],
|
|
||||||
synthesis: null,
|
|
||||||
errors: [],
|
|
||||||
error_count: 0,
|
|
||||||
max_errors: 3,
|
|
||||||
work_dir: workDir
|
|
||||||
};
|
|
||||||
|
|
||||||
Write(`${workDir}/workflow-state.json`, JSON.stringify(initialState, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1.5: Initialize Process Log
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const processLog = `# Workflow Tune Process Log
|
|
||||||
|
|
||||||
**Workflow**: ${workflowName}
|
|
||||||
**Context**: ${workflowContext}
|
|
||||||
**Steps**: ${steps.length}
|
|
||||||
**Analysis Depth**: ${workflowPreferences.analysisDepth}
|
|
||||||
**Started**: ${new Date().toISOString()}
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
`;
|
|
||||||
|
|
||||||
Write(`${workDir}/process-log.md`, processLog);
|
|
||||||
```
|
|
||||||
|
|
||||||
## Output
|
|
||||||
|
|
||||||
- **Variables**: `workDir`, `steps[]`, `workflowContext`, `commandDoc`, initialized state
|
|
||||||
- **Files**: `workflow-state.json`, `process-log.md`, `command-doc.md`, per-step directories
|
|
||||||
- **User Confirmation**: Execution plan confirmed (or cancelled → abort)
|
|
||||||
- **TaskUpdate**: Mark Phase 1 completed, start Step Loop
|
|
||||||
@@ -1,197 +0,0 @@
|
|||||||
# Phase 2: Execute Step
|
|
||||||
|
|
||||||
> **COMPACT SENTINEL [Phase 2: Execute Step]**
|
|
||||||
> This phase contains 4 execution steps (Step 2.1 -- 2.4).
|
|
||||||
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
|
|
||||||
> Recovery: `Read("phases/02-step-execute.md")`
|
|
||||||
|
|
||||||
Execute a single workflow step and collect its output artifacts.
|
|
||||||
|
|
||||||
## Objective
|
|
||||||
|
|
||||||
- Determine step execution method (skill invoke / ccw cli / shell command)
|
|
||||||
- Execute step with appropriate tool
|
|
||||||
- Collect output artifacts into step directory
|
|
||||||
- Write artifacts manifest
|
|
||||||
|
|
||||||
## Execution
|
|
||||||
|
|
||||||
### Step 2.1: Prepare Step Directory
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const stepIdx = currentStepIndex; // from orchestrator loop
|
|
||||||
const step = state.steps[stepIdx];
|
|
||||||
const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
|
|
||||||
const artifactsDir = `${stepDir}/artifacts`;
|
|
||||||
|
|
||||||
// Capture pre-execution state (git status, file timestamps)
|
|
||||||
const preGitStatus = Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout;
|
|
||||||
|
|
||||||
// ★ Warn if dirty git working directory (first step only)
|
|
||||||
if (stepIdx === 0 && preGitStatus.trim() && preGitStatus.trim() !== 'not a git repo') {
|
|
||||||
const dirtyLines = preGitStatus.trim().split('\n').length;
|
|
||||||
// Log warning — artifact collection via git diff may be unreliable
|
|
||||||
// This is informational; does not block execution
|
|
||||||
console.warn(`⚠ Dirty git working directory detected (${dirtyLines} changed files). Artifact collection via git diff may include pre-existing changes.`);
|
|
||||||
}
|
|
||||||
|
|
||||||
const preExecSnapshot = {
|
|
||||||
timestamp: new Date().toISOString(),
|
|
||||||
git_status: preGitStatus,
|
|
||||||
working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50) // sample
|
|
||||||
};
|
|
||||||
Write(`${stepDir}/pre-exec-snapshot.json`, JSON.stringify(preExecSnapshot, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2.2: Execute by Step Type
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
let executionResult = { success: false, method: '', output: '', duration: 0 };
|
|
||||||
const startTime = Date.now();
|
|
||||||
|
|
||||||
switch (step.type) {
|
|
||||||
case 'skill': {
|
|
||||||
// Skill invocation — use Skill tool
|
|
||||||
// Extract skill name and arguments from command
|
|
||||||
const skillCmd = step.command.replace(/^\//, '');
|
|
||||||
const [skillName, ...skillArgs] = skillCmd.split(/\s+/);
|
|
||||||
|
|
||||||
// Execute skill (this runs synchronously within current context)
|
|
||||||
// Note: Skill execution produces artifacts in the working directory
|
|
||||||
// We capture changes by comparing pre/post state
|
|
||||||
Skill({
|
|
||||||
name: skillName,
|
|
||||||
arguments: skillArgs.join(' ')
|
|
||||||
});
|
|
||||||
|
|
||||||
executionResult.method = 'skill';
|
|
||||||
executionResult.success = true;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
case 'ccw-cli': {
|
|
||||||
// Direct ccw cli command
|
|
||||||
const cliCommand = step.command;
|
|
||||||
|
|
||||||
Bash({
|
|
||||||
command: cliCommand,
|
|
||||||
run_in_background: true,
|
|
||||||
timeout: 600000 // 10 minutes
|
|
||||||
});
|
|
||||||
|
|
||||||
// STOP — wait for hook callback
|
|
||||||
// After callback:
|
|
||||||
executionResult.method = 'ccw-cli';
|
|
||||||
executionResult.success = true;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
case 'command': {
|
|
||||||
// Generic shell command
|
|
||||||
const result = Bash({
|
|
||||||
command: step.command,
|
|
||||||
timeout: 300000 // 5 minutes
|
|
||||||
});
|
|
||||||
|
|
||||||
executionResult.method = 'command';
|
|
||||||
executionResult.output = result.stdout || '';
|
|
||||||
executionResult.success = result.exitCode === 0;
|
|
||||||
|
|
||||||
// Save command output
|
|
||||||
if (executionResult.output) {
|
|
||||||
Write(`${artifactsDir}/command-output.txt`, executionResult.output);
|
|
||||||
}
|
|
||||||
if (result.stderr) {
|
|
||||||
Write(`${artifactsDir}/command-stderr.txt`, result.stderr);
|
|
||||||
}
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
executionResult.duration = Date.now() - startTime;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2.3: Collect Artifacts
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Capture post-execution state
|
|
||||||
const postExecSnapshot = {
|
|
||||||
timestamp: new Date().toISOString(),
|
|
||||||
git_status: Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout,
|
|
||||||
working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50)
|
|
||||||
};
|
|
||||||
|
|
||||||
// Detect changed/new files by comparing snapshots
|
|
||||||
const preFiles = new Set(preExecSnapshot.working_files);
|
|
||||||
const newOrChanged = postExecSnapshot.working_files.filter(f => !preFiles.has(f));
|
|
||||||
|
|
||||||
// Also check git diff for modified files
|
|
||||||
const gitDiff = Bash('git diff --name-only 2>/dev/null || true').stdout.trim().split('\n').filter(Boolean);
|
|
||||||
|
|
||||||
// Collect all artifacts (new files + git-changed files + declared expected_artifacts)
|
|
||||||
// Expand each declared expected_artifacts glob pattern once.
// flatMap alone suffices: a pattern with no matches contributes an empty
// array and simply disappears, so the previous existence pre-filter
// (which called Glob(f) a second time per pattern) was redundant work.
const declaredArtifacts = (step.expected_artifacts || []).flatMap(f => Glob(f));
|
|
||||||
|
|
||||||
const allArtifacts = [...new Set([...newOrChanged, ...gitDiff, ...declaredArtifacts])];
|
|
||||||
|
|
||||||
// Copy detected artifacts to step artifacts dir (or record references)
|
|
||||||
const artifactManifest = {
|
|
||||||
step: step.name,
|
|
||||||
step_index: stepIdx,
|
|
||||||
execution_method: executionResult.method,
|
|
||||||
success: executionResult.success,
|
|
||||||
duration_ms: executionResult.duration,
|
|
||||||
artifacts: allArtifacts.map(f => ({
|
|
||||||
path: f,
|
|
||||||
type: f.endsWith('.md') ? 'markdown' : f.endsWith('.json') ? 'json' : 'other',
|
|
||||||
size: 'unknown' // Can be filled by stat if needed
|
|
||||||
})),
|
|
||||||
// For skill type: also check .workflow/.scratchpad for generated files
|
|
||||||
scratchpad_files: step.type === 'skill'
|
|
||||||
? Glob('.workflow/.scratchpad/**/*').filter(f => {
|
|
||||||
// Only include files created after step started
|
|
||||||
return true; // Heuristic: include recent scratchpad files
|
|
||||||
}).slice(0, 20)
|
|
||||||
: [],
|
|
||||||
collected_at: new Date().toISOString()
|
|
||||||
};
|
|
||||||
|
|
||||||
Write(`${stepDir}/artifacts-manifest.json`, JSON.stringify(artifactManifest, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2.4: Update State
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
state.steps[stepIdx].status = 'executed';
|
|
||||||
state.steps[stepIdx].execution = {
|
|
||||||
method: executionResult.method,
|
|
||||||
success: executionResult.success,
|
|
||||||
duration_ms: executionResult.duration,
|
|
||||||
artifacts_dir: artifactsDir,
|
|
||||||
manifest_path: `${stepDir}/artifacts-manifest.json`,
|
|
||||||
artifact_count: artifactManifest.artifacts.length,
|
|
||||||
started_at: preExecSnapshot.timestamp,
|
|
||||||
completed_at: new Date().toISOString()
|
|
||||||
};
|
|
||||||
|
|
||||||
state.updated_at = new Date().toISOString();
|
|
||||||
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Error | Recovery |
|
|
||||||
|-------|----------|
|
|
||||||
| Skill not found | Record failure, set success=false, continue to Phase 3 |
|
|
||||||
| CLI timeout (10min) | Retry once with shorter timeout, then record failure |
|
|
||||||
| Command exit non-zero | Record stderr, set success=false, continue to Phase 3 |
|
|
||||||
| No artifacts detected | Continue to Phase 3 — analysis evaluates step definition quality |
|
|
||||||
|
|
||||||
## Output
|
|
||||||
|
|
||||||
- **Files**: `pre-exec-snapshot.json`, `artifacts-manifest.json`, `artifacts/` (if command type)
|
|
||||||
- **State**: `steps[stepIdx].execution` updated
|
|
||||||
- **Next**: Phase 3 (Analyze Step)
|
|
||||||
@@ -1,386 +0,0 @@
|
|||||||
# Phase 3: Analyze Step
|
|
||||||
|
|
||||||
> **COMPACT SENTINEL [Phase 3: Analyze Step]**
|
|
||||||
> This phase contains 5 execution steps (Step 3.1 -- 3.5).
|
|
||||||
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
|
|
||||||
> Recovery: `Read("phases/03-step-analyze.md")`
|
|
||||||
|
|
||||||
Analyze a completed step's artifacts and quality using `ccw cli --tool gemini --mode analysis`. Uses `--resume` to maintain context across step analyses, building a continuous analysis chain.
|
|
||||||
|
|
||||||
## Objective
|
|
||||||
|
|
||||||
- Inspect step artifacts (file list, content, quality signals)
|
|
||||||
- Build analysis prompt with step context + prior process log
|
|
||||||
- Execute via ccw cli Gemini with resume chain
|
|
||||||
- Parse analysis results → write step-{N}-analysis.md
|
|
||||||
- Append findings to process-log.md
|
|
||||||
- Return updated session ID for resume chain
|
|
||||||
|
|
||||||
## Execution
|
|
||||||
|
|
||||||
### Step 3.1: Inspect Artifacts
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const stepIdx = currentStepIndex;
|
|
||||||
const step = state.steps[stepIdx];
|
|
||||||
const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
|
|
||||||
|
|
||||||
// Read artifacts manifest
|
|
||||||
const manifest = JSON.parse(Read(`${stepDir}/artifacts-manifest.json`));
|
|
||||||
|
|
||||||
// Build artifact summary based on analysis depth
|
|
||||||
let artifactSummary = '';
|
|
||||||
|
|
||||||
if (state.analysis_depth === 'quick') {
|
|
||||||
// Quick: just file list and sizes
|
|
||||||
artifactSummary = `Artifacts (${manifest.artifacts.length} files):\n` +
|
|
||||||
manifest.artifacts.map(a => `- ${a.path} (${a.type})`).join('\n');
|
|
||||||
} else {
|
|
||||||
// Standard/Deep: include file content summaries
|
|
||||||
artifactSummary = manifest.artifacts.map(a => {
|
|
||||||
const maxLines = state.analysis_depth === 'deep' ? 300 : 150;
|
|
||||||
try {
|
|
||||||
const content = Read(a.path, { limit: maxLines });
|
|
||||||
return `--- ${a.path} (${a.type}) ---\n${content}`;
|
|
||||||
} catch {
|
|
||||||
return `--- ${a.path} --- [unreadable]`;
|
|
||||||
}
|
|
||||||
}).join('\n\n');
|
|
||||||
|
|
||||||
// Deep: also include scratchpad files
|
|
||||||
if (state.analysis_depth === 'deep' && manifest.scratchpad_files?.length > 0) {
|
|
||||||
artifactSummary += '\n\n--- Scratchpad Files ---\n' +
|
|
||||||
manifest.scratchpad_files.slice(0, 5).map(f => {
|
|
||||||
const content = Read(f, { limit: 100 });
|
|
||||||
return `--- ${f} ---\n${content}`;
|
|
||||||
}).join('\n\n');
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Execution result summary
|
|
||||||
const execSummary = `Execution: ${step.execution.method} | ` +
|
|
||||||
`Success: ${step.execution.success} | ` +
|
|
||||||
`Duration: ${step.execution.duration_ms}ms | ` +
|
|
||||||
`Artifacts: ${manifest.artifacts.length} files`;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 3.2: Build Prior Context
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Build accumulated process log context for this analysis
|
|
||||||
const priorProcessLog = Read(`${state.work_dir}/process-log.md`);
|
|
||||||
|
|
||||||
// Build step chain context (what came before, what comes after)
|
|
||||||
const stepChainContext = state.steps.map((s, i) => {
|
|
||||||
const status = i < stepIdx ? 'completed' : i === stepIdx ? 'CURRENT' : 'pending';
|
|
||||||
const score = s.analysis?.quality_score || '-';
|
|
||||||
return `${i + 1}. [${status}] ${s.name} (${s.type}) — Quality: ${score}`;
|
|
||||||
}).join('\n');
|
|
||||||
|
|
||||||
// Previous step handoff context (if not first step)
|
|
||||||
let handoffContext = '';
|
|
||||||
if (stepIdx > 0) {
|
|
||||||
const prevStep = state.steps[stepIdx - 1];
|
|
||||||
const prevAnalysis = prevStep.analysis;
|
|
||||||
if (prevAnalysis) {
|
|
||||||
handoffContext = `PREVIOUS STEP OUTPUT SUMMARY:
|
|
||||||
Step "${prevStep.name}" produced ${prevStep.execution?.artifact_count || 0} artifacts.
|
|
||||||
Quality: ${prevAnalysis.quality_score}/100
|
|
||||||
Key outputs: ${prevAnalysis.key_outputs?.join(', ') || 'unknown'}
|
|
||||||
Handoff notes: ${prevAnalysis.handoff_notes || 'none'}`;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 3.3: Construct Analysis Prompt
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Ref: templates/step-analysis-prompt.md
|
|
||||||
|
|
||||||
const depthInstructions = {
|
|
||||||
quick: 'Provide brief assessment (3-5 bullet points). Focus on: execution success, output completeness, obvious issues.',
|
|
||||||
standard: 'Provide detailed assessment. Cover: execution quality, output completeness, artifact quality, step-to-step handoff readiness, potential issues.',
|
|
||||||
deep: 'Provide exhaustive assessment. Cover: execution quality, output completeness and correctness, artifact quality and structure, step-to-step handoff integrity, error handling, performance signals, architecture implications, edge cases.'
|
|
||||||
};
|
|
||||||
|
|
||||||
// ★ Build test requirements section (the evaluation baseline)
|
|
||||||
const testReqs = step.test_requirements;
|
|
||||||
let testReqSection = '';
|
|
||||||
if (testReqs) {
|
|
||||||
testReqSection = `
|
|
||||||
TEST REQUIREMENTS (Acceptance Criteria — use these as the PRIMARY evaluation baseline):
|
|
||||||
Pass Criteria: ${testReqs.pass_criteria}
|
|
||||||
Expected Outputs: ${(testReqs.expected_outputs || []).join(', ') || 'not specified'}
|
|
||||||
Content Signals (patterns that indicate success): ${(testReqs.content_signals || []).join(', ') || 'not specified'}
|
|
||||||
Quality Thresholds: ${(testReqs.quality_thresholds || []).join(', ') || 'not specified'}
|
|
||||||
Fail Signals (patterns that indicate failure): ${(testReqs.fail_signals || []).join(', ') || 'not specified'}
|
|
||||||
Handoff Contract (what next step needs): ${testReqs.handoff_contract || 'not specified'}
|
|
||||||
|
|
||||||
IMPORTANT: Score quality_score based on how well the actual output matches these test requirements.
|
|
||||||
- 90-100: All pass_criteria met, all expected_outputs present, content_signals found, no fail_signals
|
|
||||||
- 70-89: Most criteria met, minor gaps
|
|
||||||
- 50-69: Partial match, significant gaps
|
|
||||||
- 0-49: Fail — fail_signals present or pass_criteria not met`;
|
|
||||||
} else {
|
|
||||||
testReqSection = `
|
|
||||||
NOTE: No pre-generated test requirements for this step. Evaluate based on general quality signals and workflow context.`;
|
|
||||||
}
|
|
||||||
|
|
||||||
const analysisPrompt = `PURPOSE: Evaluate workflow step "${step.name}" (step ${stepIdx + 1}/${state.steps.length}) against its acceptance criteria. Judge whether the command execution met the pre-defined test requirements.
|
|
||||||
|
|
||||||
WORKFLOW CONTEXT:
|
|
||||||
Name: ${state.workflow_name}
|
|
||||||
Goal: ${state.workflow_context}
|
|
||||||
Step Chain:
|
|
||||||
${stepChainContext}
|
|
||||||
|
|
||||||
CURRENT STEP:
|
|
||||||
Name: ${step.name}
|
|
||||||
Type: ${step.type}
|
|
||||||
Command: ${step.command}
|
|
||||||
${step.success_criteria ? `Success Criteria: ${step.success_criteria}` : ''}
|
|
||||||
${testReqSection}
|
|
||||||
|
|
||||||
EXECUTION RESULT:
|
|
||||||
${execSummary}
|
|
||||||
|
|
||||||
${handoffContext}
|
|
||||||
|
|
||||||
STEP ARTIFACTS:
|
|
||||||
${artifactSummary}
|
|
||||||
|
|
||||||
ANALYSIS DEPTH: ${state.analysis_depth}
|
|
||||||
${depthInstructions[state.analysis_depth]}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
1. **Requirement Matching**: Compare actual output against test requirements (pass_criteria, expected_outputs, content_signals)
|
|
||||||
2. **Fail Signal Detection**: Check for any fail_signals in the output
|
|
||||||
3. **Handoff Contract Verification**: Does the output satisfy handoff_contract for the next step?
|
|
||||||
4. **Gap Analysis**: What's missing between actual output and requirements?
|
|
||||||
5. **Quality Score**: Rate 0-100 based on requirement fulfillment (NOT general quality)
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"quality_score": <0-100>,
|
|
||||||
"requirement_match": {
|
|
||||||
"pass": <true|false>,
|
|
||||||
"criteria_met": ["<which pass_criteria were satisfied>"],
|
|
||||||
"criteria_missed": ["<which pass_criteria were NOT satisfied>"],
|
|
||||||
"expected_outputs_found": ["<expected files that exist>"],
|
|
||||||
"expected_outputs_missing": ["<expected files that are absent>"],
|
|
||||||
"content_signals_found": ["<success patterns detected in output>"],
|
|
||||||
"content_signals_missing": ["<success patterns NOT found>"],
|
|
||||||
"fail_signals_detected": ["<failure patterns found, if any>"]
|
|
||||||
},
|
|
||||||
"execution_assessment": {
|
|
||||||
"success": <true|false>,
|
|
||||||
"completeness": "<complete|partial|failed>",
|
|
||||||
"notes": "<brief assessment>"
|
|
||||||
},
|
|
||||||
"artifact_assessment": {
|
|
||||||
"count": <number>,
|
|
||||||
"quality": "<high|medium|low>",
|
|
||||||
"key_outputs": ["<main output 1>", "<main output 2>"],
|
|
||||||
"missing_outputs": ["<expected but missing>"]
|
|
||||||
},
|
|
||||||
"handoff_assessment": {
|
|
||||||
"ready": <true|false>,
|
|
||||||
"contract_satisfied": <true|false|null>,
|
|
||||||
"next_step_compatible": <true|false|null>,
|
|
||||||
"handoff_notes": "<what next step should know>"
|
|
||||||
},
|
|
||||||
"issues": [
|
|
||||||
{ "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
|
|
||||||
],
|
|
||||||
"optimization_opportunities": [
|
|
||||||
{ "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
|
|
||||||
],
|
|
||||||
"step_summary": "<1-2 sentence summary for process log>"
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Be specific, reference artifact content where possible, score against requirements not general quality, output ONLY JSON`;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 3.4: Execute via ccw cli Gemini with Resume
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Escape a string for safe interpolation inside double quotes in a POSIX
// shell command line.
//
// Backslashes MUST be escaped first — without that rule, input containing
// `\"` became `\\"` (escaped backslash + bare quote), which terminates the
// double-quoted region early and breaks the generated `ccw cli -p "..."`
// command. A trailing `\` similarly swallowed the closing quote.
function escapeForShell(str) {
  return str
    .replace(/\\/g, '\\\\')  // 1. backslash first, so later escapes stay intact
    .replace(/"/g, '\\"')    // 2. double quote — would end the quoted region
    .replace(/\$/g, '\\$')   // 3. `$` — parameter expansion
    .replace(/`/g, '\\`');   // 4. backtick — command substitution
}
|
|
||||||
|
|
||||||
// Build CLI command with optional resume
|
|
||||||
let cliCommand = `ccw cli -p "${escapeForShell(analysisPrompt)}" --tool gemini --mode analysis`;
|
|
||||||
|
|
||||||
// Resume from previous step's analysis session (maintains context chain)
|
|
||||||
if (state.analysis_session_id) {
|
|
||||||
cliCommand += ` --resume ${state.analysis_session_id}`;
|
|
||||||
}
|
|
||||||
|
|
||||||
Bash({
|
|
||||||
command: cliCommand,
|
|
||||||
run_in_background: true,
|
|
||||||
timeout: 300000 // 5 minutes
|
|
||||||
});
|
|
||||||
|
|
||||||
// STOP — wait for hook callback
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 3.5: Parse Results and Update Process Log
|
|
||||||
|
|
||||||
After CLI completes:
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const rawOutput = /* CLI output from callback */;
|
|
||||||
|
|
||||||
// Extract session ID from CLI output for resume chain
|
|
||||||
const sessionIdMatch = rawOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
|
|
||||||
if (sessionIdMatch) {
|
|
||||||
state.analysis_session_id = sessionIdMatch[1];
|
|
||||||
}
|
|
||||||
|
|
||||||
// Parse JSON
|
|
||||||
const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
|
|
||||||
let analysis;
|
|
||||||
|
|
||||||
if (jsonMatch) {
|
|
||||||
try {
|
|
||||||
analysis = JSON.parse(jsonMatch[0]);
|
|
||||||
} catch (e) {
|
|
||||||
// Fallback: extract score heuristically
|
|
||||||
const scoreMatch = rawOutput.match(/"quality_score"\s*:\s*(\d+)/);
|
|
||||||
analysis = {
|
|
||||||
quality_score: scoreMatch ? parseInt(scoreMatch[1]) : 50,
|
|
||||||
execution_assessment: { success: step.execution.success, completeness: 'unknown', notes: 'Parse failed' },
|
|
||||||
artifact_assessment: { count: manifest.artifacts.length, quality: 'unknown', key_outputs: [], missing_outputs: [] },
|
|
||||||
handoff_assessment: { ready: true, next_step_compatible: null, handoff_notes: '' },
|
|
||||||
issues: [{ severity: 'low', description: 'Analysis output parsing failed', suggestion: 'Review raw output' }],
|
|
||||||
optimization_opportunities: [],
|
|
||||||
step_summary: 'Analysis parsing failed — raw output saved for manual review'
|
|
||||||
};
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
analysis = {
|
|
||||||
quality_score: 50,
|
|
||||||
step_summary: 'No structured analysis output received'
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
// Write step analysis file
|
|
||||||
const reqMatch = analysis.requirement_match;
|
|
||||||
const reqMatchSection = reqMatch ? `
|
|
||||||
## Requirement Match — ${reqMatch.pass ? 'PASS ✓' : 'FAIL ✗'}
|
|
||||||
|
|
||||||
### Criteria Met
|
|
||||||
${(reqMatch.criteria_met || []).map(c => `- ✓ ${c}`).join('\n') || '- None'}
|
|
||||||
|
|
||||||
### Criteria Missed
|
|
||||||
${(reqMatch.criteria_missed || []).map(c => `- ✗ ${c}`).join('\n') || '- None'}
|
|
||||||
|
|
||||||
### Expected Outputs
|
|
||||||
- Found: ${(reqMatch.expected_outputs_found || []).join(', ') || 'None'}
|
|
||||||
- Missing: ${(reqMatch.expected_outputs_missing || []).join(', ') || 'None'}
|
|
||||||
|
|
||||||
### Content Signals
|
|
||||||
- Detected: ${(reqMatch.content_signals_found || []).join(', ') || 'None'}
|
|
||||||
- Missing: ${(reqMatch.content_signals_missing || []).join(', ') || 'None'}
|
|
||||||
|
|
||||||
### Fail Signals
|
|
||||||
${(reqMatch.fail_signals_detected || []).length > 0
|
|
||||||
? (reqMatch.fail_signals_detected || []).map(f => `- ⚠ ${f}`).join('\n')
|
|
||||||
: '- None detected'}
|
|
||||||
` : '';
|
|
||||||
|
|
||||||
const stepAnalysisReport = `# Step ${stepIdx + 1} Analysis: ${step.name}
|
|
||||||
|
|
||||||
**Quality Score**: ${analysis.quality_score}/100
|
|
||||||
**Requirement Match**: ${reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A (no test requirements)'}
|
|
||||||
**Date**: ${new Date().toISOString()}
|
|
||||||
${reqMatchSection}
|
|
||||||
## Execution
|
|
||||||
- Success: ${analysis.execution_assessment?.success}
|
|
||||||
- Completeness: ${analysis.execution_assessment?.completeness}
|
|
||||||
- Notes: ${analysis.execution_assessment?.notes}
|
|
||||||
|
|
||||||
## Artifacts
|
|
||||||
- Count: ${analysis.artifact_assessment?.count}
|
|
||||||
- Quality: ${analysis.artifact_assessment?.quality}
|
|
||||||
- Key Outputs: ${analysis.artifact_assessment?.key_outputs?.join(', ') || 'N/A'}
|
|
||||||
- Missing: ${analysis.artifact_assessment?.missing_outputs?.join(', ') || 'None'}
|
|
||||||
|
|
||||||
## Handoff Readiness
|
|
||||||
- Ready: ${analysis.handoff_assessment?.ready}
|
|
||||||
- Contract Satisfied: ${analysis.handoff_assessment?.contract_satisfied}
|
|
||||||
- Next Step Compatible: ${analysis.handoff_assessment?.next_step_compatible}
|
|
||||||
- Notes: ${analysis.handoff_assessment?.handoff_notes}
|
|
||||||
|
|
||||||
## Issues
|
|
||||||
${(analysis.issues || []).map(i => `- [${i.severity}] ${i.description} → ${i.suggestion}`).join('\n') || 'None'}
|
|
||||||
|
|
||||||
## Optimization Opportunities
|
|
||||||
${(analysis.optimization_opportunities || []).map(o => `- [${o.impact}] ${o.area}: ${o.description}`).join('\n') || 'None'}
|
|
||||||
`;
|
|
||||||
|
|
||||||
Write(`${stepDir}/step-${stepIdx + 1}-analysis.md`, stepAnalysisReport);
|
|
||||||
|
|
||||||
// Append to process log
|
|
||||||
const reqPassStr = reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A';
|
|
||||||
const processLogEntry = `
|
|
||||||
## Step ${stepIdx + 1}: ${step.name} — Score: ${analysis.quality_score}/100 | Req: ${reqPassStr}
|
|
||||||
|
|
||||||
**Command**: \`${step.command}\`
|
|
||||||
**Result**: ${analysis.execution_assessment?.completeness || 'unknown'} | ${analysis.artifact_assessment?.count || 0} artifacts
|
|
||||||
**Requirement Match**: ${reqPassStr}${reqMatch ? ` — Met: ${(reqMatch.criteria_met || []).length}, Missed: ${(reqMatch.criteria_missed || []).length}, Fail Signals: ${(reqMatch.fail_signals_detected || []).length}` : ''}
|
|
||||||
**Summary**: ${analysis.step_summary || 'No summary'}
|
|
||||||
**Issues**: ${(analysis.issues || []).filter(i => i.severity === 'high').map(i => i.description).join('; ') || 'None critical'}
|
|
||||||
**Handoff**: ${analysis.handoff_assessment?.contract_satisfied ? 'Contract satisfied' : analysis.handoff_assessment?.handoff_notes || 'Ready'}
|
|
||||||
|
|
||||||
---
|
|
||||||
`;
|
|
||||||
|
|
||||||
// Append to process-log.md
|
|
||||||
const currentLog = Read(`${state.work_dir}/process-log.md`);
|
|
||||||
Write(`${state.work_dir}/process-log.md`, currentLog + processLogEntry);
|
|
||||||
|
|
||||||
// Update state
|
|
||||||
state.steps[stepIdx].analysis = {
|
|
||||||
quality_score: analysis.quality_score,
|
|
||||||
requirement_pass: reqMatch?.pass ?? null,
|
|
||||||
criteria_met_count: (reqMatch?.criteria_met || []).length,
|
|
||||||
criteria_missed_count: (reqMatch?.criteria_missed || []).length,
|
|
||||||
fail_signals_count: (reqMatch?.fail_signals_detected || []).length,
|
|
||||||
key_outputs: analysis.artifact_assessment?.key_outputs || [],
|
|
||||||
handoff_notes: analysis.handoff_assessment?.handoff_notes || '',
|
|
||||||
contract_satisfied: analysis.handoff_assessment?.contract_satisfied ?? null,
|
|
||||||
issue_count: (analysis.issues || []).length,
|
|
||||||
high_issues: (analysis.issues || []).filter(i => i.severity === 'high').length,
|
|
||||||
optimization_count: (analysis.optimization_opportunities || []).length,
|
|
||||||
analysis_file: `${stepDir}/step-${stepIdx + 1}-analysis.md`
|
|
||||||
};
|
|
||||||
state.steps[stepIdx].status = 'analyzed';
|
|
||||||
|
|
||||||
state.process_log_entries.push({
|
|
||||||
step_index: stepIdx,
|
|
||||||
step_name: step.name,
|
|
||||||
quality_score: analysis.quality_score,
|
|
||||||
summary: analysis.step_summary,
|
|
||||||
timestamp: new Date().toISOString()
|
|
||||||
});
|
|
||||||
|
|
||||||
state.updated_at = new Date().toISOString();
|
|
||||||
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Error | Recovery |
|
|
||||||
|-------|----------|
|
|
||||||
| CLI timeout | Retry once without --resume (fresh session) |
|
|
||||||
| Resume session not found | Start fresh analysis session, continue |
|
|
||||||
| JSON parse fails | Extract score heuristically, save raw output |
|
|
||||||
| No output | Default score 50, minimal process log entry |
|
|
||||||
|
|
||||||
## Output
|
|
||||||
|
|
||||||
- **Files**: `step-{N}-analysis.md`, updated `process-log.md`
|
|
||||||
- **State**: `steps[stepIdx].analysis` updated, `analysis_session_id` updated
|
|
||||||
- **Next**: Phase 2 for next step, or Phase 4 (Synthesize) if all steps done
|
|
||||||
@@ -1,257 +0,0 @@
|
|||||||
# Phase 4: Synthesize
|
|
||||||
|
|
||||||
> **COMPACT SENTINEL [Phase 4: Synthesize]**
|
|
||||||
> This phase contains 4 execution steps (Step 4.1 -- 4.4).
|
|
||||||
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
|
|
||||||
> Recovery: `Read("phases/04-synthesize.md")`
|
|
||||||
|
|
||||||
Synthesize all step analyses into cross-step insights. Evaluates the workflow as a whole: step ordering, handoff quality, redundancy, bottlenecks, and overall coherence.
|
|
||||||
|
|
||||||
## Objective
|
|
||||||
|
|
||||||
- Read complete process-log.md and all step analyses
|
|
||||||
- Build synthesis prompt with full workflow context
|
|
||||||
- Execute via ccw cli Gemini with resume chain
|
|
||||||
- Generate cross-step optimization insights
|
|
||||||
- Write synthesis.md
|
|
||||||
|
|
||||||
## Execution
|
|
||||||
|
|
||||||
### Step 4.1: Gather All Analyses
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Read process log
|
|
||||||
const processLog = Read(`${state.work_dir}/process-log.md`);
|
|
||||||
|
|
||||||
// Read all step analysis files
|
|
||||||
const stepAnalyses = state.steps.map((step, i) => {
|
|
||||||
const analysisFile = `${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md`;
|
|
||||||
try {
|
|
||||||
return { step: step.name, index: i, content: Read(analysisFile) };
|
|
||||||
} catch {
|
|
||||||
return { step: step.name, index: i, content: '[Analysis not available]' };
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
// Build score summary
|
|
||||||
const scoreSummary = state.steps.map((s, i) =>
|
|
||||||
`Step ${i + 1} (${s.name}): ${s.analysis?.quality_score || '-'}/100 | Issues: ${s.analysis?.issue_count || 0} (${s.analysis?.high_issues || 0} high)`
|
|
||||||
).join('\n');
|
|
||||||
|
|
||||||
// Compute aggregate stats
|
|
||||||
// Collect per-step quality scores. Use an explicit null check rather than
// filter(Boolean): a score of 0 is a legitimate value on the 0-100 rubric
// (0-49 = fail), and truthiness filtering silently dropped it — inflating
// avgScore and hiding the weakest step from minScore.
const scores = state.steps.map(s => s.analysis?.quality_score).filter(s => s != null);
|
|
||||||
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
|
|
||||||
const minScore = scores.length > 0 ? Math.min(...scores) : 0;
|
|
||||||
const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
|
|
||||||
const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues || 0), 0);
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4.2: Construct Synthesis Prompt
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Ref: templates/synthesis-prompt.md
|
|
||||||
|
|
||||||
const synthesisPrompt = `PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
|
|
||||||
|
|
||||||
WORKFLOW OVERVIEW:
|
|
||||||
Name: ${state.workflow_name}
|
|
||||||
Goal: ${state.workflow_context}
|
|
||||||
Steps: ${state.steps.length}
|
|
||||||
Average Quality: ${avgScore}/100
|
|
||||||
Weakest Step: ${minScore}/100
|
|
||||||
Total Issues: ${totalIssues} (${totalHighIssues} high severity)
|
|
||||||
|
|
||||||
SCORE SUMMARY:
|
|
||||||
${scoreSummary}
|
|
||||||
|
|
||||||
COMPLETE PROCESS LOG:
|
|
||||||
${processLog}
|
|
||||||
|
|
||||||
DETAILED STEP ANALYSES:
|
|
||||||
${stepAnalyses.map(a => `### ${a.step} (Step ${a.index + 1})\n${a.content}`).join('\n\n---\n\n')}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
|
|
||||||
2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
|
|
||||||
3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
|
|
||||||
4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
|
|
||||||
5. **Step Ordering**: Would reordering steps improve outcomes?
|
|
||||||
6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
|
|
||||||
7. **Per-Step Optimization**: Top 3 improvements per underperforming step
|
|
||||||
8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"workflow_score": <0-100>,
|
|
||||||
"coherence": {
|
|
||||||
"score": <0-100>,
|
|
||||||
"assessment": "<logical flow evaluation>",
|
|
||||||
"gaps": ["<missing step or transition>"]
|
|
||||||
},
|
|
||||||
"handoff_quality": {
|
|
||||||
"score": <0-100>,
|
|
||||||
"issues": [
|
|
||||||
{ "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"redundancy": {
|
|
||||||
"found": <true|false>,
|
|
||||||
"items": [
|
|
||||||
{ "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"bottlenecks": [
|
|
||||||
{ "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
|
|
||||||
],
|
|
||||||
"ordering_suggestions": [
|
|
||||||
{ "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
|
|
||||||
],
|
|
||||||
"per_step_improvements": [
|
|
||||||
{ "step": "<step name>", "improvements": [
|
|
||||||
{ "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
|
|
||||||
]}
|
|
||||||
],
|
|
||||||
"workflow_improvements": [
|
|
||||||
{ "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
|
|
||||||
],
|
|
||||||
"summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON`;
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4.3: Execute via ccw cli Gemini with Resume
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Escape a string for safe interpolation inside double quotes in a POSIX
// shell command line.
//
// Backslashes MUST be escaped first — without that rule, input containing
// `\"` became `\\"` (escaped backslash + bare quote), which terminates the
// double-quoted region early and breaks the generated `ccw cli -p "..."`
// command. A trailing `\` similarly swallowed the closing quote.
function escapeForShell(str) {
  return str
    .replace(/\\/g, '\\\\')  // 1. backslash first, so later escapes stay intact
    .replace(/"/g, '\\"')    // 2. double quote — would end the quoted region
    .replace(/\$/g, '\\$')   // 3. `$` — parameter expansion
    .replace(/`/g, '\\`');   // 4. backtick — command substitution
}
|
|
||||||
|
|
||||||
let cliCommand = `ccw cli -p "${escapeForShell(synthesisPrompt)}" --tool gemini --mode analysis`;
|
|
||||||
|
|
||||||
// Resume from the last step's analysis session
|
|
||||||
if (state.analysis_session_id) {
|
|
||||||
cliCommand += ` --resume ${state.analysis_session_id}`;
|
|
||||||
}
|
|
||||||
|
|
||||||
Bash({
|
|
||||||
command: cliCommand,
|
|
||||||
run_in_background: true,
|
|
||||||
timeout: 300000
|
|
||||||
});
|
|
||||||
|
|
||||||
// STOP — wait for hook callback
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4.4: Parse Results and Write Synthesis
|
|
||||||
|
|
||||||
After CLI completes:
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const rawOutput = /* CLI output from callback */;
|
|
||||||
const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
|
|
||||||
let synthesis;
|
|
||||||
|
|
||||||
if (jsonMatch) {
|
|
||||||
try {
|
|
||||||
synthesis = JSON.parse(jsonMatch[0]);
|
|
||||||
} catch {
|
|
||||||
synthesis = {
|
|
||||||
workflow_score: avgScore,
|
|
||||||
summary: 'Synthesis parsing failed — individual step analyses available',
|
|
||||||
workflow_improvements: [],
|
|
||||||
per_step_improvements: [],
|
|
||||||
bottlenecks: [],
|
|
||||||
handoff_quality: { score: 0, issues: [] },
|
|
||||||
coherence: { score: 0, assessment: 'Parse error' },
|
|
||||||
redundancy: { found: false, items: [] },
|
|
||||||
ordering_suggestions: []
|
|
||||||
};
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
synthesis = {
|
|
||||||
workflow_score: avgScore,
|
|
||||||
summary: 'No synthesis output received',
|
|
||||||
workflow_improvements: [],
|
|
||||||
per_step_improvements: []
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
// Write synthesis report
|
|
||||||
const synthesisReport = `# Workflow Synthesis
|
|
||||||
|
|
||||||
**Workflow Score**: ${synthesis.workflow_score}/100
|
|
||||||
**Date**: ${new Date().toISOString()}
|
|
||||||
|
|
||||||
## Executive Summary
|
|
||||||
${synthesis.summary}
|
|
||||||
|
|
||||||
## Coherence (${synthesis.coherence?.score || '-'}/100)
|
|
||||||
${synthesis.coherence?.assessment || 'N/A'}
|
|
||||||
${(synthesis.coherence?.gaps || []).length > 0 ? '\n### Gaps\n' + synthesis.coherence.gaps.map(g => `- ${g}`).join('\n') : ''}
|
|
||||||
|
|
||||||
## Handoff Quality (${synthesis.handoff_quality?.score || '-'}/100)
|
|
||||||
${(synthesis.handoff_quality?.issues || []).map(i =>
|
|
||||||
`- **${i.from_step} → ${i.to_step}**: ${i.issue}\n Fix: ${i.fix}`
|
|
||||||
).join('\n') || 'No handoff issues'}
|
|
||||||
|
|
||||||
## Redundancy
|
|
||||||
${synthesis.redundancy?.found ? (synthesis.redundancy.items || []).map(r =>
|
|
||||||
`- Steps ${r.steps.join(', ')}: ${r.description} → ${r.recommendation}`
|
|
||||||
).join('\n') : 'No redundancy detected'}
|
|
||||||
|
|
||||||
## Bottlenecks
|
|
||||||
${(synthesis.bottlenecks || []).map(b =>
|
|
||||||
`- **${b.step}** [${b.impact}]: ${b.reason}\n Fix: ${b.fix}`
|
|
||||||
).join('\n') || 'No bottlenecks'}
|
|
||||||
|
|
||||||
## Ordering Suggestions
|
|
||||||
${(synthesis.ordering_suggestions || []).map(o =>
|
|
||||||
`- Current: ${o.current}\n Proposed: ${o.proposed}\n Rationale: ${o.rationale}`
|
|
||||||
).join('\n') || 'Current ordering is optimal'}
|
|
||||||
|
|
||||||
## Per-Step Improvements
|
|
||||||
${(synthesis.per_step_improvements || []).map(s =>
|
|
||||||
`### ${s.step}\n` + (s.improvements || []).map(i =>
|
|
||||||
`- [${i.priority}] ${i.description} — ${i.rationale}`
|
|
||||||
).join('\n')
|
|
||||||
).join('\n\n') || 'No per-step improvements'}
|
|
||||||
|
|
||||||
## Workflow-Level Improvements
|
|
||||||
${(synthesis.workflow_improvements || []).map((w, i) =>
|
|
||||||
`### ${i + 1}. [${w.priority}] ${w.description}\n- Category: ${w.category}\n- Rationale: ${w.rationale}\n- Affected: ${(w.affected_steps || []).join(', ')}`
|
|
||||||
).join('\n\n') || 'No workflow-level improvements'}
|
|
||||||
`;
|
|
||||||
|
|
||||||
Write(`${state.work_dir}/synthesis.md`, synthesisReport);
|
|
||||||
|
|
||||||
// Update state
|
|
||||||
state.synthesis = {
|
|
||||||
workflow_score: synthesis.workflow_score,
|
|
||||||
summary: synthesis.summary,
|
|
||||||
improvement_count: (synthesis.workflow_improvements || []).length +
|
|
||||||
(synthesis.per_step_improvements || []).reduce((sum, s) => sum + (s.improvements || []).length, 0),
|
|
||||||
high_priority_count: (synthesis.workflow_improvements || []).filter(w => w.priority === 'high').length,
|
|
||||||
bottleneck_count: (synthesis.bottlenecks || []).length,
|
|
||||||
handoff_issue_count: (synthesis.handoff_quality?.issues || []).length,
|
|
||||||
synthesis_file: `${state.work_dir}/synthesis.md`,
|
|
||||||
raw_data: synthesis
|
|
||||||
};
|
|
||||||
|
|
||||||
state.updated_at = new Date().toISOString();
|
|
||||||
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
|
|
||||||
```
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Error | Recovery |
|
|
||||||
|-------|----------|
|
|
||||||
| CLI timeout | Generate synthesis from individual step analyses only (no cross-step) |
|
|
||||||
| Resume fails | Start fresh analysis session |
|
|
||||||
| JSON parse fails | Use step-level data to construct minimal synthesis |
|
|
||||||
|
|
||||||
## Output
|
|
||||||
|
|
||||||
- **Files**: `synthesis.md`
|
|
||||||
- **State**: `state.synthesis` updated
|
|
||||||
- **Next**: Phase 5 (Optimization Report)
|
|
||||||
@@ -1,246 +0,0 @@
|
|||||||
# Phase 5: Optimization Report
|
|
||||||
|
|
||||||
> **COMPACT SENTINEL [Phase 5: Report]**
|
|
||||||
> This phase contains 4 execution steps (Step 5.1 -- 5.4).
|
|
||||||
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
|
|
||||||
> Recovery: `Read("phases/05-optimize-report.md")`
|
|
||||||
|
|
||||||
Generate the final optimization report and optionally apply high-priority fixes.
|
|
||||||
|
|
||||||
## Objective
|
|
||||||
|
|
||||||
- Read complete state, process log, synthesis
|
|
||||||
- Generate structured final report
|
|
||||||
- Optionally apply auto-fix (if enabled)
|
|
||||||
- Write final-report.md
|
|
||||||
- Display summary to user
|
|
||||||
|
|
||||||
## Execution
|
|
||||||
|
|
||||||
### Step 5.1: Read Complete State
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
const state = JSON.parse(Read(`${state.work_dir}/workflow-state.json`));
|
|
||||||
const processLog = Read(`${state.work_dir}/process-log.md`);
|
|
||||||
const synthesis = state.synthesis;
|
|
||||||
state.status = 'completed';
|
|
||||||
state.updated_at = new Date().toISOString();
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 5.2: Generate Report
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
// Compute stats.
// Use nullish checks (`!= null`, `??`) rather than truthiness so a legitimate
// quality_score of 0 is still counted in the average and can be identified as
// the weakest step; the previous `filter(Boolean)` / `|| 100` silently treated
// a 0-scored step as unscored / perfect.
const scores = state.steps.map(s => s.analysis?.quality_score).filter(v => v != null);
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
// Weakest step: unscored steps (`?? 100`) never win; `min?.` guards the
// accumulator in case the initial element is missing.
const minStep = state.steps.reduce((min, s) =>
  (s.analysis?.quality_score ?? 100) < (min?.analysis?.quality_score ?? 100) ? s : min
, state.steps[0]);

// Issue totals across all steps (missing analyses count as 0 issues).
const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count ?? 0), 0);
const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues ?? 0), 0);
|
|
||||||
|
|
||||||
// Step quality table (with requirement match)
|
|
||||||
const stepTable = state.steps.map((s, i) => {
|
|
||||||
const reqPass = s.analysis?.requirement_pass;
|
|
||||||
const reqStr = reqPass === true ? 'PASS' : reqPass === false ? 'FAIL' : '-';
|
|
||||||
return `| ${i + 1} | ${s.name} | ${s.type} | ${s.execution?.success ? 'OK' : 'FAIL'} | ${reqStr} | ${s.analysis?.quality_score || '-'} | ${s.analysis?.issue_count || 0} | ${s.analysis?.high_issues || 0} |`;
|
|
||||||
}).join('\n');
|
|
||||||
|
|
||||||
// Collect all improvements (workflow-level + per-step)
|
|
||||||
const allImprovements = [];
|
|
||||||
if (synthesis?.raw_data?.workflow_improvements) {
|
|
||||||
synthesis.raw_data.workflow_improvements.forEach(w => {
|
|
||||||
allImprovements.push({
|
|
||||||
scope: 'workflow',
|
|
||||||
priority: w.priority,
|
|
||||||
description: w.description,
|
|
||||||
rationale: w.rationale,
|
|
||||||
category: w.category,
|
|
||||||
affected: w.affected_steps || []
|
|
||||||
});
|
|
||||||
});
|
|
||||||
}
|
|
||||||
if (synthesis?.raw_data?.per_step_improvements) {
|
|
||||||
synthesis.raw_data.per_step_improvements.forEach(s => {
|
|
||||||
(s.improvements || []).forEach(imp => {
|
|
||||||
allImprovements.push({
|
|
||||||
scope: s.step,
|
|
||||||
priority: imp.priority,
|
|
||||||
description: imp.description,
|
|
||||||
rationale: imp.rationale,
|
|
||||||
category: 'step',
|
|
||||||
affected: [s.step]
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
// Sort by priority
|
|
||||||
const priorityOrder = { high: 0, medium: 1, low: 2 };
|
|
||||||
allImprovements.sort((a, b) => (priorityOrder[a.priority] || 2) - (priorityOrder[b.priority] || 2));
|
|
||||||
|
|
||||||
const report = `# Workflow Tune — Final Report
|
|
||||||
|
|
||||||
## Summary
|
|
||||||
|
|
||||||
| Field | Value |
|
|
||||||
|-------|-------|
|
|
||||||
| **Workflow** | ${state.workflow_name} |
|
|
||||||
| **Goal** | ${state.workflow_context} |
|
|
||||||
| **Steps** | ${state.steps.length} |
|
|
||||||
| **Workflow Score** | ${synthesis?.workflow_score || avgScore}/100 |
|
|
||||||
| **Average Step Quality** | ${avgScore}/100 |
|
|
||||||
| **Weakest Step** | ${minStep.name} (${minStep.analysis?.quality_score || '-'}/100) |
|
|
||||||
| **Total Issues** | ${totalIssues} (${totalHighIssues} high severity) |
|
|
||||||
| **Analysis Depth** | ${state.analysis_depth} |
|
|
||||||
| **Started** | ${state.started_at} |
|
|
||||||
| **Completed** | ${state.updated_at} |
|
|
||||||
|
|
||||||
## Step Quality Matrix
|
|
||||||
|
|
||||||
| # | Step | Type | Exec | Req Match | Quality | Issues | High |
|
|
||||||
|---|------|------|------|-----------|---------|--------|------|
|
|
||||||
${stepTable}
|
|
||||||
|
|
||||||
## Workflow Flow Assessment
|
|
||||||
|
|
||||||
### Coherence: ${synthesis?.raw_data?.coherence?.score || '-'}/100
|
|
||||||
${synthesis?.raw_data?.coherence?.assessment || 'Not evaluated'}
|
|
||||||
|
|
||||||
### Handoff Quality: ${synthesis?.raw_data?.handoff_quality?.score || '-'}/100
|
|
||||||
${(synthesis?.raw_data?.handoff_quality?.issues || []).map(i =>
|
|
||||||
`- **${i.from_step} → ${i.to_step}**: ${i.issue}`
|
|
||||||
).join('\n') || 'No handoff issues'}
|
|
||||||
|
|
||||||
### Bottlenecks
|
|
||||||
${(synthesis?.raw_data?.bottlenecks || []).map(b =>
|
|
||||||
`- **${b.step}** [${b.impact}]: ${b.reason}`
|
|
||||||
).join('\n') || 'No bottlenecks identified'}
|
|
||||||
|
|
||||||
## Optimization Recommendations
|
|
||||||
|
|
||||||
### Priority: HIGH
|
|
||||||
${allImprovements.filter(i => i.priority === 'high').map((i, idx) =>
|
|
||||||
`${idx + 1}. **[${i.scope}]** ${i.description}\n - Rationale: ${i.rationale}\n - Affected: ${i.affected.join(', ')}`
|
|
||||||
).join('\n') || 'None'}
|
|
||||||
|
|
||||||
### Priority: MEDIUM
|
|
||||||
${allImprovements.filter(i => i.priority === 'medium').map((i, idx) =>
|
|
||||||
`${idx + 1}. **[${i.scope}]** ${i.description}\n - Rationale: ${i.rationale}`
|
|
||||||
).join('\n') || 'None'}
|
|
||||||
|
|
||||||
### Priority: LOW
|
|
||||||
${allImprovements.filter(i => i.priority === 'low').map((i, idx) =>
|
|
||||||
`${idx + 1}. **[${i.scope}]** ${i.description}`
|
|
||||||
).join('\n') || 'None'}
|
|
||||||
|
|
||||||
## Process Documentation
|
|
||||||
|
|
||||||
Full process log: \`${state.work_dir}/process-log.md\`
|
|
||||||
Synthesis: \`${state.work_dir}/synthesis.md\`
|
|
||||||
|
|
||||||
### Per-Step Analysis Files
|
|
||||||
|
|
||||||
| Step | Analysis File |
|
|
||||||
|------|---------------|
|
|
||||||
${state.steps.map((s, i) =>
|
|
||||||
`| ${s.name} | \`${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md\` |`
|
|
||||||
).join('\n')}
|
|
||||||
|
|
||||||
## Artifact Locations
|
|
||||||
|
|
||||||
| Path | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| \`${state.work_dir}/workflow-state.json\` | Complete state |
|
|
||||||
| \`${state.work_dir}/process-log.md\` | Accumulated process log |
|
|
||||||
| \`${state.work_dir}/synthesis.md\` | Cross-step synthesis |
|
|
||||||
| \`${state.work_dir}/final-report.md\` | This report |
|
|
||||||
`;
|
|
||||||
|
|
||||||
Write(`${state.work_dir}/final-report.md`, report);
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 5.3: Optional Auto-Fix
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
if (state.auto_fix && allImprovements.filter(i => i.priority === 'high').length > 0) {
|
|
||||||
const highPriorityFixes = allImprovements.filter(i => i.priority === 'high');
|
|
||||||
|
|
||||||
// ★ Safety: confirm with user before applying auto-fixes
|
|
||||||
const fixList = highPriorityFixes.map((f, i) =>
|
|
||||||
`${i + 1}. [${f.scope}] ${f.description}\n Affected: ${f.affected.join(', ')}`
|
|
||||||
).join('\n');
|
|
||||||
|
|
||||||
const autoFixConfirm = AskUserQuestion({
|
|
||||||
questions: [{
|
|
||||||
question: `以下 ${highPriorityFixes.length} 项高优先级优化将被自动应用:\n\n${fixList}\n\n确认应用?`,
|
|
||||||
header: "Auto-Fix Confirmation",
|
|
||||||
multiSelect: false,
|
|
||||||
options: [
|
|
||||||
{ label: "Apply (应用)", description: "自动应用以上高优先级修复" },
|
|
||||||
{ label: "Skip (跳过)", description: "跳过自动修复,仅保留报告" }
|
|
||||||
]
|
|
||||||
}]
|
|
||||||
});
|
|
||||||
|
|
||||||
if (autoFixConfirm["Auto-Fix Confirmation"].startsWith("Skip")) {
|
|
||||||
// Skip auto-fix, just log it
|
|
||||||
state.auto_fix_skipped = true;
|
|
||||||
} else {
|
|
||||||
|
|
||||||
Agent({
|
|
||||||
subagent_type: 'general-purpose',
|
|
||||||
run_in_background: false,
|
|
||||||
description: 'Apply high-priority workflow optimizations',
|
|
||||||
prompt: `## Task: Apply High-Priority Workflow Optimizations
|
|
||||||
|
|
||||||
You are applying the top optimization suggestions from a workflow analysis.
|
|
||||||
|
|
||||||
## Improvements to Apply (HIGH priority only)
|
|
||||||
${highPriorityFixes.map((f, i) =>
|
|
||||||
`${i + 1}. [${f.scope}] ${f.description}\n Rationale: ${f.rationale}\n Affected: ${f.affected.join(', ')}`
|
|
||||||
).join('\n')}
|
|
||||||
|
|
||||||
## Workflow Steps
|
|
||||||
${state.steps.map((s, i) => `${i + 1}. ${s.name} (${s.type}): ${s.command}`).join('\n')}
|
|
||||||
|
|
||||||
## Rules
|
|
||||||
1. Read each affected file BEFORE modifying
|
|
||||||
2. Apply ONLY the high-priority suggestions
|
|
||||||
3. Preserve existing code style
|
|
||||||
4. Write a changes summary to: ${state.work_dir}/auto-fix-changes.md
|
|
||||||
`
|
|
||||||
});
|
|
||||||
|
|
||||||
} // end Apply branch
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 5.4: Display Summary
|
|
||||||
|
|
||||||
Output to user:
|
|
||||||
|
|
||||||
```
|
|
||||||
Workflow Tune Complete!
|
|
||||||
|
|
||||||
Workflow: {name}
|
|
||||||
Steps: {count}
|
|
||||||
Workflow Score: {score}/100
|
|
||||||
Average Step Quality: {avgScore}/100
|
|
||||||
Weakest Step: {name} ({score}/100)
|
|
||||||
|
|
||||||
Step Scores: {step1}={score1} → {step2}={score2} → ... → {stepN}={scoreN}
|
|
||||||
|
|
||||||
Issues: {total} ({high} high priority)
|
|
||||||
Improvements: {count} ({highCount} high priority)
|
|
||||||
|
|
||||||
Full report: {workDir}/final-report.md
|
|
||||||
Process log: {workDir}/process-log.md
|
|
||||||
```
|
|
||||||
|
|
||||||
## Output
|
|
||||||
|
|
||||||
- **Files**: `final-report.md`, optionally `auto-fix-changes.md`
|
|
||||||
- **State**: `status = completed`
|
|
||||||
- **Next**: Workflow complete. Return control to user.
|
|
||||||
@@ -1,57 +0,0 @@
|
|||||||
# Workflow Evaluation Criteria
|
|
||||||
|
|
||||||
Workflow 调优评估标准,由 Phase 03 (Analyze Step) 和 Phase 04 (Synthesize) 引用。
|
|
||||||
|
|
||||||
## Per-Step Dimensions
|
|
||||||
|
|
||||||
| Dimension | Description |
|
|
||||||
|-----------|-------------|
|
|
||||||
| Execution Success | 命令是否成功执行,退出码是否正确 |
|
|
||||||
| Output Completeness | 产物是否齐全,预期文件是否生成 |
|
|
||||||
| Artifact Quality | 产物内容质量 — 非空、格式正确、内容有意义 |
|
|
||||||
| Handoff Readiness | 产物是否满足下一步的输入要求,格式兼容性 |
|
|
||||||
|
|
||||||
## Per-Step Scoring Guide
|
|
||||||
|
|
||||||
| Range | Level | Description |
|
|
||||||
|-------|-------|-------------|
|
|
||||||
| 90-100 | Excellent | 执行完美,产物高质量,下游可直接消费 |
|
|
||||||
| 80-89 | Good | 执行成功,产物基本完整,微调即可衔接 |
|
|
||||||
| 70-79 | Adequate | 执行成功但产物有缺失或质量一般 |
|
|
||||||
| 60-69 | Needs Work | 部分失败或产物质量差,衔接困难 |
|
|
||||||
| 0-59 | Poor | 执行失败或产物无法使用 |
|
|
||||||
|
|
||||||
## Workflow-Level Dimensions
|
|
||||||
|
|
||||||
| Dimension | Description |
|
|
||||||
|-----------|-------------|
|
|
||||||
| Coherence | 步骤间的逻辑顺序是否合理,是否形成完整流程 |
|
|
||||||
| Handoff Quality | 步骤间的数据传递是否顺畅,格式是否匹配 |
|
|
||||||
| Redundancy | 是否存在步骤间的工作重叠或重复 |
|
|
||||||
| Efficiency | 整体流程是否高效,有无不必要的步骤 |
|
|
||||||
| Completeness | 是否覆盖所有必要环节,有无遗漏 |
|
|
||||||
|
|
||||||
## Analysis Depth Profiles
|
|
||||||
|
|
||||||
### Quick
|
|
||||||
- 每步 3-5 要点
|
|
||||||
- 关注: 执行成功、产出完整、明显问题
|
|
||||||
- 跨步骤: 基本衔接检查
|
|
||||||
|
|
||||||
### Standard
|
|
||||||
- 每步详细评估
|
|
||||||
- 关注: 执行质量、产出完整性、产物质量、衔接就绪度、潜在问题
|
|
||||||
- 跨步骤: 衔接质量、冗余检测、瓶颈识别
|
|
||||||
|
|
||||||
### Deep
|
|
||||||
- 每步深度审查
|
|
||||||
- 关注: 执行质量、产出正确性、结构质量、衔接完整性、错误处理、性能信号、架构影响、边界情况
|
|
||||||
- 跨步骤: 全面流程优化、重排建议、缺失步骤检测、架构改进
|
|
||||||
|
|
||||||
## Issue Severity Guide
|
|
||||||
|
|
||||||
| Severity | Description | Example |
|
|
||||||
|----------|-------------|---------|
|
|
||||||
| High | 阻断流程或导致错误结果 | 步骤执行失败、产物格式不兼容、关键数据丢失 |
|
|
||||||
| Medium | 影响质量但不阻断 | 产物不完整、衔接需手动调整、冗余步骤 |
|
|
||||||
| Low | 可改进但不影响功能 | 输出格式不一致、可优化的步骤顺序 |
|
|
||||||
@@ -1,88 +0,0 @@
|
|||||||
# Step Analysis Prompt Template
|
|
||||||
|
|
||||||
Phase 03 使用此模板构造 ccw cli 提示词,让 Gemini 分析单个步骤的执行结果和产物质量。
|
|
||||||
|
|
||||||
## Template
|
|
||||||
|
|
||||||
```
|
|
||||||
PURPOSE: Analyze the output of workflow step "${stepName}" (step ${stepIndex}/${totalSteps}) to assess quality, identify issues, and evaluate handoff readiness for the next step.
|
|
||||||
|
|
||||||
WORKFLOW CONTEXT:
|
|
||||||
Name: ${workflowName}
|
|
||||||
Goal: ${workflowContext}
|
|
||||||
Step Chain:
|
|
||||||
${stepChainContext}
|
|
||||||
|
|
||||||
CURRENT STEP:
|
|
||||||
Name: ${stepName}
|
|
||||||
Type: ${stepType}
|
|
||||||
Command: ${stepCommand}
|
|
||||||
${successCriteria}
|
|
||||||
|
|
||||||
EXECUTION RESULT:
|
|
||||||
${execSummary}
|
|
||||||
|
|
||||||
${handoffContext}
|
|
||||||
|
|
||||||
STEP ARTIFACTS:
|
|
||||||
${artifactSummary}
|
|
||||||
|
|
||||||
ANALYSIS DEPTH: ${analysisDepth}
|
|
||||||
${depthInstructions}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
1. Assess step execution quality (did it succeed? complete output?)
|
|
||||||
2. Evaluate artifact quality (content correctness, completeness, format)
|
|
||||||
3. Check handoff readiness (can the next step consume this output?)
|
|
||||||
4. Identify issues, risks, or optimization opportunities
|
|
||||||
5. Rate overall step quality 0-100
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"quality_score": <0-100>,
|
|
||||||
"execution_assessment": {
|
|
||||||
"success": <true|false>,
|
|
||||||
"completeness": "<complete|partial|failed>",
|
|
||||||
"notes": "<brief assessment>"
|
|
||||||
},
|
|
||||||
"artifact_assessment": {
|
|
||||||
"count": <number>,
|
|
||||||
"quality": "<high|medium|low>",
|
|
||||||
"key_outputs": ["<main output 1>", "<main output 2>"],
|
|
||||||
"missing_outputs": ["<expected but missing>"]
|
|
||||||
},
|
|
||||||
"handoff_assessment": {
|
|
||||||
"ready": <true|false>,
|
|
||||||
"next_step_compatible": <true|false|null>,
|
|
||||||
"handoff_notes": "<what next step should know>"
|
|
||||||
},
|
|
||||||
"issues": [
|
|
||||||
{ "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
|
|
||||||
],
|
|
||||||
"optimization_opportunities": [
|
|
||||||
{ "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
|
|
||||||
],
|
|
||||||
"step_summary": "<1-2 sentence summary for process log>"
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Be specific, reference artifact content where possible, output ONLY JSON
|
|
||||||
```
|
|
||||||
|
|
||||||
## Variable Substitution
|
|
||||||
|
|
||||||
| Variable | Source | Description |
|
|
||||||
|----------|--------|-------------|
|
|
||||||
| `${stepName}` | workflow-state.json | 当前步骤名称 |
|
|
||||||
| `${stepIndex}` | orchestrator loop | 当前步骤序号 (1-based) |
|
|
||||||
| `${totalSteps}` | workflow-state.json | 总步骤数 |
|
|
||||||
| `${workflowName}` | workflow-state.json | Workflow 名称 |
|
|
||||||
| `${workflowContext}` | workflow-state.json | Workflow 目标描述 |
|
|
||||||
| `${stepChainContext}` | Phase 03 builds | 所有步骤状态概览 |
|
|
||||||
| `${stepType}` | workflow-state.json | 步骤类型 (skill/ccw-cli/command) |
|
|
||||||
| `${stepCommand}` | workflow-state.json | 步骤命令 |
|
|
||||||
| `${successCriteria}` | workflow-state.json | 成功标准 (如有) |
|
|
||||||
| `${execSummary}` | Phase 03 builds | 执行结果摘要 |
|
|
||||||
| `${handoffContext}` | Phase 03 builds | 上一步的产出摘要 (非首步) |
|
|
||||||
| `${artifactSummary}` | Phase 03 builds | 产物内容摘要 |
|
|
||||||
| `${analysisDepth}` | workflow-state.json | 分析深度 (quick/standard/deep) |
|
|
||||||
| `${depthInstructions}` | Phase 03 maps | 对应深度的分析指令 |
|
|
||||||
@@ -1,90 +0,0 @@
|
|||||||
# Synthesis Prompt Template
|
|
||||||
|
|
||||||
Phase 04 使用此模板构造 ccw cli 提示词,让 Gemini 综合所有步骤分析,产出跨步骤优化建议。
|
|
||||||
|
|
||||||
## Template
|
|
||||||
|
|
||||||
```
|
|
||||||
PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
|
|
||||||
|
|
||||||
WORKFLOW OVERVIEW:
|
|
||||||
Name: ${workflowName}
|
|
||||||
Goal: ${workflowContext}
|
|
||||||
Steps: ${stepCount}
|
|
||||||
Average Quality: ${avgScore}/100
|
|
||||||
Weakest Step: ${minScore}/100
|
|
||||||
Total Issues: ${totalIssues} (${totalHighIssues} high severity)
|
|
||||||
|
|
||||||
SCORE SUMMARY:
|
|
||||||
${scoreSummary}
|
|
||||||
|
|
||||||
COMPLETE PROCESS LOG:
|
|
||||||
${processLog}
|
|
||||||
|
|
||||||
DETAILED STEP ANALYSES:
|
|
||||||
${stepAnalyses}
|
|
||||||
|
|
||||||
TASK:
|
|
||||||
1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
|
|
||||||
2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
|
|
||||||
3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
|
|
||||||
4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
|
|
||||||
5. **Step Ordering**: Would reordering steps improve outcomes?
|
|
||||||
6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
|
|
||||||
7. **Per-Step Optimization**: Top 3 improvements per underperforming step
|
|
||||||
8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
|
|
||||||
|
|
||||||
EXPECTED OUTPUT (strict JSON, no markdown):
|
|
||||||
{
|
|
||||||
"workflow_score": <0-100>,
|
|
||||||
"coherence": {
|
|
||||||
"score": <0-100>,
|
|
||||||
"assessment": "<logical flow evaluation>",
|
|
||||||
"gaps": ["<missing step or transition>"]
|
|
||||||
},
|
|
||||||
"handoff_quality": {
|
|
||||||
"score": <0-100>,
|
|
||||||
"issues": [
|
|
||||||
{ "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"redundancy": {
|
|
||||||
"found": <true|false>,
|
|
||||||
"items": [
|
|
||||||
{ "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"bottlenecks": [
|
|
||||||
{ "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
|
|
||||||
],
|
|
||||||
"ordering_suggestions": [
|
|
||||||
{ "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
|
|
||||||
],
|
|
||||||
"per_step_improvements": [
|
|
||||||
{ "step": "<step name>", "improvements": [
|
|
||||||
{ "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
|
|
||||||
]}
|
|
||||||
],
|
|
||||||
"workflow_improvements": [
|
|
||||||
{ "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
|
|
||||||
],
|
|
||||||
"summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
|
|
||||||
}
|
|
||||||
|
|
||||||
CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON
|
|
||||||
```
|
|
||||||
|
|
||||||
## Variable Substitution
|
|
||||||
|
|
||||||
| Variable | Source | Description |
|
|
||||||
|----------|--------|-------------|
|
|
||||||
| `${workflowName}` | workflow-state.json | Workflow 名称 |
|
|
||||||
| `${workflowContext}` | workflow-state.json | Workflow 目标 |
|
|
||||||
| `${stepCount}` | workflow-state.json | 总步骤数 |
|
|
||||||
| `${avgScore}` | Phase 04 computes | 所有步骤平均分 |
|
|
||||||
| `${minScore}` | Phase 04 computes | 最低步骤分 |
|
|
||||||
| `${totalIssues}` | Phase 04 computes | 总问题数 |
|
|
||||||
| `${totalHighIssues}` | Phase 04 computes | 高优先级问题数 |
|
|
||||||
| `${scoreSummary}` | Phase 04 builds | 每步分数一行 |
|
|
||||||
| `${processLog}` | process-log.md | 完整过程日志 |
|
|
||||||
| `${stepAnalyses}` | Phase 04 reads | 所有 step-N-analysis.md 内容 |
|
|
||||||
Reference in New Issue
Block a user