feat: Implement workflow tuning command and remove synthesis prompt template

- Added a new command for workflow tuning that extracts commands from reference documents or natural language, executes them via the ccw CLI, and analyzes the artifacts using Gemini.
- The command includes detailed phases for setup, execution, analysis, synthesis, and reporting, ensuring a structured approach to workflow optimization.
- Removed the synthesis prompt template as it is no longer needed with the new command implementation.
catlog22
2026-03-20 20:25:41 +08:00
parent 7ef47c3d47
commit 8953795c49
10 changed files with 811 additions and 2486 deletions


@@ -1,495 +0,0 @@
---
name: workflow-tune
description: Workflow tuning skill for multi-command/skill pipelines. Executes each step sequentially, inspects artifacts after each command, analyzes quality via ccw cli resume, builds process documentation, and generates optimization suggestions. Triggers on "workflow tune", "tune workflow", "workflow optimization".
allowed-tools: Skill, Agent, AskUserQuestion, TaskCreate, TaskUpdate, TaskList, Read, Write, Edit, Bash, Glob, Grep
---
# Workflow Tune
Tune multi-step workflows composed of commands or skills. Execute each step, inspect artifacts, analyze via ccw cli resume, build process documentation, and produce actionable optimization suggestions.
## Architecture Overview
```
┌──────────────────────────────────────────────────────────────────────┐
│ Workflow Tune Orchestrator (SKILL.md) │
│ → Parse → Decompose → Confirm → Setup → Step Loop → Synthesize │
└──────────────────────┬───────────────────────────────────────────────┘
┌───────────────────┼───────────────────────────────────┐
↓ ↓ ↓
┌──────────┐ ┌─────────────────────────────┐ ┌──────────────┐
│ Phase 1 │ │ Step Loop (2→3 per step) │ │ Phase 4 + 5 │
│ Setup │ │ ┌─────┐ ┌─────┐ │ │ Synthesize + │
│ │──→│ │ P2 │→ │ P3 │ │────→│ Report │
│ Parse + │ │ │Exec │ │Anal │ │ │ │
│ Decomp + │ │ └─────┘ └─────┘ │ └──────────────┘
│ Confirm │ │ ↑ │ next step │
└──────────┘ │ └───────┘ │
└─────────────────────────────┘
Phase 1 Detail:
Input → [Format 1-3: direct parse] ──→ Command Doc → Confirm → Init
→ [Format 4: natural lang] ──→ Semantic Decompose → Command Doc → Confirm → Init
```
## Key Design Principles
1. **Test-First Evaluation**: Auto-generate per-step acceptance criteria before execution — judge results against concrete requirements, not vague quality
2. **Step-by-Step Execution**: Each workflow step executes independently, artifacts inspected before proceeding
3. **Resume-Based Analysis**: Uses ccw cli `--resume` to maintain analysis context across steps
4. **Process Documentation**: Running `process-log.md` accumulates observations per step
5. **Two-Tool Pipeline**: Claude/target tool (execute) + Gemini (analyze) = complementary perspectives
6. **Pure Orchestrator**: SKILL.md coordinates only — execution detail in phase files
7. **Progressive Phase Loading**: Phase docs read only when that phase executes
## Interactive Preference Collection
```javascript
// ★ Auto mode detection
// Note: \b does not match before "-", so /\b(-y|--yes)\b/ never fires; anchor on whitespace instead
const autoYes = /(^|\s)(-y|--yes)(\s|$)/.test($ARGUMENTS)
if (autoYes) {
workflowPreferences = {
autoYes: true,
analysisDepth: 'standard',
autoFix: false
}
} else {
const prefResponse = AskUserQuestion({
questions: [
{
question: "选择 Workflow 调优配置:",
header: "Tune Config",
multiSelect: false,
options: [
{ label: "Quick (轻量分析)", description: "每步简要检查,快速产出建议" },
{ label: "Standard (标准分析) (Recommended)", description: "每步详细分析,完整过程文档" },
{ label: "Deep (深度分析)", description: "每步深度审查,含性能和架构建议" }
]
},
{
question: "是否自动应用优化建议?",
header: "Auto Fix",
multiSelect: false,
options: [
{ label: "No (仅生成报告) (Recommended)", description: "只分析,不修改" },
{ label: "Yes (自动应用)", description: "分析后自动应用高优先级建议" }
]
}
]
})
const depthMap = {
"Quick": "quick",
"Standard": "standard",
"Deep": "deep"
}
const selectedDepth = Object.keys(depthMap).find(k =>
prefResponse["Tune Config"].startsWith(k)
) || "Standard"
workflowPreferences = {
autoYes: false,
analysisDepth: depthMap[selectedDepth],
autoFix: prefResponse["Auto Fix"].startsWith("Yes")
}
}
```
## Input Processing
```
$ARGUMENTS → Parse:
├─ Workflow definition: one of:
│ ├─ Format 1: Inline steps — "step1 | step2 | step3" (pipe-separated commands)
│ ├─ Format 2: Skill names — "skill-a,skill-b,skill-c" (comma-separated)
│ ├─ Format 3: File path — "--file workflow.json" (JSON definition)
│ └─ Format 4: Natural language — free-text description, auto-decomposed into steps
├─ Test context: --context "description of what the workflow should achieve"
└─ Flags: --depth quick|standard|deep, -y/--yes, --auto-fix
```
### Format Detection Priority
```
1. --file flag present → Format 3 (JSON)
2. Contains pipe "|" → Format 1 (inline commands)
3. Matches skill-name pattern → Format 2 (comma-separated skills)
4. Everything else → Format 4 (natural language → semantic decomposition)
4a. Contains file path? → Read file as reference doc, extract workflow steps via LLM
4b. Pure intent text → Direct intent-verb matching (intentMap)
```
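The priority list above can be sketched as a small pure function (a hedged sketch, not the actual parser; it assumes, as in Phase 1 Step 1.1, that only the first whitespace-delimited token is tested against the comma-separated skill-name pattern):

```javascript
// Sketch of the format detection priority. Checks run in the listed
// order, so "--file" wins over a stray pipe, and a pipe wins over
// a comma-separated first token.
function detectFormat(args) {
  if (/--file\s+\S+/.test(args)) return 'file';               // Format 3
  if (args.includes('|')) return 'inline';                    // Format 1
  const firstToken = args.trim().split(/\s+/)[0];
  if (/^[\w-]+(,[\w-]+)+$/.test(firstToken)) return 'skills'; // Format 2
  return 'natural-language';                                  // Format 4
}
```

Anything that fails the first three checks falls through to Format 4, which is what makes free-text input the default path.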
### Format 4: Semantic Decomposition (Natural Language)
When input is free-text (e.g., "分析 src 目录代码质量,做代码评审,然后修复高优先级问题", i.e. "analyze code quality in the src directory, do a code review, then fix high-priority issues"), the orchestrator:
#### 4a. Reference Document Mode (input contains file path)
When the input contains a file path (e.g., `d:\maestro2\guide\command-usage-guide.md 提取核心工作流`, where the trailing text means "extract the core workflow"), the orchestrator:
1. **Detect File Path**: Extract file path from input via path pattern matching
2. **Read Reference Doc**: Read the file content as workflow reference material
3. **Extract Workflow via LLM**: Use Gemini to analyze the document + user intent, extract executable workflow steps
4. **Step Chain Generation**: LLM returns structured step chain with commands, tools, and execution order
5. **Command Doc + Confirmation**: Same as 4b below
#### 4b. Pure Intent Mode (no file path)
1. **Semantic Parse**: Identify intent verbs and targets → map to available skills/commands
2. **Step Chain Generation**: Produce ordered step chain with tool/mode selection
3. **Command Doc**: Generate formatted execution plan document
4. **User Confirmation**: Display plan, ask user to confirm/edit before execution
```javascript
// ★ 4a: If input contains file path → read file, extract workflow via LLM
// Detect paths: Windows (D:\path\file.md), Unix (/path/file.md), relative (./file.md)
// Read file content → send to Gemini with user intent → get executable step chain
// See phases/01-setup.md Step 1.1b Mode 4a
// ★ 4b: Pure intent text → regex-based intent-to-tool mapping
const intentMap = {
'分析|analyze|审查|inspect|scan': { tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
'评审|review|code review': { tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
// ... (full map in phases/01-setup.md Step 1.1b Mode 4b)
};
// Match input segments to intents, produce step chain
// See phases/01-setup.md Step 1.1b for full algorithm
```
### Workflow JSON Format (when using --file)
```json
{
"name": "my-workflow",
"description": "What this workflow achieves",
"steps": [
{
"name": "step-1-name",
"type": "skill|command|ccw-cli",
"command": "/skill-name args" | "ccw cli -p '...' --tool gemini --mode analysis",
"expected_artifacts": ["output.md", "report.json"],
"success_criteria": "description of what success looks like"
}
]
}
```
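A minimal loader for this shape might look like the following (a sketch under the assumption that `type` and `command` are the only hard requirements; `loadWorkflowDef` is a hypothetical helper, not part of the skill):

```javascript
// Parse and validate a workflow JSON definition against the schema
// sketched above. Throws on a missing steps array, a step without a
// command, or an unknown step type.
function loadWorkflowDef(json) {
  const def = JSON.parse(json);
  if (!Array.isArray(def.steps) || def.steps.length === 0) {
    throw new Error('workflow JSON must define a non-empty "steps" array');
  }
  def.steps.forEach((s, i) => {
    if (!s.command) throw new Error(`step ${i + 1} is missing "command"`);
    if (!['skill', 'command', 'ccw-cli'].includes(s.type)) {
      throw new Error(`step ${i + 1} has unknown type "${s.type}"`);
    }
  });
  return {
    name: def.name || 'unnamed-workflow',
    context: def.description || '',
    steps: def.steps
  };
}
```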
### Inline Step Parsing
```javascript
// Pipe-separated: each segment is a command
// "ccw cli -p 'analyze' --tool gemini --mode analysis | /review-code src/ | ccw cli -p 'fix' --tool claude --mode write"
const steps = input.split('|').map((cmd, i) => ({
name: `step-${i + 1}`,
type: cmd.trim().startsWith('/') ? 'skill' : 'command',
command: cmd.trim(),
expected_artifacts: [],
success_criteria: ''
}));
```
## Pre-Execution Confirmation
After parsing (all formats) or decomposition (Format 4), generate a **Command Document** and ask for user confirmation before executing.
### Command Document Format
```markdown
# Workflow Tune — Execution Plan
**Workflow**: {name}
**Goal**: {context}
**Steps**: {count}
**Analysis Depth**: {depth}
## Step Chain
| # | Name | Type | Command | Tool | Mode |
|---|------|------|---------|------|------|
| 1 | {name} | {type} | {command} | {tool} | {mode} |
| 2 | ... | ... | ... | ... | ... |
## Execution Flow
```
Step 1: {name}
→ Command: {command}
→ Expected: {artifacts}
→ Feeds into: Step 2
Step 2: {name}
→ Command: {command}
→ Expected: {artifacts}
→ Feeds into: Step 3
...
```
## Estimated Scope
- Total CLI calls: {N} (execute) + {N} (analyze) + 1 (synthesize)
- Analysis tool: gemini (--resume chain)
- Process documentation: process-log.md (accumulated)
```
### Confirmation Flow
```javascript
// ★ Skip confirmation only if -y/--yes flag
if (!workflowPreferences.autoYes) {
// Display command document to user
// Output the formatted plan (direct text output, NOT a file)
const confirmation = AskUserQuestion({
questions: [{
question: "确认执行以上 Workflow 调优计划?",
header: "Confirm Execution",
multiSelect: false,
options: [
{ label: "Execute (确认执行)", description: "按计划开始执行" },
{ label: "Edit steps (修改步骤)", description: "我想调整某些步骤" },
{ label: "Cancel (取消)", description: "取消本次调优" }
]
}]
});
if (confirmation["Confirm Execution"].startsWith("Cancel")) {
// Abort workflow
return;
}
if (confirmation["Confirm Execution"].startsWith("Edit")) {
// Ask user for modifications
const editResponse = AskUserQuestion({
questions: [{
question: "请描述要修改的内容删除步骤2、步骤3改用codex、在步骤1后加入安全扫描",
header: "Edit Steps"
}]
});
// Apply user edits to steps[] → re-display command doc → re-confirm
// (recursive confirmation loop until Execute or Cancel)
}
}
```
## Execution Flow
> **COMPACT DIRECTIVE**: Context compression MUST check TaskUpdate phase status.
> The phase currently marked `in_progress` is the active execution phase — preserve its FULL content.
> Only compress phases marked `completed` or `pending`.
### Phase 1: Setup (one-time)
Read and execute: `Ref: phases/01-setup.md`
- Parse workflow steps from input (Format 1-3: direct parse, Format 4: semantic decomposition)
- Generate Command Document (formatted execution plan)
- **User Confirmation**: Display plan, wait for confirm/edit/cancel
- **Generate Test Requirements**: Auto-create per-step acceptance criteria via Gemini (expected outputs, content signals, quality thresholds, pass/fail criteria, handoff contracts)
- Create workspace at `.workflow/.scratchpad/workflow-tune-{ts}/`
- Initialize workflow-state.json (with test_requirements per step)
- Create process-log.md template
Output: `workDir`, `steps[]` (with test_requirements), `workflowContext`, `commandDoc`, initialized state
### Step Loop (Phase 2 + Phase 3, per step)
```javascript
// Track analysis session ID for resume chain
let analysisSessionId = null;
for (let stepIdx = 0; stepIdx < state.steps.length; stepIdx++) {
const step = state.steps[stepIdx];
TaskUpdate(stepLoopTask, {
subject: `Step ${stepIdx + 1}/${state.steps.length}: ${step.name}`,
status: 'in_progress'
});
// === Phase 2: Execute Step ===
// Read: phases/02-step-execute.md
// Execute command/skill → collect artifacts
// Write step-{N}-artifacts-manifest.json
// === Phase 3: Analyze Step ===
// Read: phases/03-step-analyze.md
// Inspect artifacts → ccw cli gemini --resume analysisSessionId
// Write step-{N}-analysis.md → append to process-log.md
// Update analysisSessionId for next step's resume
// Update state
state.steps[stepIdx].status = 'completed';
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
}
```
### Phase 2: Execute Step (per step)
Read and execute: `Ref: phases/02-step-execute.md`
- Create step working directory
- Execute command/skill via ccw cli or Skill tool
- Collect output artifacts
- Write artifacts manifest
### Phase 3: Analyze Step (per step)
Read and execute: `Ref: phases/03-step-analyze.md`
- Inspect step artifacts (file list, content summary, quality signals)
- **Compare actual output against test requirements** (pass_criteria, content_signals, fail_signals, handoff_contract)
- Build analysis prompt with step context + test requirements + previous process log
- Execute: `ccw cli --tool gemini --mode analysis [--resume sessionId]`
- Parse analysis → write step-{N}-analysis.md (with requirement match: PASS/FAIL)
- Append findings to process-log.md
- Return analysis session ID for resume chain
### Phase 4: Synthesize (one-time)
Read and execute: `Ref: phases/04-synthesize.md`
- Read complete process-log.md + all step analyses
- Build synthesis prompt with full workflow context
- Execute: `ccw cli --tool gemini --mode analysis --resume analysisSessionId`
- Generate cross-step optimization insights
- Write synthesis.md
### Phase 5: Optimization Report (one-time)
Read and execute: `Ref: phases/05-optimize-report.md`
- Aggregate all analyses and synthesis
- Generate structured optimization report
- Optionally apply high-priority fixes (if autoFix enabled)
- Write final-report.md
- Display summary to user
**Phase Reference Documents**:
| Phase | Document | Purpose | Compact |
|-------|----------|---------|---------|
| 1 | [phases/01-setup.md](phases/01-setup.md) | Initialize workspace and state | TaskUpdate driven |
| 2 | [phases/02-step-execute.md](phases/02-step-execute.md) | Execute workflow step | TaskUpdate driven |
| 3 | [phases/03-step-analyze.md](phases/03-step-analyze.md) | Analyze step artifacts | TaskUpdate driven + resume |
| 4 | [phases/04-synthesize.md](phases/04-synthesize.md) | Cross-step synthesis | TaskUpdate driven + resume |
| 5 | [phases/05-optimize-report.md](phases/05-optimize-report.md) | Generate final report | TaskUpdate driven |
## Data Flow
```
User Input (workflow steps / natural language + context)
Phase 1: Setup
├─ [Format 1-3] Direct parse → steps[]
├─ [Format 4a] File path detected → Read doc → LLM extract → steps[]
├─ [Format 4b] Pure intent text → Regex intent matching → steps[]
Command Document (formatted plan)
User Confirmation (Execute / Edit / Cancel)
↓ (Execute confirmed)
Generate Test Requirements (Gemini) → per-step acceptance criteria
↓ workDir, steps[] (with test_requirements), workflow-state.json, process-log.md
┌─→ Phase 2: Execute Step N (ccw cli / Skill)
│ ↓ step-N/ artifacts
│ ↓
│ Phase 3: Analyze Step N (ccw cli gemini --resume)
│ ↓ step-N-analysis.md, process-log.md updated
│ ↓ analysisSessionId carried forward
│ ↓
│ [More steps?]─── YES ──→ next step (Phase 2)
│ ↓ NO
│ ↓
└───┘
Phase 4: Synthesize (ccw cli gemini --resume)
↓ synthesis.md
Phase 5: Report
↓ final-report.md + optional auto-fix
Done
```
## TaskUpdate Pattern
```javascript
// Initial state
TaskCreate({ subject: "Phase 1: Setup workspace", activeForm: "Parsing workflow" })
TaskCreate({ subject: "Step Loop", activeForm: "Executing steps" })
TaskCreate({ subject: "Phase 4-5: Synthesize & Report", activeForm: "Pending" })
// Per-step tracking
for (const step of state.steps) {
TaskCreate({
subject: `Step: ${step.name}`,
activeForm: `Pending`,
description: `${step.type}: ${step.command}`
})
}
// During step execution
TaskUpdate(stepTask, {
subject: `Step: ${step.name} — Executing`,
activeForm: `Running ${step.command}`
})
// After step analysis
TaskUpdate(stepTask, {
subject: `Step: ${step.name} — Analyzed`,
activeForm: `Quality: ${stepQuality} | Issues: ${issueCount}`,
status: 'completed'
})
// Final
TaskUpdate(synthesisTask, {
subject: `Synthesis & Report (${state.steps.length} steps, ${totalIssues} issues)`,
status: 'completed'
})
```
## Resume Chain Strategy
```
Step 1 Execute → artifacts
Step 1 Analyze → ccw cli gemini --mode analysis → sessionId_1
Step 2 Execute → artifacts
Step 2 Analyze → ccw cli gemini --mode analysis --resume sessionId_1 → sessionId_2
...
Step N Analyze → sessionId_N
Synthesize → ccw cli gemini --mode analysis --resume sessionId_N → final
```
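The chain above amounts to threading one session id through the step loop. A sketch (with `runAnalysis` as a hypothetical stand-in for the ccw cli invocation plus hook callback):

```javascript
// Build the analysis CLI arguments; only add --resume when a previous
// session exists (step 1 starts fresh, as in the diagram above).
function buildAnalysisArgs(sessionId) {
  const base = ['--tool', 'gemini', '--mode', 'analysis'];
  return sessionId ? [...base, '--resume', sessionId] : base;
}

// Thread the session id through every step; the final id is what the
// synthesis call resumes.
function runChain(steps, runAnalysis) {
  let sessionId = null;
  for (const step of steps) {
    sessionId = runAnalysis(step, buildAnalysisArgs(sessionId));
  }
  return sessionId;
}
```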
Each analysis step resumes the previous session, maintaining full context of:
- All prior step observations
- Accumulated quality patterns
- Cross-step dependency insights
## Error Handling
| Phase | Error | Recovery |
|-------|-------|----------|
| 2: Execute | CLI timeout/crash | Retry once, then record failure and continue to next step |
| 2: Execute | Skill not found | Skip step, note in process-log |
| 3: Analyze | CLI fails | Retry without --resume, start fresh session |
| 3: Analyze | Resume session not found | Start fresh analysis session |
| 4: Synthesize | CLI fails | Generate report from individual step analyses only |
| Any | 3+ consecutive errors | Terminate with partial report |
**Error Budget**: Each step gets 1 retry; 3 consecutive failures trigger early termination.
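The budget above can be sketched as a loop with a consecutive-failure counter (`execute` is a hypothetical step runner standing in for Phase 2):

```javascript
// Error budget sketch: each step gets one retry (2 attempts total);
// three consecutive failed steps end the run with a partial result.
function runWithBudget(steps, execute) {
  let consecutiveFailures = 0;
  const results = [];
  for (const step of steps) {
    let ok = false;
    for (let attempt = 0; attempt < 2 && !ok; attempt++) {
      try { execute(step); ok = true; } catch (e) { /* retry once */ }
    }
    results.push({ step: step.name, ok });
    consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
    if (consecutiveFailures >= 3) {
      return { results, terminated: true }; // emit partial report
    }
  }
  return { results, terminated: false };
}
```

A single successful step resets the counter, so isolated flaky steps do not accumulate toward termination.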
## Core Rules
1. **Start Immediately**: First action is preference collection → Phase 1 setup
2. **Progressive Loading**: Read phase doc ONLY when that phase is about to execute
3. **Inspect Before Proceed**: Always check step artifacts before moving to next step
4. **Background CLI**: ccw cli runs in background, wait for hook callback before proceeding
5. **Resume Chain**: Maintain analysis session continuity via --resume
6. **Process Documentation**: Every step observation goes into process-log.md
7. **Single State Source**: `workflow-state.json` is the only source of truth
8. **DO NOT STOP**: Continuous execution until all steps processed


@@ -1,670 +0,0 @@
# Phase 1: Setup
Initialize workspace, parse workflow definition, semantic decomposition for natural language, generate command document, user confirmation, create state and process log.
## Objective
- Parse workflow steps from user input (Format 1-3: direct parse, Format 4: semantic decomposition)
- Generate Command Document (formatted execution plan)
- User confirmation: Execute / Edit steps / Cancel
- Validate step commands/skill paths
- Create isolated workspace directory
- Initialize workflow-state.json and process-log.md
## Execution
### Step 1.1: Parse Input
```javascript
const args = $ARGUMENTS.trim();
// Detect input format
let steps = [];
let workflowName = 'unnamed-workflow';
let workflowContext = '';
// Format 3: JSON file (--file path); checked first, per the detection priority
const fileMatch = args.match(/--file\s+"?([^\s"]+)"?/);
if (fileMatch) {
const wfDef = JSON.parse(Read(fileMatch[1]));
workflowName = wfDef.name || 'unnamed-workflow';
workflowContext = wfDef.description || '';
steps = wfDef.steps;
}
// Format 1: Pipe-separated commands ("cmd1 | cmd2 | cmd3")
else if (args.includes('|')) {
// Strip flags first: --context/--depth take a value; -y/--yes/--auto-fix do not
const rawSteps = args.replace(/--(?:context|depth)\s+("[^"]*"|\S+)/g, '').replace(/(^|\s)(?:-y|--yes|--auto-fix)(?=\s|$)/g, '$1').trim();
steps = rawSteps.split('|').map((cmd, i) => ({
name: `step-${i + 1}`,
type: cmd.trim().startsWith('/') ? 'skill'
: cmd.trim().startsWith('ccw cli') ? 'ccw-cli'
: 'command',
command: cmd.trim(),
expected_artifacts: [],
success_criteria: ''
}));
}
// Format 2: Comma-separated skill names (matches pattern: word,word or word-word,word-word)
else if (/^[\w-]+(,[\w-]+)+/.test(args.split(/\s/)[0])) {
const skillPart = args.match(/^([^\s]+)/);
const skillNames = skillPart ? skillPart[1].split(',') : [];
steps = skillNames.map((name, i) => {
const skillPath = name.startsWith('.claude/') ? name : `.claude/skills/${name}`;
return {
name: name.replace('.claude/skills/', ''),
type: 'skill',
command: `/${name.replace('.claude/skills/', '')}`,
skill_path: skillPath,
expected_artifacts: [],
success_criteria: ''
};
});
}
// Format 4: Natural language → semantic decomposition
else {
inputFormat = 'natural-language';
naturalLanguageInput = args.replace(/--\w+\s+"[^"]*"/g, '').replace(/--\w+\s+\S+/g, '').replace(/-y|--yes/g, '').trim();
// ★ 4a: Detect file paths in input (Windows absolute, Unix absolute, or relative paths)
const filePathPattern = /(?:[A-Za-z]:[\\\/][^\s,;,;、]+|\/[^\s,;,;、]+\.(?:md|txt|json|yaml|yml|toml)|\.\/?[^\s,;,;、]+\.(?:md|txt|json|yaml|yml|toml))/g;
const detectedPaths = naturalLanguageInput.match(filePathPattern) || [];
referenceDocContent = null;
referenceDocPath = null;
if (detectedPaths.length > 0) {
// Read first detected file as reference document
referenceDocPath = detectedPaths[0];
try {
referenceDocContent = Read(referenceDocPath);
// Remove file path from natural language input to get pure user intent
naturalLanguageInput = naturalLanguageInput.replace(referenceDocPath, '').trim();
} catch (e) {
// File not readable — fall through to 4b (pure intent mode)
referenceDocContent = null;
}
}
// Steps will be populated in Step 1.1b
steps = [];
}
// Parse --context
const contextMatch = args.match(/--context\s+"([^"]+)"/);
workflowContext = contextMatch ? contextMatch[1] : workflowContext;
// Parse --depth
const depthMatch = args.match(/--depth\s+(quick|standard|deep)/);
if (depthMatch) {
workflowPreferences.analysisDepth = depthMatch[1];
}
// If no context provided, ask user
if (!workflowContext) {
const response = AskUserQuestion({
questions: [{
question: "请描述这个 workflow 的目标和预期效果:",
header: "Workflow Context",
multiSelect: false,
options: [
{ label: "General quality check", description: "通用质量检查,评估步骤间衔接" },
{ label: "Custom description", description: "自定义描述 workflow 目标" }
]
}]
});
workflowContext = response["Workflow Context"];
}
```
### Step 1.1b: Semantic Decomposition (Format 4 only)
> Skip this step if `inputFormat !== 'natural-language'`.
Two sub-modes based on whether a reference document was detected in Step 1.1:
#### Mode 4a: Reference Document → LLM Extraction
When `referenceDocContent` is available, use Gemini to extract executable workflow steps from the document, guided by the user's intent text.
```javascript
if (inputFormat === 'natural-language' && referenceDocContent) {
// ★ 4a: Extract workflow steps from reference document via LLM
const extractPrompt = `PURPOSE: Extract an executable workflow step chain from the reference document below, guided by the user's intent.
USER INTENT: ${naturalLanguageInput}
REFERENCE DOCUMENT PATH: ${referenceDocPath}
REFERENCE DOCUMENT CONTENT:
${referenceDocContent}
TASK:
1. Read the document and identify the workflow/process it describes (commands, steps, phases, procedures)
2. Filter by user intent — only extract the steps the user wants to test/tune
3. For each step, determine:
- The actual command to execute (shell command, CLI invocation, or skill name)
- The execution order and dependencies
- What tool to use (gemini/claude/codex/qwen) and mode (default: write)
4. Generate a step chain that can be directly executed
IMPORTANT:
- Extract REAL executable commands from the document, not analysis tasks about the document
- The user wants to RUN these workflow steps, not analyze the document itself
- If the document describes CLI commands like "maestro init", "maestro plan", etc., those are the steps to extract
- Preserve the original command syntax from the document
- Map each command to appropriate tool/mode for ccw cli execution, OR mark as 'command' type for direct shell execution
- Default mode to "write" — almost all steps produce output artifacts (files, reports, configs), even analysis steps need write permission to save results
EXPECTED OUTPUT (strict JSON, no markdown):
{
"workflow_name": "<descriptive name>",
"workflow_context": "<what this workflow achieves>",
"steps": [
{
"name": "<step name>",
"type": "command|ccw-cli|skill",
"command": "<the actual command to execute>",
"tool": "<gemini|claude|codex|qwen or null for shell commands>",
"mode": "<write (default) | analysis (read-only, rare) | null for shell commands>",
"rule": "<rule template or null>",
"original_text": "<source text from document>",
"expected_artifacts": ["<expected output files>"],
"success_criteria": "<what success looks like>"
}
]
}
CONSTRAINTS: Output ONLY valid JSON. Extract executable steps, NOT document analysis tasks.`;
Bash({
command: `ccw cli -p "${escapeForShell(extractPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
run_in_background: true,
timeout: 300000
});
// STOP — wait for hook callback
// After callback: parse JSON response into steps[]
const extractOutput = /* CLI output from callback */;
const extractJsonMatch = extractOutput.match(/\{[\s\S]*\}/);
if (extractJsonMatch) {
try {
const extracted = JSON.parse(extractJsonMatch[0]);
workflowName = extracted.workflow_name || 'doc-workflow';
workflowContext = extracted.workflow_context || naturalLanguageInput;
steps = (extracted.steps || []).map((s, i) => ({
name: s.name || `step-${i + 1}`,
type: s.type || 'command',
command: s.command,
tool: s.tool || null,
mode: s.mode || null,
rule: s.rule || null,
original_text: s.original_text || '',
expected_artifacts: s.expected_artifacts || [],
success_criteria: s.success_criteria || ''
}));
} catch (e) {
// JSON parse failed — fall through to 4b intent matching
referenceDocContent = null;
}
}
if (steps.length === 0) {
// Extraction produced no steps — fall through to 4b
referenceDocContent = null;
}
}
```
#### Mode 4b: Pure Intent → Regex Matching
When no reference document, or when 4a extraction failed, decompose by intent-verb matching.
```javascript
if (inputFormat === 'natural-language' && !referenceDocContent) {
// Intent-to-tool mapping (regex patterns → tool config)
const intentMap = [
{ pattern: /分析|analyze|审查|inspect|scan/i, name: 'analyze', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-code-patterns' },
{ pattern: /评审|review|code.?review/i, name: 'review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-code-quality' },
{ pattern: /诊断|debug|排查|diagnose/i, name: 'diagnose', tool: 'gemini', mode: 'analysis', rule: 'analysis-diagnose-bug-root-cause' },
{ pattern: /安全|security|漏洞|vulnerability/i, name: 'security-audit', tool: 'gemini', mode: 'analysis', rule: 'analysis-assess-security-risks' },
{ pattern: /性能|performance|perf/i, name: 'perf-analysis', tool: 'gemini', mode: 'analysis', rule: 'analysis-analyze-performance' },
{ pattern: /架构|architecture/i, name: 'arch-review', tool: 'gemini', mode: 'analysis', rule: 'analysis-review-architecture' },
{ pattern: /修复|fix|repair|解决/i, name: 'fix', tool: 'claude', mode: 'write', rule: 'development-debug-runtime-issues' },
{ pattern: /实现|implement|开发|create|新增/i, name: 'implement', tool: 'claude', mode: 'write', rule: 'development-implement-feature' },
{ pattern: /重构|refactor/i, name: 'refactor', tool: 'claude', mode: 'write', rule: 'development-refactor-codebase' },
{ pattern: /测试|test|generate.?test/i, name: 'test', tool: 'claude', mode: 'write', rule: 'development-generate-tests' },
{ pattern: /规划|plan|设计|design/i, name: 'plan', tool: 'gemini', mode: 'analysis', rule: 'planning-plan-architecture-design' },
{ pattern: /拆解|breakdown|分解/i, name: 'breakdown', tool: 'gemini', mode: 'analysis', rule: 'planning-breakdown-task-steps' },
];
// Segment input by Chinese/English delimiters: 、,,;然后/接着/最后/之后 etc.
const segments = naturalLanguageInput
.split(/[,;、]|(?:然后|接着|之后|最后|再|并|and then|then|finally|next)\s*/i)
.map(s => s.trim())
.filter(Boolean);
// Match each segment to an intent (with ambiguity resolution)
steps = segments.map((segment, i) => {
const allMatches = intentMap.filter(m => m.pattern.test(segment));
let matched = allMatches[0] || null;
// ★ Ambiguity resolution: if multiple intents match, ask user
if (allMatches.length > 1) {
const disambig = AskUserQuestion({
questions: [{
question: `"${segment}" matched multiple intents; pick the best fit:`,
header: `Disambiguate Step ${i + 1}`,
multiSelect: false,
options: allMatches.map(m => ({
label: m.name,
description: `Tool: ${m.tool}, Mode: ${m.mode}, Rule: ${m.rule}`
}))
}]
});
const chosen = disambig[`Disambiguate Step ${i + 1}`];
matched = allMatches.find(m => m.name === chosen) || allMatches[0];
}
if (matched) {
// Extract target scope from segment (e.g., "分析 src 目录" → scope = "src")
// NOTE: scope is computed for future use; it is not yet attached to the step object below.
const scopeMatch = segment.match(/(?:目录|文件|模块|directory|file|module)?\s*[:]?\s*(\S+)/);
const scope = scopeMatch ? scopeMatch[1].replace(/[的地得]$/, '') : '**/*';
return {
name: `${matched.name}`,
type: 'ccw-cli',
command: `ccw cli -p "${segment}" --tool ${matched.tool} --mode ${matched.mode} --rule ${matched.rule}`,
tool: matched.tool,
mode: matched.mode,
rule: matched.rule,
original_text: segment,
expected_artifacts: [],
success_criteria: ''
};
} else {
// Unmatched segment → generic analysis step
return {
name: `step-${i + 1}`,
type: 'ccw-cli',
command: `ccw cli -p "${segment}" --tool gemini --mode analysis`,
tool: 'gemini',
mode: 'analysis',
rule: 'universal-rigorous-style',
original_text: segment,
expected_artifacts: [],
success_criteria: ''
};
}
});
// Deduplicate: if same intent name appears twice, suffix with index
const nameCount = {};
steps.forEach(s => {
nameCount[s.name] = (nameCount[s.name] || 0) + 1;
if (nameCount[s.name] > 1) {
s.name = `${s.name}-${nameCount[s.name]}`;
}
});
}
// Common: set workflow context and name for Format 4
if (inputFormat === 'natural-language') {
if (!workflowContext) {
workflowContext = naturalLanguageInput;
}
if (!workflowName || workflowName === 'unnamed-workflow') {
workflowName = referenceDocPath ? 'doc-workflow' : 'nl-workflow';
}
}
```
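The segmentation plus intent-matching core of Mode 4b can be exercised standalone. A trimmed, runnable sketch (only two intent entries shown; the full map and the disambiguation/dedup logic live in the block above):

```javascript
// Trimmed Mode 4b sketch: split the input on delimiter words/punctuation,
// then map each segment to the first matching intent, falling back to a
// generic analysis step for unmatched segments.
const miniIntentMap = [
  { pattern: /分析|analyze/i, name: 'analyze', tool: 'gemini', mode: 'analysis' },
  { pattern: /修复|fix/i, name: 'fix', tool: 'claude', mode: 'write' }
];

function decompose(input) {
  return input
    .split(/[,;,;、]|(?:然后|接着|之后|最后|and then|then|finally|next)\s*/i)
    .map(s => s.trim())
    .filter(Boolean)
    .map((segment, i) => {
      const matched = miniIntentMap.find(m => m.pattern.test(segment));
      return matched
        ? { name: matched.name, tool: matched.tool, mode: matched.mode, original_text: segment }
        : { name: `step-${i + 1}`, tool: 'gemini', mode: 'analysis', original_text: segment };
    });
}
```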
### Step 1.1c: Generate Command Document
Generate a formatted execution plan for user review. This runs for ALL input formats, not just Format 4.
```javascript
function generateCommandDoc(steps, workflowName, workflowContext, analysisDepth) {
const stepTable = steps.map((s, i) => {
const tool = s.tool || (s.type === 'skill' ? '-' : 'claude');
const mode = s.mode || (s.type === 'skill' ? '-' : 'write');
const cmdPreview = s.command.length > 60 ? s.command.substring(0, 57) + '...' : s.command;
return `| ${i + 1} | ${s.name} | ${s.type} | \`${cmdPreview}\` | ${tool} | ${mode} |`;
}).join('\n');
const flowDiagram = steps.map((s, i) => {
const arrow = i < steps.length - 1 ? '\n ↓' : '';
const feedsInto = i < steps.length - 1 ? `Feeds into: Step ${i + 2} (${steps[i + 1].name})` : 'Final step';
const originalText = s.original_text ? `\n Source: "${s.original_text}"` : '';
return `Step ${i + 1}: ${s.name}
Command: ${s.command}
Type: ${s.type} | Tool: ${s.tool || '-'} | Mode: ${s.mode || '-'}${originalText}
${feedsInto}${arrow}`;
}).join('\n');
const totalCli = steps.filter(s => s.type === 'ccw-cli').length;
const totalSkill = steps.filter(s => s.type === 'skill').length;
const totalCmd = steps.filter(s => s.type === 'command').length;
return `# Workflow Tune — Execution Plan
**Workflow**: ${workflowName}
**Goal**: ${workflowContext}
**Steps**: ${steps.length}
**Analysis Depth**: ${analysisDepth}
## Step Chain
| # | Name | Type | Command | Tool | Mode |
|---|------|------|---------|------|------|
${stepTable}
## Execution Flow
\`\`\`
${flowDiagram}
\`\`\`
## Estimated Scope
- CLI execute calls: ${totalCli}
- Skill invocations: ${totalSkill}
- Shell commands: ${totalCmd}
- Analysis calls (gemini --resume chain): ${steps.length} (per-step) + 1 (synthesis)
- Process documentation: process-log.md (accumulated)
- Final output: final-report.md with optimization recommendations
`;
}
const commandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
// Output command document to user (direct text output)
// The orchestrator displays this as formatted text before confirmation
```
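One detail worth isolating is the 60-character command preview used in the step table. Extracted as its own helper (a sketch; `cmdPreview` is a hypothetical name, the logic matches the inline expression in `generateCommandDoc`):

```javascript
// Truncate long commands for the step table: keep 57 chars and append
// "..." so the preview never exceeds 60 characters.
function cmdPreview(command, max = 60) {
  return command.length > max ? command.substring(0, max - 3) + '...' : command;
}
```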
### Step 1.1d: Pre-Execution Confirmation
```javascript
// ★ Skip confirmation if -y/--yes auto mode
if (!workflowPreferences.autoYes) {
// Display commandDoc to user as formatted text output
// Then ask for confirmation
const confirmation = AskUserQuestion({
questions: [{
question: "确认执行以上 Workflow 调优计划?",
header: "Confirm Execution",
multiSelect: false,
options: [
{ label: "Execute (确认执行)", description: "按计划开始执行所有步骤" },
{ label: "Edit steps (修改步骤)", description: "调整步骤顺序、增删步骤、更换工具" },
{ label: "Cancel (取消)", description: "取消本次调优" }
]
}]
});
const choice = confirmation["Confirm Execution"];
if (choice.startsWith("Cancel")) {
// Abort: no workspace created, no state written
// Output: "Workflow tune cancelled."
return;
}
if (choice.startsWith("Edit")) {
// Enter edit loop: ask user what to change, apply, re-display, re-confirm
let editing = true;
while (editing) {
const editResponse = AskUserQuestion({
questions: [{
question: "Describe the change to make, e.g.:\n" +
  " - Delete a step: 'remove step 2'\n" +
  " - Add a step: 'add security scan after step 1'\n" +
  " - Change tool: 'step 3 use codex'\n" +
  " - Swap order: 'swap step 2 and 3'\n" +
  " - Change command: 'step 1 command -> ccw cli -p \"...\" --tool gemini'",
header: "Edit Steps"
}]
});
const editText = editResponse["Edit Steps"];
// Apply edits to steps[] based on user instruction
// The orchestrator interprets the edit instruction and modifies steps:
//
// Delete: filter out the specified step, re-index
// Add: insert new step at specified position
// Modify tool: update the step's tool/mode/command
// Swap: exchange positions of two steps
// Modify command: replace command string
//
// After applying edits, re-generate command doc and re-display
const updatedCommandDoc = generateCommandDoc(steps, workflowName, workflowContext, workflowPreferences.analysisDepth);
// Display updatedCommandDoc to user
const reconfirm = AskUserQuestion({
  questions: [{
    question: "The revised plan is shown above. Confirm?",
    header: "Confirm Execution",
    multiSelect: false,
    options: [
      { label: "Execute", description: "Run the revised plan" },
      { label: "Edit more", description: "Further adjustments needed" },
      { label: "Cancel", description: "Abort this tuning run" }
    ]
  }]
});
const reChoice = reconfirm["Confirm Execution"];
if (reChoice.startsWith("Execute")) {
editing = false;
} else if (reChoice.startsWith("Cancel")) {
return; // Abort
}
// else: continue editing loop
}
}
// choice === "Execute" → proceed to workspace creation
}
// Save command doc for reference
// Will be written to workspace after Step 1.3
```
### Step 1.2: Validate Steps
```javascript
for (const step of steps) {
if (step.type === 'skill' && step.skill_path) {
const skillFiles = Glob(`${step.skill_path}/SKILL.md`);
if (skillFiles.length === 0) {
step.validation = 'warning';
step.validation_msg = `Skill not found: ${step.skill_path}`;
} else {
step.validation = 'ok';
}
} else {
// Command-type steps: basic validation (non-empty)
step.validation = step.command && step.command.trim() ? 'ok' : 'invalid';
}
}
const invalidSteps = steps.filter(s => s.validation === 'invalid');
if (invalidSteps.length > 0) {
throw new Error(`Invalid steps: ${invalidSteps.map(s => s.name).join(', ')}`);
}
```
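Because the loop above touches only plain step fields plus one Glob lookup, it can be restated as a pure function with the skill-existence check injected, which makes the invalid-step gate unit-testable without the Glob tool. A sketch under that assumption:

```javascript
// Validate steps without mutating them; skillExists is injected so the
// function stays pure. Returns a new array with validation fields set.
function validateSteps(steps, skillExists) {
  return steps.map(step => {
    if (step.type === 'skill' && step.skill_path) {
      return skillExists(step.skill_path)
        ? { ...step, validation: 'ok' }
        : { ...step, validation: 'warning', validation_msg: `Skill not found: ${step.skill_path}` };
    }
    // Command-type steps: non-empty command is the only requirement
    const ok = Boolean(step.command && step.command.trim());
    return { ...step, validation: ok ? 'ok' : 'invalid' };
  });
}
```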
### Step 1.2b: Generate Test Requirements (Acceptance Criteria)
> Tuning prerequisite: generate acceptance criteria matched to each step's task. Without an expected baseline, there is no way to judge whether a command execution meets the bar.
Use Gemini to infer each step's acceptance criteria from the step command, the workflow context, and its upstream/downstream relationships.
```javascript
// Build step chain description for context
const stepChainDesc = steps.map((s, i) =>
`Step ${i + 1}: ${s.name} (${s.type}) — ${s.command}`
).join('\n');
const reqGenPrompt = `PURPOSE: Generate concrete acceptance criteria (test requirements) for each step in a workflow pipeline. These criteria will be used to objectively judge whether each step's execution succeeded or failed.
WORKFLOW:
Name: ${workflowName}
Goal: ${workflowContext}
STEP CHAIN:
${stepChainDesc}
TASK:
For each step, generate:
1. **expected_outputs** — what files/artifacts should be produced (specific filenames or patterns)
2. **content_signals** — what content patterns indicate success (keywords, structures, data shapes)
3. **quality_thresholds** — minimum quality bar (e.g., "no empty files", "JSON must be parseable", "must contain at least N items")
4. **pass_criteria** — 1-2 sentence description of what "pass" looks like for this step
5. **fail_signals** — what patterns indicate failure (error messages, empty output, wrong format)
6. **handoff_contract** — what this step must provide for the next step to work (data format, required fields)
CONTEXT RULES:
- Infer from the command what the step is supposed to do
- Consider workflow goal when judging what "good enough" means
- Each step's handoff_contract should match what the next step needs as input
- Be specific: "report.md with ## Summary section" not "a report file"
EXPECTED OUTPUT (strict JSON, no markdown):
{
"step_requirements": [
{
"step_index": 0,
"step_name": "<name>",
"expected_outputs": ["<file or pattern>"],
"content_signals": ["<keyword or pattern that indicates success>"],
"quality_thresholds": ["<minimum bar>"],
"pass_criteria": "<what pass looks like>",
"fail_signals": ["<pattern that indicates failure>"],
"handoff_contract": "<what next step needs from this step>"
}
]
}
CONSTRAINTS: Be specific to each command, output ONLY JSON`;
Bash({
command: `ccw cli -p "${escapeForShell(reqGenPrompt)}" --tool gemini --mode analysis --rule universal-rigorous-style`,
run_in_background: true,
timeout: 300000
});
// STOP — wait for hook callback
// After callback: parse JSON, attach requirements to each step
const reqOutput = /* CLI output from callback */;
const reqJsonMatch = reqOutput.match(/\{[\s\S]*\}/);
if (reqJsonMatch) {
try {
const reqData = JSON.parse(reqJsonMatch[0]);
(reqData.step_requirements || []).forEach(req => {
const idx = req.step_index;
if (idx >= 0 && idx < steps.length) {
steps[idx].test_requirements = {
expected_outputs: req.expected_outputs || [],
content_signals: req.content_signals || [],
quality_thresholds: req.quality_thresholds || [],
pass_criteria: req.pass_criteria || '',
fail_signals: req.fail_signals || [],
handoff_contract: req.handoff_contract || ''
};
}
});
} catch (e) {
// Fallback: proceed without generated requirements
// Steps will use any manually provided success_criteria
}
}
// Capture session ID for resume chain start
const reqSessionMatch = reqOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
const reqSessionId = reqSessionMatch ? reqSessionMatch[1] : null;
```
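The marker-and-JSON parsing used here recurs in Phases 3 and 4. As a minimal sketch (assuming the CLI really does emit a `[CCW_EXEC_ID=...]` marker and at most one JSON object per response), the two extractions can be factored into pure helpers:

```javascript
// Extract the resume-chain session ID from raw CLI output.
// Returns null when no [CCW_EXEC_ID=...] marker is present.
function extractSessionId(rawOutput) {
  const m = rawOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
  return m ? m[1] : null;
}

// Extract the first JSON object embedded in mixed CLI output.
// Returns the parsed object, or null when nothing parseable is found.
function extractJson(rawOutput) {
  const m = rawOutput.match(/\{[\s\S]*\}/);
  if (!m) return null;
  try {
    return JSON.parse(m[0]);
  } catch {
    return null;
  }
}
```

The greedy `\{[\s\S]*\}` assumes a single JSON object per response; output with multiple disjoint objects would need a stricter parser.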
### Step 1.3: Create Workspace
```javascript
const ts = Date.now();
const workDir = `.workflow/.scratchpad/workflow-tune-${ts}`;
Bash(`mkdir -p "${workDir}/steps"`);
// Create per-step directories
for (let i = 0; i < steps.length; i++) {
Bash(`mkdir -p "${workDir}/steps/step-${i + 1}/artifacts"`);
}
```
### Step 1.3b: Save Command Document
```javascript
// Save confirmed command doc to workspace for reference
Write(`${workDir}/command-doc.md`, commandDoc);
```
### Step 1.4: Initialize State
```javascript
const initialState = {
status: 'running',
started_at: new Date().toISOString(),
updated_at: new Date().toISOString(),
workflow_name: workflowName,
workflow_context: workflowContext,
analysis_depth: workflowPreferences.analysisDepth,
auto_fix: workflowPreferences.autoFix,
steps: steps.map((s, i) => ({
...s,
index: i,
status: 'pending',
execution: null,
analysis: null,
test_requirements: s.test_requirements || null // from Step 1.2b
})),
analysis_session_id: reqSessionId || null, // resume chain starts from requirements generation
process_log_entries: [],
synthesis: null,
errors: [],
error_count: 0,
max_errors: 3,
work_dir: workDir
};
Write(`${workDir}/workflow-state.json`, JSON.stringify(initialState, null, 2));
```
### Step 1.5: Initialize Process Log
```javascript
const processLog = `# Workflow Tune Process Log
**Workflow**: ${workflowName}
**Context**: ${workflowContext}
**Steps**: ${steps.length}
**Analysis Depth**: ${workflowPreferences.analysisDepth}
**Started**: ${new Date().toISOString()}
---
`;
Write(`${workDir}/process-log.md`, processLog);
```
## Output
- **Variables**: `workDir`, `steps[]`, `workflowContext`, `commandDoc`, initialized state
- **Files**: `workflow-state.json`, `process-log.md`, `command-doc.md`, per-step directories
- **User Confirmation**: Execution plan confirmed (or cancelled → abort)
- **TaskUpdate**: Mark Phase 1 completed, start Step Loop

# Phase 2: Execute Step
> **COMPACT SENTINEL [Phase 2: Execute Step]**
> This phase contains 4 execution steps (Step 2.1 -- 2.4).
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
> Recovery: `Read("phases/02-step-execute.md")`
Execute a single workflow step and collect its output artifacts.
## Objective
- Determine step execution method (skill invoke / ccw cli / shell command)
- Execute step with appropriate tool
- Collect output artifacts into step directory
- Write artifacts manifest
## Execution
### Step 2.1: Prepare Step Directory
```javascript
const stepIdx = currentStepIndex; // from orchestrator loop
const step = state.steps[stepIdx];
const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
const artifactsDir = `${stepDir}/artifacts`;
// Capture pre-execution state (git status, file timestamps)
const preGitStatus = Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout;
// ★ Warn if dirty git working directory (first step only)
if (stepIdx === 0 && preGitStatus.trim() && preGitStatus.trim() !== 'not a git repo') {
const dirtyLines = preGitStatus.trim().split('\n').length;
// Log warning — artifact collection via git diff may be unreliable
// This is informational; does not block execution
console.warn(`⚠ Dirty git working directory detected (${dirtyLines} changed files). Artifact collection via git diff may include pre-existing changes.`);
}
const preExecSnapshot = {
timestamp: new Date().toISOString(),
git_status: preGitStatus,
working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50) // sample
};
Write(`${stepDir}/pre-exec-snapshot.json`, JSON.stringify(preExecSnapshot, null, 2));
```
### Step 2.2: Execute by Step Type
```javascript
let executionResult = { success: false, method: '', output: '', duration: 0 };
const startTime = Date.now();
switch (step.type) {
case 'skill': {
// Skill invocation — use Skill tool
// Extract skill name and arguments from command
const skillCmd = step.command.replace(/^\//, '');
const [skillName, ...skillArgs] = skillCmd.split(/\s+/);
// Execute skill (this runs synchronously within current context)
// Note: Skill execution produces artifacts in the working directory
// We capture changes by comparing pre/post state
Skill({
name: skillName,
arguments: skillArgs.join(' ')
});
executionResult.method = 'skill';
executionResult.success = true;
break;
}
case 'ccw-cli': {
// Direct ccw cli command
const cliCommand = step.command;
Bash({
command: cliCommand,
run_in_background: true,
timeout: 600000 // 10 minutes
});
// STOP — wait for hook callback
// After callback:
executionResult.method = 'ccw-cli';
executionResult.success = true;
break;
}
case 'command': {
// Generic shell command
const result = Bash({
command: step.command,
timeout: 300000 // 5 minutes
});
executionResult.method = 'command';
executionResult.output = result.stdout || '';
executionResult.success = result.exitCode === 0;
// Save command output
if (executionResult.output) {
Write(`${artifactsDir}/command-output.txt`, executionResult.output);
}
if (result.stderr) {
Write(`${artifactsDir}/command-stderr.txt`, result.stderr);
}
break;
}
}
executionResult.duration = Date.now() - startTime;
```
### Step 2.3: Collect Artifacts
```javascript
// Capture post-execution state
const postExecSnapshot = {
timestamp: new Date().toISOString(),
git_status: Bash('git status --porcelain 2>/dev/null || echo "not a git repo"').stdout,
working_files: Glob('**/*.{ts,js,md,json}').slice(0, 50)
};
// Detect changed/new files by comparing snapshots
const preFiles = new Set(preExecSnapshot.working_files);
const newOrChanged = postExecSnapshot.working_files.filter(f => !preFiles.has(f));
// Also check git diff for modified files
const gitDiff = Bash('git diff --name-only 2>/dev/null || true').stdout.trim().split('\n').filter(Boolean);
// Collect all artifacts (new files + git-changed files + declared expected_artifacts)
// Expand declared patterns; nonexistent paths simply yield no matches,
// so a separate existence check before flatMap is unnecessary
const declaredArtifacts = (step.expected_artifacts || []).flatMap(f => Glob(f));
const allArtifacts = [...new Set([...newOrChanged, ...gitDiff, ...declaredArtifacts])];
// Copy detected artifacts to step artifacts dir (or record references)
const artifactManifest = {
step: step.name,
step_index: stepIdx,
execution_method: executionResult.method,
success: executionResult.success,
duration_ms: executionResult.duration,
artifacts: allArtifacts.map(f => ({
path: f,
type: f.endsWith('.md') ? 'markdown' : f.endsWith('.json') ? 'json' : 'other',
size: 'unknown' // Can be filled by stat if needed
})),
// For skill type: also check .workflow/.scratchpad for generated files
scratchpad_files: step.type === 'skill'
? Glob('.workflow/.scratchpad/**/*').filter(f => {
// Heuristic: include all current scratchpad files; filtering to files
// created after step start would require per-file mtime via stat
return true;
}).slice(0, 20)
: [],
collected_at: new Date().toISOString()
};
Write(`${stepDir}/artifacts-manifest.json`, JSON.stringify(artifactManifest, null, 2));
```
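The union of the three artifact sources (snapshot diff, git diff, declared globs) and the type tagging used in the manifest are simple enough to isolate. A sketch of both, detached from the Glob tool so the dedup behavior can be tested directly:

```javascript
// Merge artifact candidates from snapshot diff, git diff, and declared
// globs into one de-duplicated list; Set preserves insertion order.
function mergeArtifacts(newOrChanged, gitChanged, declared) {
  return [...new Set([...newOrChanged, ...gitChanged, ...declared])];
}

// Classify an artifact path the same way the manifest does.
function artifactType(path) {
  if (path.endsWith('.md')) return 'markdown';
  if (path.endsWith('.json')) return 'json';
  return 'other';
}
```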
### Step 2.4: Update State
```javascript
state.steps[stepIdx].status = 'executed';
state.steps[stepIdx].execution = {
method: executionResult.method,
success: executionResult.success,
duration_ms: executionResult.duration,
artifacts_dir: artifactsDir,
manifest_path: `${stepDir}/artifacts-manifest.json`,
artifact_count: artifactManifest.artifacts.length,
started_at: preExecSnapshot.timestamp,
completed_at: new Date().toISOString()
};
state.updated_at = new Date().toISOString();
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
```
## Error Handling
| Error | Recovery |
|-------|----------|
| Skill not found | Record failure, set success=false, continue to Phase 3 |
| CLI timeout (10min) | Retry once with shorter timeout, then record failure |
| Command exit non-zero | Record stderr, set success=false, continue to Phase 3 |
| No artifacts detected | Continue to Phase 3 — analysis evaluates step definition quality |
## Output
- **Files**: `pre-exec-snapshot.json`, `artifacts-manifest.json`, `artifacts/` (if command type)
- **State**: `steps[stepIdx].execution` updated
- **Next**: Phase 3 (Analyze Step)

# Phase 3: Analyze Step
> **COMPACT SENTINEL [Phase 3: Analyze Step]**
> This phase contains 5 execution steps (Step 3.1 -- 3.5).
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
> Recovery: `Read("phases/03-step-analyze.md")`
Analyze a completed step's artifacts and quality using `ccw cli --tool gemini --mode analysis`. Uses `--resume` to maintain context across step analyses, building a continuous analysis chain.
## Objective
- Inspect step artifacts (file list, content, quality signals)
- Build analysis prompt with step context + prior process log
- Execute via ccw cli Gemini with resume chain
- Parse analysis results → write step-{N}-analysis.md
- Append findings to process-log.md
- Return updated session ID for resume chain
## Execution
### Step 3.1: Inspect Artifacts
```javascript
const stepIdx = currentStepIndex;
const step = state.steps[stepIdx];
const stepDir = `${state.work_dir}/steps/step-${stepIdx + 1}`;
// Read artifacts manifest
const manifest = JSON.parse(Read(`${stepDir}/artifacts-manifest.json`));
// Build artifact summary based on analysis depth
let artifactSummary = '';
if (state.analysis_depth === 'quick') {
// Quick: just file list and sizes
artifactSummary = `Artifacts (${manifest.artifacts.length} files):\n` +
manifest.artifacts.map(a => `- ${a.path} (${a.type})`).join('\n');
} else {
// Standard/Deep: include file content summaries
artifactSummary = manifest.artifacts.map(a => {
const maxLines = state.analysis_depth === 'deep' ? 300 : 150;
try {
const content = Read(a.path, { limit: maxLines });
return `--- ${a.path} (${a.type}) ---\n${content}`;
} catch {
return `--- ${a.path} --- [unreadable]`;
}
}).join('\n\n');
// Deep: also include scratchpad files
if (state.analysis_depth === 'deep' && manifest.scratchpad_files?.length > 0) {
artifactSummary += '\n\n--- Scratchpad Files ---\n' +
manifest.scratchpad_files.slice(0, 5).map(f => {
const content = Read(f, { limit: 100 });
return `--- ${f} ---\n${content}`;
}).join('\n\n');
}
}
// Execution result summary
const execSummary = `Execution: ${step.execution.method} | ` +
`Success: ${step.execution.success} | ` +
`Duration: ${step.execution.duration_ms}ms | ` +
`Artifacts: ${manifest.artifacts.length} files`;
```
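The depth branches above embed a small read-limit policy; pulled out as a helper it is easier to keep consistent across phases (a sketch; `0` stands for the quick path, which lists files without reading content):

```javascript
// Per-file read limit (in lines) for artifact summarization, by depth.
// Returns 0 for 'quick', meaning only the file list is included.
function readLimitFor(depth) {
  if (depth === 'deep') return 300;
  if (depth === 'standard') return 150;
  return 0;
}
```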
### Step 3.2: Build Prior Context
```javascript
// Build accumulated process log context for this analysis
const priorProcessLog = Read(`${state.work_dir}/process-log.md`);
// Build step chain context (what came before, what comes after)
const stepChainContext = state.steps.map((s, i) => {
const status = i < stepIdx ? 'completed' : i === stepIdx ? 'CURRENT' : 'pending';
const score = s.analysis?.quality_score || '-';
return `${i + 1}. [${status}] ${s.name} (${s.type}) — Quality: ${score}`;
}).join('\n');
// Previous step handoff context (if not first step)
let handoffContext = '';
if (stepIdx > 0) {
const prevStep = state.steps[stepIdx - 1];
const prevAnalysis = prevStep.analysis;
if (prevAnalysis) {
handoffContext = `PREVIOUS STEP OUTPUT SUMMARY:
Step "${prevStep.name}" produced ${prevStep.execution?.artifact_count || 0} artifacts.
Quality: ${prevAnalysis.quality_score}/100
Key outputs: ${prevAnalysis.key_outputs?.join(', ') || 'unknown'}
Handoff notes: ${prevAnalysis.handoff_notes || 'none'}`;
}
}
```
### Step 3.3: Construct Analysis Prompt
```javascript
// Ref: templates/step-analysis-prompt.md
const depthInstructions = {
quick: 'Provide brief assessment (3-5 bullet points). Focus on: execution success, output completeness, obvious issues.',
standard: 'Provide detailed assessment. Cover: execution quality, output completeness, artifact quality, step-to-step handoff readiness, potential issues.',
deep: 'Provide exhaustive assessment. Cover: execution quality, output completeness and correctness, artifact quality and structure, step-to-step handoff integrity, error handling, performance signals, architecture implications, edge cases.'
};
// ★ Build test requirements section (the evaluation baseline)
const testReqs = step.test_requirements;
let testReqSection = '';
if (testReqs) {
testReqSection = `
TEST REQUIREMENTS (Acceptance Criteria — use these as the PRIMARY evaluation baseline):
Pass Criteria: ${testReqs.pass_criteria}
Expected Outputs: ${(testReqs.expected_outputs || []).join(', ') || 'not specified'}
Content Signals (patterns that indicate success): ${(testReqs.content_signals || []).join(', ') || 'not specified'}
Quality Thresholds: ${(testReqs.quality_thresholds || []).join(', ') || 'not specified'}
Fail Signals (patterns that indicate failure): ${(testReqs.fail_signals || []).join(', ') || 'not specified'}
Handoff Contract (what next step needs): ${testReqs.handoff_contract || 'not specified'}
IMPORTANT: Score quality_score based on how well the actual output matches these test requirements.
- 90-100: All pass_criteria met, all expected_outputs present, content_signals found, no fail_signals
- 70-89: Most criteria met, minor gaps
- 50-69: Partial match, significant gaps
- 0-49: Fail — fail_signals present or pass_criteria not met`;
} else {
testReqSection = `
NOTE: No pre-generated test requirements for this step. Evaluate based on general quality signals and workflow context.`;
}
const analysisPrompt = `PURPOSE: Evaluate workflow step "${step.name}" (step ${stepIdx + 1}/${state.steps.length}) against its acceptance criteria. Judge whether the command execution met the pre-defined test requirements.
WORKFLOW CONTEXT:
Name: ${state.workflow_name}
Goal: ${state.workflow_context}
Step Chain:
${stepChainContext}
CURRENT STEP:
Name: ${step.name}
Type: ${step.type}
Command: ${step.command}
${step.success_criteria ? `Success Criteria: ${step.success_criteria}` : ''}
${testReqSection}
EXECUTION RESULT:
${execSummary}
${handoffContext}
STEP ARTIFACTS:
${artifactSummary}
ANALYSIS DEPTH: ${state.analysis_depth}
${depthInstructions[state.analysis_depth]}
TASK:
1. **Requirement Matching**: Compare actual output against test requirements (pass_criteria, expected_outputs, content_signals)
2. **Fail Signal Detection**: Check for any fail_signals in the output
3. **Handoff Contract Verification**: Does the output satisfy handoff_contract for the next step?
4. **Gap Analysis**: What's missing between actual output and requirements?
5. **Quality Score**: Rate 0-100 based on requirement fulfillment (NOT general quality)
EXPECTED OUTPUT (strict JSON, no markdown):
{
"quality_score": <0-100>,
"requirement_match": {
"pass": <true|false>,
"criteria_met": ["<which pass_criteria were satisfied>"],
"criteria_missed": ["<which pass_criteria were NOT satisfied>"],
"expected_outputs_found": ["<expected files that exist>"],
"expected_outputs_missing": ["<expected files that are absent>"],
"content_signals_found": ["<success patterns detected in output>"],
"content_signals_missing": ["<success patterns NOT found>"],
"fail_signals_detected": ["<failure patterns found, if any>"]
},
"execution_assessment": {
"success": <true|false>,
"completeness": "<complete|partial|failed>",
"notes": "<brief assessment>"
},
"artifact_assessment": {
"count": <number>,
"quality": "<high|medium|low>",
"key_outputs": ["<main output 1>", "<main output 2>"],
"missing_outputs": ["<expected but missing>"]
},
"handoff_assessment": {
"ready": <true|false>,
"contract_satisfied": <true|false|null>,
"next_step_compatible": <true|false|null>,
"handoff_notes": "<what next step should know>"
},
"issues": [
{ "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
],
"optimization_opportunities": [
{ "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
],
"step_summary": "<1-2 sentence summary for process log>"
}
CONSTRAINTS: Be specific, reference artifact content where possible, score against requirements not general quality, output ONLY JSON`;
```
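Since the prompt instructs the model to score strictly against the rubric bands, the same bands can be recomputed locally as a plausibility check on the returned `quality_score`. A rough sketch, assuming the `requirement_match` shape from the expected output above; the exact thresholds are an interpretation of the rubric, not part of the spec:

```javascript
// Derive a coarse score band from a requirement_match object, mirroring
// the rubric given to the analysis model. Intended only as a cross-check
// against the model's self-reported quality_score.
function rubricBand(reqMatch) {
  const met = (reqMatch.criteria_met || []).length;
  const missed = (reqMatch.criteria_missed || []).length;
  const failSignals = (reqMatch.fail_signals_detected || []).length;
  const missingOutputs = (reqMatch.expected_outputs_missing || []).length;
  if (failSignals > 0 || (met === 0 && missed > 0)) return '0-49';
  if (missed === 0 && missingOutputs === 0) return '90-100';
  if (missed <= 1 && missingOutputs <= 1) return '70-89';
  return '50-69';
}
```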
### Step 3.4: Execute via ccw cli Gemini with Resume
```javascript
function escapeForShell(str) {
  // Escape for embedding in a double-quoted shell argument.
  // Backslashes must be escaped first, then double quotes, $, and backticks.
  return str.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\$/g, '\\$').replace(/`/g, '\\`');
}
// Build CLI command with optional resume
let cliCommand = `ccw cli -p "${escapeForShell(analysisPrompt)}" --tool gemini --mode analysis`;
// Resume from previous step's analysis session (maintains context chain)
if (state.analysis_session_id) {
cliCommand += ` --resume ${state.analysis_session_id}`;
}
Bash({
command: cliCommand,
run_in_background: true,
timeout: 300000 // 5 minutes
});
// STOP — wait for hook callback
```
### Step 3.5: Parse Results and Update Process Log
After CLI completes:
```javascript
const rawOutput = /* CLI output from callback */;
// Extract session ID from CLI output for resume chain
const sessionIdMatch = rawOutput.match(/\[CCW_EXEC_ID=([^\]]+)\]/);
if (sessionIdMatch) {
state.analysis_session_id = sessionIdMatch[1];
}
// Parse JSON
const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
let analysis;
if (jsonMatch) {
try {
analysis = JSON.parse(jsonMatch[0]);
} catch (e) {
// Fallback: extract score heuristically
const scoreMatch = rawOutput.match(/"quality_score"\s*:\s*(\d+)/);
analysis = {
quality_score: scoreMatch ? parseInt(scoreMatch[1]) : 50,
execution_assessment: { success: step.execution.success, completeness: 'unknown', notes: 'Parse failed' },
artifact_assessment: { count: manifest.artifacts.length, quality: 'unknown', key_outputs: [], missing_outputs: [] },
handoff_assessment: { ready: true, next_step_compatible: null, handoff_notes: '' },
issues: [{ severity: 'low', description: 'Analysis output parsing failed', suggestion: 'Review raw output' }],
optimization_opportunities: [],
step_summary: 'Analysis parsing failed — raw output saved for manual review'
};
}
} else {
analysis = {
quality_score: 50,
step_summary: 'No structured analysis output received'
};
}
// Write step analysis file
const reqMatch = analysis.requirement_match;
const reqMatchSection = reqMatch ? `
## Requirement Match — ${reqMatch.pass ? 'PASS ✓' : 'FAIL ✗'}
### Criteria Met
${(reqMatch.criteria_met || []).map(c => `- ✓ ${c}`).join('\n') || '- None'}
### Criteria Missed
${(reqMatch.criteria_missed || []).map(c => `- ✗ ${c}`).join('\n') || '- None'}
### Expected Outputs
- Found: ${(reqMatch.expected_outputs_found || []).join(', ') || 'None'}
- Missing: ${(reqMatch.expected_outputs_missing || []).join(', ') || 'None'}
### Content Signals
- Detected: ${(reqMatch.content_signals_found || []).join(', ') || 'None'}
- Missing: ${(reqMatch.content_signals_missing || []).join(', ') || 'None'}
### Fail Signals
${(reqMatch.fail_signals_detected || []).length > 0
? (reqMatch.fail_signals_detected || []).map(f => `- ⚠ ${f}`).join('\n')
: '- None detected'}
` : '';
const stepAnalysisReport = `# Step ${stepIdx + 1} Analysis: ${step.name}
**Quality Score**: ${analysis.quality_score}/100
**Requirement Match**: ${reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A (no test requirements)'}
**Date**: ${new Date().toISOString()}
${reqMatchSection}
## Execution
- Success: ${analysis.execution_assessment?.success}
- Completeness: ${analysis.execution_assessment?.completeness}
- Notes: ${analysis.execution_assessment?.notes}
## Artifacts
- Count: ${analysis.artifact_assessment?.count}
- Quality: ${analysis.artifact_assessment?.quality}
- Key Outputs: ${analysis.artifact_assessment?.key_outputs?.join(', ') || 'N/A'}
- Missing: ${analysis.artifact_assessment?.missing_outputs?.join(', ') || 'None'}
## Handoff Readiness
- Ready: ${analysis.handoff_assessment?.ready}
- Contract Satisfied: ${analysis.handoff_assessment?.contract_satisfied}
- Next Step Compatible: ${analysis.handoff_assessment?.next_step_compatible}
- Notes: ${analysis.handoff_assessment?.handoff_notes}
## Issues
${(analysis.issues || []).map(i => `- [${i.severity}] ${i.description} → ${i.suggestion}`).join('\n') || 'None'}
## Optimization Opportunities
${(analysis.optimization_opportunities || []).map(o => `- [${o.impact}] ${o.area}: ${o.description}`).join('\n') || 'None'}
`;
Write(`${stepDir}/step-${stepIdx + 1}-analysis.md`, stepAnalysisReport);
// Append to process log
const reqPassStr = reqMatch ? (reqMatch.pass ? 'PASS' : 'FAIL') : 'N/A';
const processLogEntry = `
## Step ${stepIdx + 1}: ${step.name} — Score: ${analysis.quality_score}/100 | Req: ${reqPassStr}
**Command**: \`${step.command}\`
**Result**: ${analysis.execution_assessment?.completeness || 'unknown'} | ${analysis.artifact_assessment?.count || 0} artifacts
**Requirement Match**: ${reqPassStr}${reqMatch ? ` — Met: ${(reqMatch.criteria_met || []).length}, Missed: ${(reqMatch.criteria_missed || []).length}, Fail Signals: ${(reqMatch.fail_signals_detected || []).length}` : ''}
**Summary**: ${analysis.step_summary || 'No summary'}
**Issues**: ${(analysis.issues || []).filter(i => i.severity === 'high').map(i => i.description).join('; ') || 'None critical'}
**Handoff**: ${analysis.handoff_assessment?.contract_satisfied ? 'Contract satisfied' : analysis.handoff_assessment?.handoff_notes || 'Ready'}
---
`;
// Append to process-log.md
const currentLog = Read(`${state.work_dir}/process-log.md`);
Write(`${state.work_dir}/process-log.md`, currentLog + processLogEntry);
// Update state
state.steps[stepIdx].analysis = {
quality_score: analysis.quality_score,
requirement_pass: reqMatch?.pass ?? null,
criteria_met_count: (reqMatch?.criteria_met || []).length,
criteria_missed_count: (reqMatch?.criteria_missed || []).length,
fail_signals_count: (reqMatch?.fail_signals_detected || []).length,
key_outputs: analysis.artifact_assessment?.key_outputs || [],
handoff_notes: analysis.handoff_assessment?.handoff_notes || '',
contract_satisfied: analysis.handoff_assessment?.contract_satisfied ?? null,
issue_count: (analysis.issues || []).length,
high_issues: (analysis.issues || []).filter(i => i.severity === 'high').length,
optimization_count: (analysis.optimization_opportunities || []).length,
analysis_file: `${stepDir}/step-${stepIdx + 1}-analysis.md`
};
state.steps[stepIdx].status = 'analyzed';
state.process_log_entries.push({
step_index: stepIdx,
step_name: step.name,
quality_score: analysis.quality_score,
summary: analysis.step_summary,
timestamp: new Date().toISOString()
});
state.updated_at = new Date().toISOString();
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
```
## Error Handling
| Error | Recovery |
|-------|----------|
| CLI timeout | Retry once without --resume (fresh session) |
| Resume session not found | Start fresh analysis session, continue |
| JSON parse fails | Extract score heuristically, save raw output |
| No output | Default score 50, minimal process log entry |
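The first recovery row (retry once without `--resume`) reduces to rebuilding the CLI command with the session dropped. A sketch of that builder, mirroring the flag layout in Step 3.4:

```javascript
// Build the analysis CLI command, optionally resuming a prior session.
// On retry after a timeout, call with sessionId = null to start fresh.
function buildAnalysisCommand(escapedPrompt, sessionId) {
  let cmd = `ccw cli -p "${escapedPrompt}" --tool gemini --mode analysis`;
  if (sessionId) cmd += ` --resume ${sessionId}`;
  return cmd;
}
```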
## Output
- **Files**: `step-{N}-analysis.md`, updated `process-log.md`
- **State**: `steps[stepIdx].analysis` updated, `analysis_session_id` updated
- **Next**: Phase 2 for next step, or Phase 4 (Synthesize) if all steps done

# Phase 4: Synthesize
> **COMPACT SENTINEL [Phase 4: Synthesize]**
> This phase contains 4 execution steps (Step 4.1 -- 4.4).
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
> Recovery: `Read("phases/04-synthesize.md")`
Synthesize all step analyses into cross-step insights. Evaluates the workflow as a whole: step ordering, handoff quality, redundancy, bottlenecks, and overall coherence.
## Objective
- Read complete process-log.md and all step analyses
- Build synthesis prompt with full workflow context
- Execute via ccw cli Gemini with resume chain
- Generate cross-step optimization insights
- Write synthesis.md
## Execution
### Step 4.1: Gather All Analyses
```javascript
// Read process log
const processLog = Read(`${state.work_dir}/process-log.md`);
// Read all step analysis files
const stepAnalyses = state.steps.map((step, i) => {
const analysisFile = `${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md`;
try {
return { step: step.name, index: i, content: Read(analysisFile) };
} catch {
return { step: step.name, index: i, content: '[Analysis not available]' };
}
});
// Build score summary
const scoreSummary = state.steps.map((s, i) =>
`Step ${i + 1} (${s.name}): ${s.analysis?.quality_score || '-'}/100 | Issues: ${s.analysis?.issue_count || 0} (${s.analysis?.high_issues || 0} high)`
).join('\n');
// Compute aggregate stats
const scores = state.steps.map(s => s.analysis?.quality_score).filter(n => typeof n === 'number'); // typeof check keeps a legitimate score of 0
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
const minScore = scores.length > 0 ? Math.min(...scores) : 0;
const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues || 0), 0);
```
### Step 4.2: Construct Synthesis Prompt
```javascript
// Ref: templates/synthesis-prompt.md
const synthesisPrompt = `PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
WORKFLOW OVERVIEW:
Name: ${state.workflow_name}
Goal: ${state.workflow_context}
Steps: ${state.steps.length}
Average Quality: ${avgScore}/100
Weakest Step: ${minScore}/100
Total Issues: ${totalIssues} (${totalHighIssues} high severity)
SCORE SUMMARY:
${scoreSummary}
COMPLETE PROCESS LOG:
${processLog}
DETAILED STEP ANALYSES:
${stepAnalyses.map(a => `### ${a.step} (Step ${a.index + 1})\n${a.content}`).join('\n\n---\n\n')}
TASK:
1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
5. **Step Ordering**: Would reordering steps improve outcomes?
6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
7. **Per-Step Optimization**: Top 3 improvements per underperforming step
8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
EXPECTED OUTPUT (strict JSON, no markdown):
{
"workflow_score": <0-100>,
"coherence": {
"score": <0-100>,
"assessment": "<logical flow evaluation>",
"gaps": ["<missing step or transition>"]
},
"handoff_quality": {
"score": <0-100>,
"issues": [
{ "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
]
},
"redundancy": {
"found": <true|false>,
"items": [
{ "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
]
},
"bottlenecks": [
{ "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
],
"ordering_suggestions": [
{ "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
],
"per_step_improvements": [
{ "step": "<step name>", "improvements": [
{ "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
]}
],
"workflow_improvements": [
{ "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
],
"summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
}
CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON`;
```
### Step 4.3: Execute via ccw cli Gemini with Resume
```javascript
function escapeForShell(str) {
// Escape backslashes first so the later replacements are not themselves double-escaped
return str.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\$/g, '\\$').replace(/`/g, '\\`');
}
let cliCommand = `ccw cli -p "${escapeForShell(synthesisPrompt)}" --tool gemini --mode analysis`;
// Resume from the last step's analysis session
if (state.analysis_session_id) {
cliCommand += ` --resume ${state.analysis_session_id}`;
}
Bash({
command: cliCommand,
run_in_background: true,
timeout: 300000
});
// STOP — wait for hook callback
```
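The escaping rules above can be exercised in isolation. This is a minimal, self-contained sketch of the helper, assuming the prompt is interpolated into a double-quoted shell argument (backslashes must be escaped first so the quote, dollar, and backtick escapes are not doubled):

```javascript
function escapeForShell(str) {
  return str
    .replace(/\\/g, '\\\\') // backslash first, before any escape sequences are introduced
    .replace(/"/g, '\\"')
    .replace(/\$/g, '\\$')
    .replace(/`/g, '\\`');
}

const prompt = 'Check "score" of $HOME and `pwd`';
// Every shell-special character is now safe inside a double-quoted argument
console.log(escapeForShell(prompt));
```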
### Step 4.4: Parse Results and Write Synthesis
After CLI completes:
```javascript
const rawOutput = /* CLI output from callback */;
const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
let synthesis;
if (jsonMatch) {
try {
synthesis = JSON.parse(jsonMatch[0]);
} catch {
synthesis = {
workflow_score: avgScore,
summary: 'Synthesis parsing failed — individual step analyses available',
workflow_improvements: [],
per_step_improvements: [],
bottlenecks: [],
handoff_quality: { score: 0, issues: [] },
coherence: { score: 0, assessment: 'Parse error' },
redundancy: { found: false, items: [] },
ordering_suggestions: []
};
}
} else {
synthesis = {
workflow_score: avgScore,
summary: 'No synthesis output received',
workflow_improvements: [],
per_step_improvements: []
};
}
// Write synthesis report
const synthesisReport = `# Workflow Synthesis
**Workflow Score**: ${synthesis.workflow_score}/100
**Date**: ${new Date().toISOString()}
## Executive Summary
${synthesis.summary}
## Coherence (${synthesis.coherence?.score ?? '-'}/100)
${synthesis.coherence?.assessment || 'N/A'}
${(synthesis.coherence?.gaps || []).length > 0 ? '\n### Gaps\n' + synthesis.coherence.gaps.map(g => `- ${g}`).join('\n') : ''}
## Handoff Quality (${synthesis.handoff_quality?.score ?? '-'}/100)
${(synthesis.handoff_quality?.issues || []).map(i =>
`- **${i.from_step} → ${i.to_step}**: ${i.issue}\n  Fix: ${i.fix}`
).join('\n') || 'No handoff issues'}
## Redundancy
${synthesis.redundancy?.found ? (synthesis.redundancy.items || []).map(r =>
`- Steps ${r.steps.join(', ')}: ${r.description}. Recommendation: ${r.recommendation}`
).join('\n') : 'No redundancy detected'}
## Bottlenecks
${(synthesis.bottlenecks || []).map(b =>
`- **${b.step}** [${b.impact}]: ${b.reason}\n Fix: ${b.fix}`
).join('\n') || 'No bottlenecks'}
## Ordering Suggestions
${(synthesis.ordering_suggestions || []).map(o =>
`- Current: ${o.current}\n Proposed: ${o.proposed}\n Rationale: ${o.rationale}`
).join('\n') || 'Current ordering is optimal'}
## Per-Step Improvements
${(synthesis.per_step_improvements || []).map(s =>
`### ${s.step}\n` + (s.improvements || []).map(i =>
`- [${i.priority}] ${i.description}\n  Rationale: ${i.rationale}`
).join('\n')
).join('\n\n') || 'No per-step improvements'}
## Workflow-Level Improvements
${(synthesis.workflow_improvements || []).map((w, i) =>
`### ${i + 1}. [${w.priority}] ${w.description}\n- Category: ${w.category}\n- Rationale: ${w.rationale}\n- Affected: ${(w.affected_steps || []).join(', ')}`
).join('\n\n') || 'No workflow-level improvements'}
`;
Write(`${state.work_dir}/synthesis.md`, synthesisReport);
// Update state
state.synthesis = {
workflow_score: synthesis.workflow_score,
summary: synthesis.summary,
improvement_count: (synthesis.workflow_improvements || []).length +
(synthesis.per_step_improvements || []).reduce((sum, s) => sum + (s.improvements || []).length, 0),
high_priority_count: (synthesis.workflow_improvements || []).filter(w => w.priority === 'high').length,
bottleneck_count: (synthesis.bottlenecks || []).length,
handoff_issue_count: (synthesis.handoff_quality?.issues || []).length,
synthesis_file: `${state.work_dir}/synthesis.md`,
raw_data: synthesis
};
state.updated_at = new Date().toISOString();
Write(`${state.work_dir}/workflow-state.json`, JSON.stringify(state, null, 2));
```
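The extract-then-parse pattern above can be factored into a small helper. A hedged sketch (the `fallback` parameter is illustrative, not part of the original phase code): pull the first `{...}` span out of mixed CLI output, and return the fallback object when extraction or parsing fails.

```javascript
// Extract the outermost {...} span from noisy CLI output and parse it,
// falling back to a caller-supplied default on any failure.
function parseJsonLoose(rawOutput, fallback) {
  const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
  if (!jsonMatch) return fallback;
  try {
    return JSON.parse(jsonMatch[0]);
  } catch {
    return fallback;
  }
}

const out = parseJsonLoose('log noise... {"workflow_score": 87} trailing', { workflow_score: 0 });
console.log(out.workflow_score); // 87
```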
## Error Handling
| Error | Recovery |
|-------|----------|
| CLI timeout | Generate synthesis from individual step analyses only (no cross-step) |
| Resume fails | Start fresh analysis session |
| JSON parse fails | Use step-level data to construct minimal synthesis |
## Output
- **Files**: `synthesis.md`
- **State**: `state.synthesis` updated
- **Next**: Phase 5 (Optimization Report)

# Phase 5: Optimization Report
> **COMPACT SENTINEL [Phase 5: Report]**
> This phase contains 4 execution steps (Steps 5.1 to 5.4).
> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
> Recovery: `Read("phases/05-optimize-report.md")`
Generate the final optimization report and optionally apply high-priority fixes.
## Objective
- Read complete state, process log, synthesis
- Generate structured final report
- Optionally apply auto-fix (if enabled)
- Write final-report.md
- Display summary to user
## Execution
### Step 5.1: Read Complete State
```javascript
// workDir is assumed in scope from the orchestrator; `state` cannot reference itself before it is loaded
const state = JSON.parse(Read(`${workDir}/workflow-state.json`));
const processLog = Read(`${state.work_dir}/process-log.md`);
const synthesis = state.synthesis;
state.status = 'completed';
state.updated_at = new Date().toISOString();
```
### Step 5.2: Generate Report
```javascript
// Compute stats
const scores = state.steps.map(s => s.analysis?.quality_score).filter(Boolean);
const avgScore = scores.length > 0 ? Math.round(scores.reduce((a, b) => a + b, 0) / scores.length) : 0;
const minStep = state.steps.reduce((min, s) =>
(s.analysis?.quality_score ?? 100) < (min.analysis?.quality_score ?? 100) ? s : min
, state.steps[0]);
const totalIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.issue_count || 0), 0);
const totalHighIssues = state.steps.reduce((sum, s) => sum + (s.analysis?.high_issues || 0), 0);
// Step quality table (with requirement match)
const stepTable = state.steps.map((s, i) => {
const reqPass = s.analysis?.requirement_pass;
const reqStr = reqPass === true ? 'PASS' : reqPass === false ? 'FAIL' : '-';
return `| ${i + 1} | ${s.name} | ${s.type} | ${s.execution?.success ? 'OK' : 'FAIL'} | ${reqStr} | ${s.analysis?.quality_score ?? '-'} | ${s.analysis?.issue_count || 0} | ${s.analysis?.high_issues || 0} |`;
}).join('\n');
// Collect all improvements (workflow-level + per-step)
const allImprovements = [];
if (synthesis?.raw_data?.workflow_improvements) {
synthesis.raw_data.workflow_improvements.forEach(w => {
allImprovements.push({
scope: 'workflow',
priority: w.priority,
description: w.description,
rationale: w.rationale,
category: w.category,
affected: w.affected_steps || []
});
});
}
if (synthesis?.raw_data?.per_step_improvements) {
synthesis.raw_data.per_step_improvements.forEach(s => {
(s.improvements || []).forEach(imp => {
allImprovements.push({
scope: s.step,
priority: imp.priority,
description: imp.description,
rationale: imp.rationale,
category: 'step',
affected: [s.step]
});
});
});
}
// Sort by priority
const priorityOrder = { high: 0, medium: 1, low: 2 };
allImprovements.sort((a, b) => (priorityOrder[a.priority] || 2) - (priorityOrder[b.priority] || 2));
const report = `# Workflow Tune — Final Report
## Summary
| Field | Value |
|-------|-------|
| **Workflow** | ${state.workflow_name} |
| **Goal** | ${state.workflow_context} |
| **Steps** | ${state.steps.length} |
| **Workflow Score** | ${synthesis?.workflow_score || avgScore}/100 |
| **Average Step Quality** | ${avgScore}/100 |
| **Weakest Step** | ${minStep.name} (${minStep.analysis?.quality_score ?? '-'}/100) |
| **Total Issues** | ${totalIssues} (${totalHighIssues} high severity) |
| **Analysis Depth** | ${state.analysis_depth} |
| **Started** | ${state.started_at} |
| **Completed** | ${state.updated_at} |
## Step Quality Matrix
| # | Step | Type | Exec | Req Match | Quality | Issues | High |
|---|------|------|------|-----------|---------|--------|------|
${stepTable}
## Workflow Flow Assessment
### Coherence: ${synthesis?.raw_data?.coherence?.score ?? '-'}/100
${synthesis?.raw_data?.coherence?.assessment || 'Not evaluated'}
### Handoff Quality: ${synthesis?.raw_data?.handoff_quality?.score ?? '-'}/100
${(synthesis?.raw_data?.handoff_quality?.issues || []).map(i =>
`- **${i.from_step} → ${i.to_step}**: ${i.issue}`
).join('\n') || 'No handoff issues'}
### Bottlenecks
${(synthesis?.raw_data?.bottlenecks || []).map(b =>
`- **${b.step}** [${b.impact}]: ${b.reason}`
).join('\n') || 'No bottlenecks identified'}
## Optimization Recommendations
### Priority: HIGH
${allImprovements.filter(i => i.priority === 'high').map((i, idx) =>
`${idx + 1}. **[${i.scope}]** ${i.description}\n - Rationale: ${i.rationale}\n - Affected: ${i.affected.join(', ')}`
).join('\n') || 'None'}
### Priority: MEDIUM
${allImprovements.filter(i => i.priority === 'medium').map((i, idx) =>
`${idx + 1}. **[${i.scope}]** ${i.description}\n - Rationale: ${i.rationale}`
).join('\n') || 'None'}
### Priority: LOW
${allImprovements.filter(i => i.priority === 'low').map((i, idx) =>
`${idx + 1}. **[${i.scope}]** ${i.description}`
).join('\n') || 'None'}
## Process Documentation
Full process log: \`${state.work_dir}/process-log.md\`
Synthesis: \`${state.work_dir}/synthesis.md\`
### Per-Step Analysis Files
| Step | Analysis File |
|------|---------------|
${state.steps.map((s, i) =>
`| ${s.name} | \`${state.work_dir}/steps/step-${i + 1}/step-${i + 1}-analysis.md\` |`
).join('\n')}
## Artifact Locations
| Path | Description |
|------|-------------|
| \`${state.work_dir}/workflow-state.json\` | Complete state |
| \`${state.work_dir}/process-log.md\` | Accumulated process log |
| \`${state.work_dir}/synthesis.md\` | Cross-step synthesis |
| \`${state.work_dir}/final-report.md\` | This report |
`;
Write(`${state.work_dir}/final-report.md`, report);
```
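The priority ordering used above can be sketched standalone. Unknown or missing priorities fall to the end (treated as low); the sample improvement objects here are illustrative only.

```javascript
// Sort improvements so high-priority items come first.
const priorityOrder = { high: 0, medium: 1, low: 2 };
const improvements = [
  { priority: 'low', description: 'rename artifact dir' },
  { priority: 'high', description: 'fix handoff format' },
  { priority: 'medium', description: 'trim redundant step' }
];
improvements.sort((a, b) => (priorityOrder[a.priority] ?? 2) - (priorityOrder[b.priority] ?? 2));
console.log(improvements.map(i => i.priority)); // order: high, medium, low
```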
### Step 5.3: Optional Auto-Fix
```javascript
if (state.auto_fix && allImprovements.filter(i => i.priority === 'high').length > 0) {
const highPriorityFixes = allImprovements.filter(i => i.priority === 'high');
// ★ Safety: confirm with user before applying auto-fixes
const fixList = highPriorityFixes.map((f, i) =>
`${i + 1}. [${f.scope}] ${f.description}\n Affected: ${f.affected.join(', ')}`
).join('\n');
const autoFixConfirm = AskUserQuestion({
questions: [{
question: `The following ${highPriorityFixes.length} high-priority optimizations will be applied automatically:\n\n${fixList}\n\nApply them?`,
header: "Auto-Fix Confirmation",
multiSelect: false,
options: [
{ label: "Apply", description: "Apply the high-priority fixes listed above" },
{ label: "Skip", description: "Skip auto-fix and keep the report only" }
]
}]
});
if (autoFixConfirm["Auto-Fix Confirmation"].startsWith("Skip")) {
// Skip auto-fix, just log it
state.auto_fix_skipped = true;
} else {
Agent({
subagent_type: 'general-purpose',
run_in_background: false,
description: 'Apply high-priority workflow optimizations',
prompt: `## Task: Apply High-Priority Workflow Optimizations
You are applying the top optimization suggestions from a workflow analysis.
## Improvements to Apply (HIGH priority only)
${highPriorityFixes.map((f, i) =>
`${i + 1}. [${f.scope}] ${f.description}\n Rationale: ${f.rationale}\n Affected: ${f.affected.join(', ')}`
).join('\n')}
## Workflow Steps
${state.steps.map((s, i) => `${i + 1}. ${s.name} (${s.type}): ${s.command}`).join('\n')}
## Rules
1. Read each affected file BEFORE modifying
2. Apply ONLY the high-priority suggestions
3. Preserve existing code style
4. Write a changes summary to: ${state.work_dir}/auto-fix-changes.md
`
});
} // end Apply branch
}
```
### Step 5.4: Display Summary
Output to user:
```
Workflow Tune Complete!
Workflow: {name}
Steps: {count}
Workflow Score: {score}/100
Average Step Quality: {avgScore}/100
Weakest Step: {name} ({score}/100)
Step Scores: {step1}={score1} → {step2}={score2} → ... → {stepN}={scoreN}
Issues: {total} ({high} high priority)
Improvements: {count} ({highCount} high priority)
Full report: {workDir}/final-report.md
Process log: {workDir}/process-log.md
```
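The "Step Scores" line in the summary above could be rendered from state as follows. A hedged sketch: field names follow `workflow-state.json` as used elsewhere in this skill, and the demo data is invented for illustration.

```javascript
// Render one "name=score" entry per step, joined with arrows.
// Steps without an analysis score render as "-".
function formatStepScores(steps) {
  return steps
    .map(s => `${s.name}=${s.analysis?.quality_score ?? '-'}`)
    .join(' → ');
}

const demo = [
  { name: 'plan', analysis: { quality_score: 88 } },
  { name: 'exec', analysis: { quality_score: 74 } },
  { name: 'verify', analysis: {} }
];
console.log(formatStepScores(demo)); // plan=88 → exec=74 → verify=-
```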
## Output
- **Files**: `final-report.md`, optionally `auto-fix-changes.md`
- **State**: `status = completed`
- **Next**: Workflow complete. Return control to user.

# Workflow Evaluation Criteria
Evaluation criteria for workflow tuning, referenced by Phase 03 (Analyze Step) and Phase 04 (Synthesize).
## Per-Step Dimensions
| Dimension | Description |
|-----------|-------------|
| Execution Success | Did the command run successfully with the expected exit code? |
| Output Completeness | Are all artifacts present and the expected files generated? |
| Artifact Quality | Content quality of the artifacts: non-empty, well-formed, meaningful |
| Handoff Readiness | Do the artifacts satisfy the next step's input requirements and format? |
## Per-Step Scoring Guide
| Range | Level | Description |
|-------|-------|-------------|
| 90-100 | Excellent | Flawless execution, high-quality artifacts, directly consumable downstream |
| 80-89 | Good | Successful execution, mostly complete artifacts, only minor tweaks needed for handoff |
| 70-79 | Adequate | Successful execution but artifacts have gaps or middling quality |
| 60-69 | Needs Work | Partial failure or poor artifact quality; handoff is difficult |
| 0-59 | Poor | Execution failed or artifacts are unusable |
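The score bands above can be expressed as a simple lookup, e.g. the following sketch (the function name is illustrative, not part of the phase code):

```javascript
// Map a 0-100 quality score to the level names in the scoring guide.
function scoreLevel(score) {
  if (score >= 90) return 'Excellent';
  if (score >= 80) return 'Good';
  if (score >= 70) return 'Adequate';
  if (score >= 60) return 'Needs Work';
  return 'Poor';
}

console.log(scoreLevel(85)); // Good
```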
## Workflow-Level Dimensions
| Dimension | Description |
|-----------|-------------|
| Coherence | Is the step ordering logical, and does it form a complete pipeline? |
| Handoff Quality | Does data flow smoothly between steps with matching formats? |
| Redundancy | Is any work overlapping or duplicated across steps? |
| Efficiency | Is the overall pipeline efficient, with no unnecessary steps? |
| Completeness | Are all required stages covered, with nothing missing? |
## Analysis Depth Profiles
### Quick
- 3-5 bullet points per step
- Focus: execution success, output completeness, obvious issues
- Cross-step: basic handoff checks
### Standard
- Detailed assessment of each step
- Focus: execution quality, output completeness, artifact quality, handoff readiness, potential issues
- Cross-step: handoff quality, redundancy detection, bottleneck identification
### Deep
- In-depth review of each step
- Focus: execution quality, output correctness, structural quality, handoff completeness, error handling, performance signals, architectural impact, edge cases
- Cross-step: comprehensive pipeline optimization, reordering suggestions, missing-step detection, architectural improvements
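The depth profiles can be carried as a lookup that Phase 03 uses to fill `${depthInstructions}`. A hedged sketch; the exact wording below is illustrative, not the original mapping:

```javascript
// One instruction string per analysis depth profile.
const depthInstructions = {
  quick: 'Give 3-5 bullet points per step; flag only obvious issues and do basic handoff checks.',
  standard: 'Assess execution quality, output completeness, artifact quality, and handoff readiness; detect redundancy and bottlenecks across steps.',
  deep: 'Review each step in depth, including error handling, performance signals, architectural impact, and edge cases; propose reordering and missing steps.'
};

console.log(depthInstructions['standard']);
```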
## Issue Severity Guide
| Severity | Description | Example |
|----------|-------------|---------|
| High | Blocks the pipeline or produces wrong results | Step execution failure, incompatible artifact format, loss of critical data |
| Medium | Degrades quality without blocking | Incomplete artifacts, handoff needs manual adjustment, redundant steps |
| Low | Improvable but does not affect function | Inconsistent output formatting, step ordering that could be optimized |

# Step Analysis Prompt Template
Phase 03 uses this template to construct the ccw cli prompt that has Gemini analyze a single step's execution result and artifact quality.
## Template
```
PURPOSE: Analyze the output of workflow step "${stepName}" (step ${stepIndex}/${totalSteps}) to assess quality, identify issues, and evaluate handoff readiness for the next step.
WORKFLOW CONTEXT:
Name: ${workflowName}
Goal: ${workflowContext}
Step Chain:
${stepChainContext}
CURRENT STEP:
Name: ${stepName}
Type: ${stepType}
Command: ${stepCommand}
${successCriteria}
EXECUTION RESULT:
${execSummary}
${handoffContext}
STEP ARTIFACTS:
${artifactSummary}
ANALYSIS DEPTH: ${analysisDepth}
${depthInstructions}
TASK:
1. Assess step execution quality (did it succeed? complete output?)
2. Evaluate artifact quality (content correctness, completeness, format)
3. Check handoff readiness (can the next step consume this output?)
4. Identify issues, risks, or optimization opportunities
5. Rate overall step quality 0-100
EXPECTED OUTPUT (strict JSON, no markdown):
{
"quality_score": <0-100>,
"execution_assessment": {
"success": <true|false>,
"completeness": "<complete|partial|failed>",
"notes": "<brief assessment>"
},
"artifact_assessment": {
"count": <number>,
"quality": "<high|medium|low>",
"key_outputs": ["<main output 1>", "<main output 2>"],
"missing_outputs": ["<expected but missing>"]
},
"handoff_assessment": {
"ready": <true|false>,
"next_step_compatible": <true|false|null>,
"handoff_notes": "<what next step should know>"
},
"issues": [
{ "severity": "high|medium|low", "description": "<issue>", "suggestion": "<fix>" }
],
"optimization_opportunities": [
{ "area": "<area>", "description": "<opportunity>", "impact": "high|medium|low" }
],
"step_summary": "<1-2 sentence summary for process log>"
}
CONSTRAINTS: Be specific, reference artifact content where possible, output ONLY JSON
```
## Variable Substitution
| Variable | Source | Description |
|----------|--------|-------------|
| `${stepName}` | workflow-state.json | Name of the current step |
| `${stepIndex}` | orchestrator loop | Current step index (1-based) |
| `${totalSteps}` | workflow-state.json | Total number of steps |
| `${workflowName}` | workflow-state.json | Workflow name |
| `${workflowContext}` | workflow-state.json | Workflow goal description |
| `${stepChainContext}` | Phase 03 builds | Status overview of all steps |
| `${stepType}` | workflow-state.json | Step type (skill/ccw-cli/command) |
| `${stepCommand}` | workflow-state.json | Step command |
| `${successCriteria}` | workflow-state.json | Success criteria (if any) |
| `${execSummary}` | Phase 03 builds | Execution result summary |
| `${handoffContext}` | Phase 03 builds | Output summary of the previous step (omitted for the first step) |
| `${artifactSummary}` | Phase 03 builds | Summary of artifact contents |
| `${analysisDepth}` | workflow-state.json | Analysis depth (quick/standard/deep) |
| `${depthInstructions}` | Phase 03 maps | Analysis instructions for the chosen depth |
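The substitution in the table above can be sketched as a generic placeholder fill. This is illustrative only; the actual Phase 03 code builds the prompt with a JavaScript template literal rather than a replace pass:

```javascript
// Replace each ${name} placeholder with a value from a vars object.
// Unknown placeholders are left untouched so gaps stay visible.
function fillTemplate(template, vars) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) =>
    name in vars ? String(vars[name]) : match
  );
}

const filled = fillTemplate('Step ${stepIndex}/${totalSteps}: ${stepName}', {
  stepIndex: 2, totalSteps: 5, stepName: 'analyze'
});
console.log(filled); // Step 2/5: analyze
```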

# Synthesis Prompt Template
Phase 04 uses this template to construct the ccw cli prompt that has Gemini synthesize all step analyses into cross-step optimization suggestions.
## Template
```
PURPOSE: Synthesize all step analyses into a holistic workflow optimization assessment. Evaluate cross-step concerns: ordering, handoff quality, redundancy, bottlenecks, and overall workflow coherence.
WORKFLOW OVERVIEW:
Name: ${workflowName}
Goal: ${workflowContext}
Steps: ${stepCount}
Average Quality: ${avgScore}/100
Weakest Step: ${minScore}/100
Total Issues: ${totalIssues} (${totalHighIssues} high severity)
SCORE SUMMARY:
${scoreSummary}
COMPLETE PROCESS LOG:
${processLog}
DETAILED STEP ANALYSES:
${stepAnalyses}
TASK:
1. **Workflow Coherence**: Do steps form a logical sequence? Any missing steps?
2. **Handoff Quality**: Are step outputs well-consumed by subsequent steps? Data format mismatches?
3. **Redundancy Detection**: Do any steps duplicate work? Overlapping concerns?
4. **Bottleneck Identification**: Which steps are bottlenecks (lowest quality, longest duration)?
5. **Step Ordering**: Would reordering steps improve outcomes?
6. **Missing Steps**: Are there gaps in the pipeline that need additional steps?
7. **Per-Step Optimization**: Top 3 improvements per underperforming step
8. **Workflow-Level Optimization**: Structural changes to the overall pipeline
EXPECTED OUTPUT (strict JSON, no markdown):
{
"workflow_score": <0-100>,
"coherence": {
"score": <0-100>,
"assessment": "<logical flow evaluation>",
"gaps": ["<missing step or transition>"]
},
"handoff_quality": {
"score": <0-100>,
"issues": [
{ "from_step": "<step name>", "to_step": "<step name>", "issue": "<description>", "fix": "<suggestion>" }
]
},
"redundancy": {
"found": <true|false>,
"items": [
{ "steps": ["<step1>", "<step2>"], "description": "<what overlaps>", "recommendation": "<merge or remove>" }
]
},
"bottlenecks": [
{ "step": "<step name>", "reason": "<why it's a bottleneck>", "impact": "high|medium|low", "fix": "<suggestion>" }
],
"ordering_suggestions": [
{ "current": "<current order description>", "proposed": "<new order>", "rationale": "<why>" }
],
"per_step_improvements": [
{ "step": "<step name>", "improvements": [
{ "priority": "high|medium|low", "description": "<what to change>", "rationale": "<why>" }
]}
],
"workflow_improvements": [
{ "priority": "high|medium|low", "category": "structure|handoff|performance|quality", "description": "<change>", "rationale": "<why>", "affected_steps": ["<step names>"] }
],
"summary": "<2-3 sentence executive summary of workflow health and top recommendations>"
}
CONSTRAINTS: Be specific, reference step names and artifact details, output ONLY JSON
```
## Variable Substitution
| Variable | Source | Description |
|----------|--------|-------------|
| `${workflowName}` | workflow-state.json | Workflow name |
| `${workflowContext}` | workflow-state.json | Workflow goal |
| `${stepCount}` | workflow-state.json | Total number of steps |
| `${avgScore}` | Phase 04 computes | Average score across all steps |
| `${minScore}` | Phase 04 computes | Lowest step score |
| `${totalIssues}` | Phase 04 computes | Total issue count |
| `${totalHighIssues}` | Phase 04 computes | High-severity issue count |
| `${scoreSummary}` | Phase 04 builds | One line per step with its score |
| `${processLog}` | process-log.md | Full process log |
| `${stepAnalyses}` | Phase 04 reads | Contents of all step-N-analysis.md files |