diff --git a/.claude/skills/team-brainstorm/roles/coordinator.md b/.claude/skills/team-brainstorm/roles/coordinator.md index 36ed51dd..c063c9d8 100644 --- a/.claude/skills/team-brainstorm/roles/coordinator.md +++ b/.claude/skills/team-brainstorm/roles/coordinator.md @@ -94,7 +94,7 @@ AskUserQuestion({ }) ``` -### Phase 2: Create Team + Spawn Workers +### Phase 2: Create Team + Initialize Session ```javascript TeamCreate({ team_name: teamName }) @@ -135,7 +135,9 @@ const teamSession = { Write(`${sessionFolder}/team-session.json`, JSON.stringify(teamSession, null, 2)) ``` -Spawn workers (see SKILL.md Coordinator Spawn Template). +> ⚠️ Workers are NOT pre-spawned here. +> Workers are spawned per-stage in Phase 4 via Stop-Wait `Task(run_in_background: false)`. +> See SKILL.md Coordinator Spawn Template for worker prompt templates. ### Phase 3: Create Task Chain @@ -212,6 +214,13 @@ TaskUpdate({ taskId: evalId, owner: "evaluator", addBlockedBy: [synthId] }) ### Phase 4: Coordination Loop + Generator-Critic Control +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + | Received Message | Action | |-----------------|--------| | ideator: ideas_ready | Read ideas → team_msg log → TaskUpdate completed → unblock CHALLENGE | diff --git a/.claude/skills/team-frontend/roles/coordinator/role.md b/.claude/skills/team-frontend/roles/coordinator/role.md index 4a6885f4..96148f34 100644 --- a/.claude/skills/team-frontend/roles/coordinator/role.md +++ b/.claude/skills/team-frontend/roles/coordinator/role.md @@ -149,7 +149,7 @@ AskUserQuestion({ }) ``` -### Phase 2: Create Team + Session + Spawn Teammates +### Phase 2: Create Team + Initialize Session ```javascript // Create session directory @@ -183,9 +183,11 @@ Write(`${sessionFolder}/shared-memory.json`, JSON.stringify({ 
industry_context: { industry: industryChoice, config: industry } }, null, 2)) -// Create team and spawn workers +// Create team TeamCreate({ team_name: teamName }) -// → Spawn analyst, architect, developer, qa (see SKILL.md Coordinator Spawn Template) +// ⚠️ Workers are NOT pre-spawned here. +// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false). +// See SKILL.md Coordinator Spawn Template for worker prompt templates. ``` ### Phase 3: Create Task Chain @@ -227,6 +229,13 @@ if (pipeline === 'system') { ### Phase 4: Coordination Loop +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + Receive teammate messages, dispatch based on content. **Before each decision**: `team_msg list` to check recent messages. **After each decision**: `team_msg log` to record. diff --git a/.claude/skills/team-issue/roles/coordinator.md b/.claude/skills/team-issue/roles/coordinator.md index 5d9166b9..fd1df5f0 100644 --- a/.claude/skills/team-issue/roles/coordinator.md +++ b/.claude/skills/team-issue/roles/coordinator.md @@ -136,38 +136,18 @@ function detectMode(issueIds, userMode) { } ``` -### Phase 2: Create Team + Spawn Workers +### Phase 2: Create Team + Initialize Session ```javascript TeamCreate({ team_name: "issue" }) -// Spawn workers based on mode -const workersToSpawn = mode === 'quick' - ? 
['explorer', 'planner', 'integrator', 'implementer'] // No reviewer in quick mode - : ['explorer', 'planner', 'reviewer', 'integrator', 'implementer'] - -for (const workerName of workersToSpawn) { - Task({ - subagent_type: "general-purpose", - team_name: "issue", - name: workerName, - prompt: `你是 team "issue" 的 ${workerName.toUpperCase()}。 -当你收到任务时,调用 Skill(skill="team-issue", args="--role=${workerName}") 执行。 -当前需求: 处理 issue ${issueIds.join(', ')},模式: ${mode} -约束: CLI-first data access, 所有 issue 操作通过 ccw issue 命令 - -## 角色准则(强制) -- 所有输出必须带 [${workerName}] 标识前缀 -- 仅与 coordinator 通信 -- 每次 SendMessage 前,先调用 mcp__ccw-tools__team_msg 记录 - -工作流程: -1. TaskList → 找到分配给你的任务 -2. Skill(skill="team-issue", args="--role=${workerName}") 执行 -3. team_msg log + SendMessage 结果给 coordinator -4. TaskUpdate completed → 检查下一个任务` - }) -} +// ⚠️ Workers are NOT pre-spawned here. +// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false). +// See SKILL.md Coordinator Spawn Template for worker prompt templates. +// +// Worker roles available (spawned on-demand per pipeline stage): +// quick mode: explorer, planner, integrator, implementer +// full mode: explorer, planner, reviewer, integrator, implementer ``` ### Phase 3: Create Task Chain @@ -323,6 +303,13 @@ const marshalId = TaskCreate({ ### Phase 4: Coordination Loop +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + Receive teammate messages, dispatch based on type. 
| Received Message | Action | diff --git a/.claude/skills/team-iterdev/roles/coordinator.md b/.claude/skills/team-iterdev/roles/coordinator.md index d540f582..3d6d8629 100644 --- a/.claude/skills/team-iterdev/roles/coordinator.md +++ b/.claude/skills/team-iterdev/roles/coordinator.md @@ -112,7 +112,9 @@ const teamSession = { Write(`${sessionFolder}/team-session.json`, JSON.stringify(teamSession, null, 2)) ``` -Spawn workers (see SKILL.md Coordinator Spawn Template). +> ⚠️ Workers are NOT pre-spawned here. +> Workers are spawned per-stage in Phase 4 via Stop-Wait `Task(run_in_background: false)`. +> See SKILL.md Coordinator Spawn Template for worker prompt templates. ### Phase 3: Create Task Chain + Update Ledger @@ -161,6 +163,13 @@ TaskUpdate({ taskId: reviewId, owner: "reviewer", addBlockedBy: [devId] }) ### Phase 4: Coordination Loop + GC Control + Ledger Updates +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + | Received Message | Action | |-----------------|--------| | architect: design_ready | Read design → update ledger → unblock DEV | diff --git a/.claude/skills/team-lifecycle-v2/roles/coordinator/commands/monitor.md b/.claude/skills/team-lifecycle-v2/roles/coordinator/commands/monitor.md index 8846d44d..89af8a8e 100644 --- a/.claude/skills/team-lifecycle-v2/roles/coordinator/commands/monitor.md +++ b/.claude/skills/team-lifecycle-v2/roles/coordinator/commands/monitor.md @@ -10,62 +10,66 @@ ## Coordination Loop +> **Design principle**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> Use a synchronous `Task(run_in_background: false)` call as the wait mechanism. +> The worker returning is the stage-completion signal (a natural callback); no sleep polling is needed. + ```javascript -Output("[coordinator] Entering coordination loop...") +Output("[coordinator] Entering coordination loop (Stop-Wait mode)...") -let loopActive = true -let checkpointPending = false +// 
Sequentially execute each task by spawning its worker +const allTasks = TaskList() +const pendingTasks = allTasks.filter(t => t.status !== 'completed' && t.assigned_to !== 'coordinator') -while (loopActive) { - // Load current session state - const session = Read(sessionFile) - const teamState = TeamGet(session.team_id) - const allTasks = teamState.tasks +for (const task of pendingTasks) { + // Check if all dependencies are met + const allDepsMet = (task.dependencies || []).every(depId => { + const dep = TaskGet(depId) + return dep.status === "completed" + }) - // Check for incoming messages - const messages = TeamGetMessages(session.team_id) - - for (const message of messages) { - Output(`[coordinator] Received message: ${message.type} from ${message.sender}`) - - switch (message.type) { - case "task_complete": - handleTaskComplete(message) - break - - case "task_blocked": - handleTaskBlocked(message) - break - - case "discussion_needed": - handleDiscussionNeeded(message) - break - - case "research_complete": - handleResearchComplete(message) - break - - default: - Output(`[coordinator] Unknown message type: ${message.type}`) - } + if (!allDepsMet) { + Output(`[coordinator] Task ${task.task_id} blocked by dependencies, skipping`) + continue } - // Check if all tasks complete - const completedTasks = allTasks.filter(t => t.status === "completed") - const totalTasks = allTasks.length + Output(`[coordinator] Starting task: ${task.task_id} (assigned to: ${task.assigned_to})`) - if (completedTasks.length === totalTasks) { - Output("[coordinator] All tasks completed!") - loopActive = false - break + // Mark as active + TaskUpdate(task.task_id, { + status: "active", + started_at: new Date().toISOString() + }) + + // Spawn worker — blocks until worker returns (Stop-Wait) + Task({ + subagent_type: "general-purpose", + prompt: `Execute task ${task.task_id}: ${task.description} +Assigned role: ${task.assigned_to} +When complete, mark TaskUpdate(${task.task_id}, { status: 
"completed" })`, + run_in_background: false + }) + + // Worker returned — check status + const completedTask = TaskGet(task.task_id) + Output(`[coordinator] Task ${task.task_id} status: ${completedTask.status}`) + + if (completedTask.status === "completed") { + handleTaskComplete({ task_id: task.task_id, output: completedTask.output }) } // Update session progress - session.tasks_completed = completedTasks.length + const session = Read(sessionFile) + const allTasksNow = TaskList() + session.tasks_completed = allTasksNow.filter(t => t.status === "completed").length Write(sessionFile, session) - // Sleep before next iteration - sleep(5000) // 5 seconds + // Check if all tasks complete + const remaining = allTasksNow.filter(t => t.status !== "completed" && t.assigned_to !== 'coordinator') + if (remaining.length === 0) { + Output("[coordinator] All tasks completed!") + break + } } Output("[coordinator] Coordination loop complete") diff --git a/.claude/skills/team-lifecycle-v2/roles/coordinator/role.md b/.claude/skills/team-lifecycle-v2/roles/coordinator/role.md index dc499d2c..317b2689 100644 --- a/.claude/skills/team-lifecycle-v2/roles/coordinator/role.md +++ b/.claude/skills/team-lifecycle-v2/roles/coordinator/role.md @@ -377,9 +377,9 @@ goto Phase2 --- -### Phase 2: Create Team + Spawn Workers +### Phase 2: Create Team + Initialize Session -**Purpose**: Initialize team and spawn worker subagents +**Purpose**: Initialize team and session state ```javascript Output("[coordinator] Phase 2: Team Creation") @@ -430,43 +430,17 @@ const sessionData = { Write(sessionFile, sessionData) Output(`[coordinator] Session file created: ${sessionFile}`) -// Spawn workers conditionally based on pipeline mode -const isFE = ['fe-only', 'fullstack', 'full-lifecycle-fe'].includes(requirements.mode) -const isBE = ['impl-only', 'fullstack', 'full-lifecycle', 'full-lifecycle-fe'].includes(requirements.mode) -const isSpec = ['spec-only', 'full-lifecycle', 
'full-lifecycle-fe'].includes(requirements.mode) - -if (isSpec) { - TeamSpawn({ team_id: teamId, role: "spec-writer", count: 1 }) - Output("[coordinator] Spawned spec-writer") -} - -if (isBE) { - TeamSpawn({ team_id: teamId, role: "implementer", count: 1 }) - Output("[coordinator] Spawned implementer") -} - -if (isFE) { - TeamSpawn({ team_id: teamId, role: "fe-developer", count: 1 }) - Output("[coordinator] Spawned fe-developer") - TeamSpawn({ team_id: teamId, role: "fe-qa", count: 1 }) - Output("[coordinator] Spawned fe-qa") - - // Initialize shared memory for frontend pipeline - const sharedMemoryPath = `${sessionFolder}/shared-memory.json` - Write(sharedMemoryPath, JSON.stringify({ - design_intelligence: {}, - design_token_registry: {}, - component_inventory: [], - style_decisions: [], - qa_history: [], - industry_context: {} - }, null, 2)) - Output("[coordinator] Initialized shared-memory.json for frontend pipeline") -} - -// Always spawn researcher for ambiguity resolution -TeamSpawn({ team_id: teamId, role: "researcher", count: 1 }) -Output("[coordinator] Spawned researcher") +// ⚠️ Workers are NOT pre-spawned here. +// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false). +// See SKILL.md Coordinator Spawn Template for worker prompt templates. 
+// +// Worker roles by mode (spawned on-demand): +// spec-only: spec-writer +// impl-only: implementer +// fe-only: fe-developer, fe-qa +// fullstack: implementer, fe-developer, fe-qa +// full-lifecycle: spec-writer, implementer; full-lifecycle-fe: spec-writer, implementer, fe-developer, fe-qa +// Always available: researcher (for ambiguity resolution) goto Phase3 ``` @@ -495,6 +469,13 @@ goto Phase4 **Purpose**: Monitor task progress and route messages +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + ```javascript Output("[coordinator] Phase 4: Coordination Loop") diff --git a/.claude/skills/team-lifecycle/roles/coordinator.md b/.claude/skills/team-lifecycle/roles/coordinator.md index c873f017..1fb64d9a 100644 --- a/.claude/skills/team-lifecycle/roles/coordinator.md +++ b/.claude/skills/team-lifecycle/roles/coordinator.md @@ -188,15 +188,11 @@ if (isResume) { if (meta) neededRoles.add(meta.owner) }) - // Spawn only needed workers using Phase 2 spawn template (see SKILL.md Coordinator Spawn Template) - // Each worker is spawned with prompt that: - // 1. Identifies their role - // 2. Instructs to call Skill(skill="team-lifecycle", args="--role=") - // 3. Includes session context: taskDescription, sessionFolder, constraints - // 4. Instructs immediate TaskList polling on startup + // Spawn only needed workers using Phase 4 Stop-Wait pattern (see SKILL.md Coordinator Spawn Template) + // Workers are spawned per-stage via Task(run_in_background: false) in Phase 4 coordination loop. + // neededRoles is used to determine which workers will be spawned on-demand. 
neededRoles.forEach(role => { - // → Use SKILL.md Coordinator Spawn Template for each role - // → Worker prompt includes: "Session: ${sessionFolder}", "需求: ${taskDescription}" + // → Worker prompt template in SKILL.md (spawned per-stage in Phase 4, not pre-spawned here) }) // ============================================================ @@ -419,7 +415,7 @@ if (mode === 'impl-only' || mode === 'full-lifecycle') { } ``` -### Phase 2: Create Team + Spawn Workers +### Phase 2: Create Team + Initialize Session ```javascript TeamCreate({ team_name: teamName }) @@ -462,9 +458,11 @@ const teamSession = { Write(`${sessionFolder}/team-session.json`, JSON.stringify(teamSession, null, 2)) ``` -**Conditional spawn based on mode** (see SKILL.md Coordinator Spawn Template for full prompts): +**Workers are NOT pre-spawned here.** Workers are spawned per-stage in Phase 4 via Stop-Wait `Task(run_in_background: false)`. See SKILL.md Coordinator Spawn Template for worker prompt templates. -| Mode | Spawned Workers | +Worker roles by mode (spawned on-demand): + +| Mode | Worker Roles | |------|-----------------| | spec-only | analyst, writer, discussant, reviewer (4) | | impl-only | planner, executor, tester, reviewer (4) | @@ -562,6 +560,13 @@ TaskUpdate({ taskId: planId, owner: "planner", addBlockedBy: [discuss6Id] }) ### Phase 4: Coordination Loop +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + Receive teammate messages and make dispatch decisions. **Before each decision: `team_msg list` to review recent messages. 
After each decision: `team_msg log` to record.** #### Spec Messages diff --git a/.claude/skills/team-quality-assurance/roles/coordinator/commands/monitor.md b/.claude/skills/team-quality-assurance/roles/coordinator/commands/monitor.md index 705d3948..eedf04c1 100644 --- a/.claude/skills/team-quality-assurance/roles/coordinator/commands/monitor.md +++ b/.claude/skills/team-quality-assurance/roles/coordinator/commands/monitor.md @@ -20,8 +20,15 @@ ### Design Principles -> **Model execution has no notion of time.** Busy-waiting while loops that check status are forbidden. -> Use a fixed sleep interval plus a maximum poll count to avoid wasting pointless API calls. +> **Model execution has no notion of time, so polling waits of any kind are forbidden.** +> +> - ❌ Forbidden: `while` loop + `sleep` + status check (busy-waiting wastes API turns) +> - ❌ Forbidden: `Bash(sleep N)` / `Bash(timeout /t N)` as a way to wait +> - ✅ Use: a synchronous `Task()` call (`run_in_background: false`); the call itself is the wait +> - ✅ Use: the worker returning is the stage-completion signal (a natural callback) +> +> **Rationale**: `Task(run_in_background: false)` is a blocking call; the coordinator suspends automatically until the worker returns. +> No sleep, no polling, no message-bus monitoring. The worker's return is the callback. ### Decision Logic @@ -47,14 +54,16 @@ const routingTable = { } ``` -### Wait Strategy Constants +### Stage-Worker Mapping ```javascript -const POLL_INTERVAL_SEC = 300 // check every 5 minutes (test execution can be slow) -const MAX_POLLS_PER_STAGE = 6 // wait at most 6 times per stage (~30 minutes) -const SLEEP_CMD = process.platform === 'win32' - ? 
`timeout /t ${POLL_INTERVAL_SEC} /nobreak >nul 2>&1` - : `sleep ${POLL_INTERVAL_SEC}` +const STAGE_WORKER_MAP = { + 'SCOUT': { role: 'scout', skillArgs: '--role=scout' }, + 'QASTRAT': { role: 'strategist', skillArgs: '--role=strategist' }, + 'QAGEN': { role: 'generator', skillArgs: '--role=generator' }, + 'QARUN': { role: 'executor', skillArgs: '--role=executor' }, + 'QAANA': { role: 'analyst', skillArgs: '--role=analyst' } +} // ★ Unified auto-mode detection: -y/--yes propagated from $ARGUMENTS or ccw const autoYes = /(?:^|\s)(?:-y|--yes)(?=\s|$)/.test(args) @@ -83,94 +92,125 @@ const pipelineTasks = allTasks .sort((a, b) => Number(a.id) - Number(b.id)) ``` -### Step 2: Stage-Driven Execution +### Step 2: Sequential Stage Execution (Stop-Wait) -> **Core change**: the while polling loop is no longer used. Wait for each stage to finish, stage by stage, in pipeline order. -> Each stage: sleep → check messages → confirm task state → handle result → next stage. +> **Core idea**: spawn a worker per stage and block synchronously until it returns. +> Worker returns = stage complete. No sleep, no polling, no message-bus monitoring. ```javascript // Process each stage in dependency order for (const stageTask of pipelineTasks) { - // --- Wait for the current stage to finish --- - let stageComplete = false - let pollCount = 0 + // 1. Extract the stage prefix → determine the worker role + const stagePrefix = stageTask.subject.match(/^([\w-]+)-\d/)?.[1]?.replace(/-L\d$/, '') + const workerConfig = STAGE_WORKER_MAP[stagePrefix] - while (!stageComplete && pollCount < MAX_POLLS_PER_STAGE) { - // ★ Fixed wait: sleep 30s to give the worker time to run - Bash(SLEEP_CMD) - pollCount++ - - // 1. Check the message bus (primary signal source) - const messages = mcp__ccw-tools__team_msg({ - operation: "list", - team: teamName, - last: 5 + if (!workerConfig) { + mcp__ccw-tools__team_msg({ + operation: "log", team: teamName, from: "coordinator", + to: "user", type: "error", + summary: `[coordinator] 未知阶段前缀: ${stagePrefix},跳过` }) - - // 2. Route messages - for (const msg of messages) { - const handler = routingTable[msg.type] - if (!handler) continue - processMessage(msg, handler) - } - - // 3. 
Confirm the task state (fallback) - const currentTask = TaskGet({ taskId: stageTask.id }) - stageComplete = currentTask.status === 'completed' || currentTask.status === 'deleted' + continue } - // --- Stage timeout handling --- - if (!stageComplete) { - const elapsedMin = Math.round(pollCount * POLL_INTERVAL_SEC / 60) + // 2. Mark the task as in progress + TaskUpdate({ taskId: stageTask.id, status: 'in_progress' }) + mcp__ccw-tools__team_msg({ + operation: "log", team: teamName, from: "coordinator", + to: workerConfig.role, type: "task_unblocked", + summary: `[coordinator] 启动阶段: ${stageTask.subject} → ${workerConfig.role}` + }) + + // 3. Spawn the worker synchronously: blocks until the worker returns (the core of Stop-Wait) + const workerResult = Task({ + subagent_type: "general-purpose", + prompt: `你是 team "${teamName}" 的 ${workerConfig.role.toUpperCase()}。 + +## ⚠️ 首要指令(MUST) +Skill(skill="team-quality-assurance", args="${workerConfig.skillArgs}") + +## 当前任务 +- 任务 ID: ${stageTask.id} +- 任务: ${stageTask.subject} +- 描述: ${stageTask.description || taskDescription} +- Session: ${sessionFolder} + +## 角色准则(强制) +- 所有输出必须带 [${workerConfig.role}] 标识前缀 +- 仅与 coordinator 通信 + +## 工作流程 +1. Skill(skill="team-quality-assurance", args="${workerConfig.skillArgs}") 获取角色定义 +2. 执行任务 → 汇报结果 +3. TaskUpdate({ taskId: "${stageTask.id}", status: "completed" })`, + run_in_background: false + }) + + // 4. 
Worker has returned: handle the result directly + const taskState = TaskGet({ taskId: stageTask.id }) + + if (taskState.status !== 'completed') { + // Worker returned without marking the task completed → handle as an exception if (autoYes) { - // Auto mode: log and skip automatically mcp__ccw-tools__team_msg({ operation: "log", team: teamName, from: "coordinator", to: "user", type: "error", - summary: `[coordinator] [auto] 阶段 ${stageTask.subject} 超时 (${elapsedMin}min),自动跳过` + summary: `[coordinator] [auto] 阶段 ${stageTask.subject} 未完成,自动跳过` }) TaskUpdate({ taskId: stageTask.id, status: 'deleted' }) continue } - // Interactive mode: let the user decide const decision = AskUserQuestion({ questions: [{ - question: `阶段 "${stageTask.subject}" 已等待 ${elapsedMin} 分钟仍未完成。如何处理?`, - header: "Stage Wait", + question: `阶段 "${stageTask.subject}" worker 返回但未完成。如何处理?`, + header: "Stage Fail", multiSelect: false, options: [ - { label: "继续等待", description: `再等 ${MAX_POLLS_PER_STAGE} 轮(~${Math.round(MAX_POLLS_PER_STAGE * POLL_INTERVAL_SEC / 60)}min)` }, - { label: "跳过此阶段", description: "标记为跳过,继续后续流水线" }, - { label: "终止流水线", description: "停止整个 QA 流程,汇报当前结果" } + { label: "重试", description: "重新 spawn worker 执行此阶段" }, + { label: "跳过", description: "标记为跳过,继续后续流水线" }, + { label: "终止", description: "停止整个 QA 流程,汇报当前结果" } ] }] }) - const answer = decision["Stage Wait"] - - if (answer === "继续等待") { - // Reset the counter and keep waiting on the current stage - pollCount = 0 - // Re-enter the wait loop for the current stage (needs a wrapping while; goto semantics are used here) - continue // Note: in practice the for loop must be made re-entrant - } else if (answer === "跳过此阶段") { - mcp__ccw-tools__team_msg({ - operation: "log", team: teamName, from: "coordinator", - to: "user", type: "error", - summary: `[coordinator] 用户选择跳过阶段 ${stageTask.subject}` - }) + const answer = decision["Stage Fail"] + if (answer === "跳过") { TaskUpdate({ taskId: stageTask.id, status: 'deleted' }) continue - } else { - // Abort the pipeline + } else if (answer === "终止") { mcp__ccw-tools__team_msg({ operation: "log", team: teamName, from: "coordinator", to: "user", type: "shutdown", summary: `[coordinator] 用户终止流水线,当前阶段: ${stageTask.subject}` }) - break // Exit the for loop and go to the Step 3 report + 
break + } + // Retry (label "重试") is not re-spawned here: control falls through to the next stage unless this stage is wrapped in a retry loop + } else { + mcp__ccw-tools__team_msg({ + operation: "log", team: teamName, from: "coordinator", + to: "user", type: "quality_gate", + summary: `[coordinator] 阶段完成: ${stageTask.subject}` + }) + } + + // 5. Inter-stage check (after a QARUN stage, check coverage and decide whether to run a GC loop) + if (stagePrefix === 'QARUN') { + const latestMemory = JSON.parse(Read(`${sessionFolder}/shared-memory.json`)) + const coverage = latestMemory.execution_results?.coverage || 0 + const targetLayer = stageTask.metadata?.layer || 'L1' + const target = coverageTargets[targetLayer] || 80 + + if (coverage < target && gcIteration < MAX_GC_ITERATIONS) { + gcIteration++ + mcp__ccw-tools__team_msg({ + operation: "log", team: teamName, from: "coordinator", + to: "generator", type: "gc_loop_trigger", + summary: `[coordinator] GC循环 #${gcIteration}: 覆盖率 ${coverage}% < ${target}%,请修复` + }) + // Create a GC fix task and append it to the pipeline } } } @@ -284,10 +324,8 @@ const summary = { | Scenario | Resolution | |----------|------------| -| Message bus unavailable | Fall back to TaskList polling only | -| Stage timeout (interactive mode) | AskUserQuestion: keep waiting / skip / abort the pipeline | -| Stage timeout (auto mode `-y`/`--yes`, `autoYes`) | Skip automatically, log it, continue the pipeline | -| Teammate unresponsive (2x no response) | Respawn teammate with same task | -| Deadlock detected (tasks blocked indefinitely) | Identify cycle, manually unblock | +| Worker returned but not completed (interactive mode) | AskUserQuestion: retry / skip / abort | +| Worker returned but not completed (auto mode) | Skip automatically and log it | +| Worker spawn failed | Retry once; if it still fails, report to the user | | Quality gate FAIL | Report to user, suggest targeted re-run | | GC loop stuck >3 iterations | Accept current coverage, continue pipeline | diff --git a/.claude/skills/team-quality-assurance/roles/coordinator/role.md b/.claude/skills/team-quality-assurance/roles/coordinator/role.md index d288876b..9cb0b55c 100644 --- a/.claude/skills/team-quality-assurance/roles/coordinator/role.md +++ b/.claude/skills/team-quality-assurance/roles/coordinator/role.md @@ -99,7 
+99,7 @@ if (!autoYes && (!taskDescription || taskDescription.length < 10)) { } ``` -### Phase 2: Create Team + Spawn Teammates +### Phase 2: Create Team + Initialize Session ```javascript const teamName = "quality-assurance" @@ -121,8 +121,10 @@ Write(`${sessionFolder}/shared-memory.json`, JSON.stringify({ TeamCreate({ team_name: teamName }) -// Spawn teammates (see SKILL.md Coordinator Spawn Template) -// Scout, Strategist, Generator, Executor, Analyst +// ⚠️ Workers are NOT pre-spawned here. +// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false). +// See SKILL.md Coordinator Spawn Template for worker prompt templates. +// Worker roles: Scout, Strategist, Generator, Executor, Analyst ``` ### Phase 3: Create Task Chain @@ -151,6 +153,13 @@ SCOUT-001 → QASTRAT-001 → [QAGEN-001(L1) + QAGEN-002(L2)](parallel) → [QAR ### Phase 4: Coordination Loop +> **Design principle (Stop-Wait)**: Model execution has no notion of time, so polling waits of any kind are forbidden. +> - ❌ Forbidden: `while` loop + `sleep` + status check +> - ✅ Use: a synchronous `Task(run_in_background: false)` call; the worker returning is the stage-completion signal +> +> Spawn workers stage by stage, synchronously, in the order of the task chain created in Phase 3. +> Worker prompts use the SKILL.md Coordinator Spawn Template. + ```javascript // Read commands/monitor.md for full implementation Read("commands/monitor.md") diff --git a/.claude/skills/team-tech-debt/SKILL.md b/.claude/skills/team-tech-debt/SKILL.md index 0c60681e..a621ce61 100644 --- a/.claude/skills/team-tech-debt/SKILL.md +++ b/.claude/skills/team-tech-debt/SKILL.md @@ -243,14 +243,21 @@ TDFIX → TDVAL → (if regression or quality drop) → TDFIX-fix → TDVAL-2 ## Coordinator Spawn Template +> **Note**: The templates below serve as worker prompt references. Under the Stop-Wait strategy, the coordinator no longer pre-spawns all workers in Phase 2. +> Instead, in Phase 4 (monitor), it spawns workers one at a time per pipeline stage (synchronous, blocking `Task(run_in_background: false)`), +> and a worker returning means its stage is complete. See `roles/coordinator/commands/monitor.md` for details. + ```javascript TeamCreate({ team_name: teamName }) -// Scanner +// Workers are spawned on demand (called from monitor.md, Phase 4) +// The prompt templates for each role follow for reference: +``` + +### Scanner Prompt 
Template +```javascript Task({ subagent_type: "general-purpose", - team_name: teamName, - name: "scanner", prompt: `你是 team "${teamName}" 的 SCANNER。 ## ⚠️ 首要指令(MUST) @@ -274,14 +281,15 @@ Skill(skill="team-tech-debt", args="--role=scanner") 1. 调用 Skill(skill="team-tech-debt", args="--role=scanner") 获取角色定义和执行逻辑 2. 按 role.md 中的 5-Phase 流程执行(TaskList → 找到 TDSCAN-* 任务 → 执行 → 汇报) 3. team_msg log + SendMessage 结果给 coordinator(带 [scanner] 标识) -4. TaskUpdate completed → 检查下一个任务 → 回到步骤 1` +4. TaskUpdate completed`, + run_in_background: false // synchronous, blocking }) +``` -// Assessor +### Assessor Prompt Template +```javascript Task({ subagent_type: "general-purpose", - team_name: teamName, - name: "assessor", prompt: `你是 team "${teamName}" 的 ASSESSOR。 ## ⚠️ 首要指令(MUST) @@ -302,14 +310,15 @@ Skill(skill="team-tech-debt", args="--role=assessor") 1. Skill(skill="team-tech-debt", args="--role=assessor") 获取角色定义 2. TaskList → 找到 TDEVAL-* 任务 → 执行 → 汇报 3. team_msg log + SendMessage 结果给 coordinator -4. TaskUpdate completed → 检查下一个任务` +4. TaskUpdate completed`, + run_in_background: false }) +``` -// Planner +### Planner Prompt Template +```javascript Task({ subagent_type: "general-purpose", - team_name: teamName, - name: "planner", prompt: `你是 team "${teamName}" 的 PLANNER。 ## ⚠️ 首要指令(MUST) @@ -330,14 +339,15 @@ Skill(skill="team-tech-debt", args="--role=planner") 1. Skill(skill="team-tech-debt", args="--role=planner") 获取角色定义 2. TaskList → 找到 TDPLAN-* 任务 → 执行 → 汇报 3. team_msg log + SendMessage 结果给 coordinator -4. TaskUpdate completed → 检查下一个任务` +4. TaskUpdate completed`, + run_in_background: false }) +``` -// Executor +### Executor Prompt Template +```javascript Task({ subagent_type: "general-purpose", - team_name: teamName, - name: "executor", prompt: `你是 team "${teamName}" 的 EXECUTOR。 ## ⚠️ 首要指令(MUST) @@ -358,14 +368,15 @@ Skill(skill="team-tech-debt", args="--role=executor") 1. Skill(skill="team-tech-debt", args="--role=executor") 获取角色定义 2. 
TaskList → 找到 TDFIX-* 任务 → 执行 → 汇报 3. team_msg log + SendMessage 结果给 coordinator -4. TaskUpdate completed → 检查下一个任务` +4. TaskUpdate completed`, + run_in_background: false }) +``` -// Validator +### Validator Prompt Template +```javascript Task({ subagent_type: "general-purpose", - team_name: teamName, - name: "validator", prompt: `你是 team "${teamName}" 的 VALIDATOR。 ## ⚠️ 首要指令(MUST) @@ -386,7 +397,8 @@ Skill(skill="team-tech-debt", args="--role=validator") 1. Skill(skill="team-tech-debt", args="--role=validator") 获取角色定义 2. TaskList → 找到 TDVAL-* 任务 → 执行 → 汇报 3. team_msg log + SendMessage 结果给 coordinator -4. TaskUpdate completed → 检查下一个任务` +4. TaskUpdate completed`, + run_in_background: false }) ``` diff --git a/.claude/skills/team-tech-debt/roles/coordinator/commands/monitor.md b/.claude/skills/team-tech-debt/roles/coordinator/commands/monitor.md index 4edc8f9a..dfa9da40 100644 --- a/.claude/skills/team-tech-debt/roles/coordinator/commands/monitor.md +++ b/.claude/skills/team-tech-debt/roles/coordinator/commands/monitor.md @@ -1,12 +1,12 @@ # Command: monitor -> A stage-driven coordination loop. Waits for workers to finish in pipeline stage order, routes messages, handles the Fix-Verify loop, and detects completion. +> Stop-Wait coordination. Spawns a worker per stage, synchronously and in pipeline stage order; a worker returning means its stage is complete, so no polling is needed. ## When to Use - Phase 4 of Coordinator -- The task chain has been created and dispatched -- Continuous monitoring is needed until all tasks complete +- The task chain has been created (dispatch complete) +- Workers need to be driven stage by stage until all tasks complete **Trigger conditions**: - Start immediately after dispatch completes @@ -16,199 +16,225 @@ ### Delegation Mode -**Mode**: Stage-driven (wait in stage order, no polling) +**Mode**: Stop-Wait (synchronous, blocking Task call, no polling) ### Design Principles -> **Model execution has no notion of time.** Busy-waiting while loops that check status are forbidden. -> Use a fixed sleep interval plus a maximum poll count to avoid wasting pointless API calls. +> **Model execution has no notion of time, so polling waits of any kind are forbidden.** +> +> - ❌ Forbidden: `while` loop + `sleep` + status check (busy-waiting wastes API turns) +> - ❌ Forbidden: `Bash(sleep N)` / `Bash(timeout /t N)` as a way to wait +> - ✅ Use: a synchronous `Task()` call (`run_in_background: false`); the call itself is the wait +> - ✅ Use: the worker returning is the stage-completion signal (a natural callback) +> +> **Rationale**: `Task(run_in_background: false)` is a blocking call; the coordinator suspends automatically until the worker returns. +> No sleep, no polling, no message-bus monitoring. The worker's return is the callback. -### Decision Logic +### 
Stage-Worker Mapping ```javascript -// Message routing table -const routingTable = { - // Scanner done - 'scan_complete': { action: 'Mark TDSCAN complete, unblock TDEVAL' }, - 'debt_items_found': { action: 'Mark TDSCAN complete with items, unblock TDEVAL' }, - // Assessor done - 'assessment_complete': { action: 'Mark TDEVAL complete, unblock TDPLAN' }, - // Planner done - 'plan_ready': { action: 'Mark TDPLAN complete, unblock TDFIX' }, - 'plan_revision': { action: 'Plan revised, re-evaluate dependencies' }, - // Executor done - 'fix_complete': { action: 'Mark TDFIX complete, unblock TDVAL' }, - 'fix_progress': { action: 'Log progress, continue waiting' }, - // Validator done - 'validation_complete': { action: 'Mark TDVAL complete, evaluate quality gate', special: 'quality_gate' }, - 'regression_found': { action: 'Evaluate regression, decide Fix-Verify loop', special: 'fix_verify_decision' }, - // Errors - 'error': { action: 'Assess severity, retry or escalate', special: 'error_handler' } +const STAGE_WORKER_MAP = { + 'TDSCAN': { role: 'scanner', skillArgs: '--role=scanner' }, + 'TDEVAL': { role: 'assessor', skillArgs: '--role=assessor' }, + 'TDPLAN': { role: 'planner', skillArgs: '--role=planner' }, + 'TDFIX': { role: 'executor', skillArgs: '--role=executor' }, + 'TDVAL': { role: 'validator', skillArgs: '--role=validator' } } ``` -### Wait Strategy Constants - -```javascript -const POLL_INTERVAL_SEC = 300 // check every 5 minutes -const MAX_POLLS_PER_STAGE = 6 // wait at most 6 times per stage (~30 minutes) -const SLEEP_CMD = process.platform === 'win32' - ? 
`timeout /t ${POLL_INTERVAL_SEC} /nobreak >nul 2>&1` - : `sleep ${POLL_INTERVAL_SEC}` - -// Unified auto-mode detection -const autoYes = /\b(-y|--yes)\b/.test(args) -``` - ## Execution Steps ### Step 1: Context Preparation ```javascript -// Load context from shared memory const sharedMemory = JSON.parse(Read(`${sessionFolder}/shared-memory.json`)) let fixVerifyIteration = 0 const MAX_FIX_VERIFY_ITERATIONS = 3 -// Get the list of pipeline stages +// Get the list of pipeline stages (creation order = dependency order) const allTasks = TaskList() const pipelineTasks = allTasks .filter(t => t.owner && t.owner !== 'coordinator') .sort((a, b) => Number(a.id) - Number(b.id)) + +// Unified auto-mode detection +const autoYes = /(?:^|\s)(?:-y|--yes)(?=\s|$)/.test(args) ``` -### Step 2: Stage-Driven Execution +### Step 2: Sequential Stage Execution (Stop-Wait) -> **Core idea**: wait for each stage to finish, stage by stage, in pipeline order. -> Each stage: sleep → check messages → confirm task state → handle result → next stage. +> **Core idea**: spawn a worker per stage and block synchronously until it returns. +> Worker returns = stage complete. No sleep, no polling, no message-bus monitoring. ```javascript for (const stageTask of pipelineTasks) { - let stageComplete = false - let pollCount = 0 + // 1. Extract the stage prefix → determine the worker role + const stagePrefix = stageTask.subject.match(/^(TD\w+)-/)?.[1] + const workerConfig = STAGE_WORKER_MAP[stagePrefix] - while (!stageComplete && pollCount < MAX_POLLS_PER_STAGE) { - Bash(SLEEP_CMD) - pollCount++ - - // 1. Check the message bus - const messages = mcp__ccw-tools__team_msg({ - operation: "list", - team: teamName, - last: 5 + if (!workerConfig) { + mcp__ccw-tools__team_msg({ + operation: "log", team: teamName, from: "coordinator", + to: "user", type: "error", + summary: `[coordinator] 未知阶段前缀: ${stagePrefix},跳过` }) - - // 2. Route messages - for (const msg of messages) { - const handler = routingTable[msg.type] - if (!handler) continue - processMessage(msg, handler) - } - - // 3. 
Confirm task status
-    const currentTask = TaskGet({ taskId: stageTask.id })
-    stageComplete = currentTask.status === 'completed' || currentTask.status === 'deleted'
+    continue
   }
-  // Stage timeout handling
-  if (!stageComplete) {
-    const elapsedMin = Math.round(pollCount * POLL_INTERVAL_SEC / 60)
+  // 2. Mark the task as in progress
+  TaskUpdate({ taskId: stageTask.id, status: 'in_progress' })
-    if (autoYes) {
-      mcp__ccw-tools__team_msg({
-        operation: "log", team: teamName, from: "coordinator",
-        to: "user", type: "error",
-        summary: `[coordinator] [auto] Stage ${stageTask.subject} timed out (${elapsedMin}min), skipping automatically`
-      })
-      TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
-      continue
-    }
+  mcp__ccw-tools__team_msg({
+    operation: "log", team: teamName, from: "coordinator",
+    to: workerConfig.role, type: "task_unblocked",
+    summary: `[coordinator] Starting stage: ${stageTask.subject} → ${workerConfig.role}`
+  })
-    const decision = AskUserQuestion({
-      questions: [{
-        question: `Stage "${stageTask.subject}" has waited ${elapsedMin} minutes without completing. How should it be handled?`,
-        header: "Stage Wait",
-        multiSelect: false,
-        options: [
-          { label: "Keep waiting", description: `Wait another ${MAX_POLLS_PER_STAGE} rounds` },
-          { label: "Skip this stage", description: "Mark as skipped and continue the rest of the pipeline" },
-          { label: "Abort pipeline", description: "Stop the whole flow and report current results" }
-        ]
-      }]
+  // 3. Spawn the worker synchronously; blocks until the worker returns (the core of Stop-Wait)
+  // Task() is itself the wait mechanism, so no sleep/poll is needed
+  const workerResult = Task({
+    subagent_type: "general-purpose",
+    prompt: buildWorkerPrompt(stageTask, workerConfig, sessionFolder, taskDescription),
+    run_in_background: false // ← synchronous blocking = natural callback
+  })
+
+  // 4. The worker has returned; process the result directly (no polling needed)
+  const taskState = TaskGet({ taskId: stageTask.id })
+
+  if (taskState.status !== 'completed') {
+    // Worker returned without marking the task completed → failure handling
+    const outcome = handleStageFailure(stageTask, taskState, workerConfig, autoYes)
+    // Honor an 'abort' decision; otherwise the loop would move on to the next stage
+    if (outcome === 'abort') break
+  } else {
+    mcp__ccw-tools__team_msg({
+      operation: "log", team: teamName, from: "coordinator",
+      to: "user", type: "quality_gate",
+      summary: `[coordinator] Stage complete: ${stageTask.subject}`
     })
+  }
-    const answer = decision["Stage Wait"]
-    if (answer === "Skip this stage") {
-      TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
-      continue
-    } else if (answer === "Abort pipeline") {
-      mcp__ccw-tools__team_msg({
-        operation: "log", team: teamName, from: "coordinator",
-        to: "user", type: "shutdown",
-        summary: `[coordinator] User aborted the pipeline, current stage: ${stageTask.subject}`
-      })
-      break
+  // 5. Inter-stage quality check (TDVAL stage only)
+  if (stagePrefix === 'TDVAL') {
+    const needsFixVerify = evaluateValidationResult(sessionFolder)
+    if (needsFixVerify && fixVerifyIteration < MAX_FIX_VERIFY_ITERATIONS) {
+      fixVerifyIteration++
+      const fixVerifyTasks = createFixVerifyTasks(fixVerifyIteration, sessionFolder)
+      // Append the Fix-Verify tasks to the end of the pipeline so the loop keeps executing them
+      pipelineTasks.push(...fixVerifyTasks)
     }
   }
 }
 ```
-### Step 2.1: Message Processing (processMessage)
+### Step 2.1: Worker Prompt Builder
 ```javascript
-function processMessage(msg, handler) {
-  switch (handler.special) {
-    case 'quality_gate': {
-      const latestMemory = JSON.parse(Read(`${sessionFolder}/shared-memory.json`))
-      const debtBefore = latestMemory.debt_score_before || 0
-      const debtAfter = latestMemory.debt_score_after || 0
-      const improved = debtAfter < debtBefore
+function buildWorkerPrompt(stageTask, workerConfig, sessionFolder, taskDescription) {
+  return `You are the ${workerConfig.role.toUpperCase()} of team "${teamName}".
-      let status = 'PASS'
-      if (!improved && latestMemory.validation_results?.regressions > 0) status = 'FAIL'
-      else if (!improved) status = 'CONDITIONAL'
+## ⚠️ Primary directive (MUST)
+All of your work must be done after loading your role definition by calling Skill; do not improvise:
+Skill(skill="team-tech-debt", args="${workerConfig.skillArgs}")
+This call loads your role definition (role.md), the available commands (commands/*.md), and the full execution logic.
-      mcp__ccw-tools__team_msg({
-        operation: "log", team: teamName, from: "coordinator",
-        to: "user", type: "quality_gate",
-        summary: `[coordinator] Quality gate: ${status} (debt score ${debtBefore} → ${debtAfter})`
-      })
-      break
-    }
+## Current task
+- Task ID: ${stageTask.id}
+- Task: ${stageTask.subject}
+- Description: ${stageTask.description || taskDescription}
+- Session: ${sessionFolder}
-    case 'fix_verify_decision': {
-      const regressions = msg.data?.regressions || 0
-      if (regressions > 0 && fixVerifyIteration < MAX_FIX_VERIFY_ITERATIONS) {
-        fixVerifyIteration++
-        mcp__ccw-tools__team_msg({
-          operation: "log", team: teamName, from: "coordinator",
-          to: "executor", type: "task_unblocked",
-          summary: `[coordinator] Fix-Verify #${fixVerifyIteration}: found ${regressions} regressions, please fix`,
-          data: { iteration: fixVerifyIteration, regressions }
-        })
-        // Create Fix-Verify repair tasks (see createFixVerifyTasks in dispatch.md)
-      } else {
-        mcp__ccw-tools__team_msg({
-          operation: "log", team: teamName, from: "coordinator",
-          to: "user", type: "quality_gate",
-          summary: `[coordinator] Fix-Verify loop reached its limit (${MAX_FIX_VERIFY_ITERATIONS}), accepting current result`
-        })
-      }
-      break
-    }
+## Role rules (mandatory)
+- You may only handle tasks with the ${stageTask.subject.match(/^(TD\w+)-/)?.[1] || 'TD'}-* prefix
+- All output must carry the [${workerConfig.role}] prefix tag
+- Communicate only with the coordinator; do not contact other workers directly
-    case 'error_handler': {
-      const severity = msg.data?.severity || 'medium'
-      if (severity === 'critical') {
-        SendMessage({
-          content: `## [coordinator] Critical Error from ${msg.from}\n\n${msg.summary}`,
-          summary: `[coordinator] Critical error: ${msg.summary}`
-        })
-      }
-      break
-    }
+## Message bus (required)
+Before every SendMessage, first call mcp__ccw-tools__team_msg to log it.
+
+## Workflow (strictly in order)
+1. Call Skill(skill="team-tech-debt", args="${workerConfig.skillArgs}") to load the role definition and execution logic
+2. Follow the 5-Phase flow in role.md
+3. team_msg log + SendMessage the result to the coordinator
+4.
TaskUpdate({ taskId: "${stageTask.id}", status: "completed" })`
+}
+```
+
+### Step 2.2: Stage Failure Handler
+
+```javascript
+function handleStageFailure(stageTask, taskState, workerConfig, autoYes) {
+  if (autoYes) {
+    mcp__ccw-tools__team_msg({
+      operation: "log", team: teamName, from: "coordinator",
+      to: "user", type: "error",
+      summary: `[coordinator] [auto] Stage ${stageTask.subject} did not complete (status=${taskState.status}), skipping automatically`
+    })
+    TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
+    return 'skip'
   }
+
+  const decision = AskUserQuestion({
+    questions: [{
+      question: `The worker for stage "${stageTask.subject}" returned without completing (status=${taskState.status}). How should it be handled?`,
+      header: "Stage Fail",
+      multiSelect: false,
+      options: [
+        { label: "Retry", description: "Spawn a new worker to run this stage again" },
+        { label: "Skip", description: "Mark as skipped and continue the rest of the pipeline" },
+        { label: "Abort", description: "Stop the whole flow and report current results" }
+      ]
+    }]
+  })
+
+  const answer = decision["Stage Fail"]
+  if (answer === "Retry") {
+    // Re-spawn the worker (one retry only)
+    TaskUpdate({ taskId: stageTask.id, status: 'in_progress' })
+    const retryResult = Task({
+      subagent_type: "general-purpose",
+      prompt: buildWorkerPrompt(stageTask, workerConfig, sessionFolder, taskDescription),
+      run_in_background: false
+    })
+    const retryState = TaskGet({ taskId: stageTask.id })
+    if (retryState.status !== 'completed') {
+      TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
+    }
+    return 'retried'
+  } else if (answer === "Skip") {
+    TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
+    return 'skip'
+  } else {
+    mcp__ccw-tools__team_msg({
+      operation: "log", team: teamName, from: "coordinator",
+      to: "user", type: "shutdown",
+      summary: `[coordinator] User aborted the pipeline, current stage: ${stageTask.subject}`
+    })
+    return 'abort'
+  }
+}
+```
+
+### Step 2.3: Validation Evaluation
+
+```javascript
+function evaluateValidationResult(sessionFolder) {
+  const latestMemory = JSON.parse(Read(`${sessionFolder}/shared-memory.json`))
+  const debtBefore = latestMemory.debt_score_before || 0
+  const debtAfter =
latestMemory.debt_score_after || 0
+  const regressions = latestMemory.validation_results?.regressions || 0
+  const improved = debtAfter < debtBefore
+
+  let status = 'PASS'
+  if (!improved && regressions > 0) status = 'FAIL'
+  else if (!improved) status = 'CONDITIONAL'
+
+  mcp__ccw-tools__team_msg({
+    operation: "log", team: teamName, from: "coordinator",
+    to: "user", type: "quality_gate",
+    summary: `[coordinator] Quality gate: ${status} (debt score ${debtBefore} → ${debtAfter}, regressions ${regressions})`
+  })
+
+  return regressions > 0
 }
 ```
@@ -237,19 +263,15 @@ const summary = {
 ### Tasks: [completed]/[total]
 ### Fix-Verify Iterations: [count]
 ### Debt Score: [before] → [after]
-
-### Message Log (last 10)
-- [timestamp] [from] → [to]: [type] - [summary]
 ```
 ## Error Handling
 | Scenario | Resolution |
 |----------|------------|
-| Message bus unavailable | Fall back to TaskList polling only |
-| Stage timeout (interactive mode) | AskUserQuestion: Keep waiting / Skip / Abort |
-| Stage timeout (auto mode `-y`/`--yes`) | Skip automatically and log it |
-| Teammate unresponsive (2x no response) | Respawn teammate with same task |
-| Deadlock detected | Identify cycle, manually unblock |
+| Worker returned but task not completed (interactive mode) | AskUserQuestion: Retry / Skip / Abort |
+| Worker returned but task not completed (auto mode) | Skip automatically and log it |
+| Worker spawn failed | Retry once; if it still fails, report to the user |
 | Quality gate FAIL | Report to user, suggest targeted re-run |
 | Fix-Verify loop stuck >3 iterations | Accept current state, continue pipeline |
+| Shared memory read failed | Fall back to judging by TaskList status |
diff --git a/.claude/skills/team-tech-debt/roles/coordinator/role.md b/.claude/skills/team-tech-debt/roles/coordinator/role.md
index 3e9fe499..a3d0b23d 100644
--- a/.claude/skills/team-tech-debt/roles/coordinator/role.md
+++ b/.claude/skills/team-tech-debt/roles/coordinator/role.md
@@ -121,7 +121,7 @@ if (!autoYes && (!taskDescription || taskDescription.length < 10)) {
 }
 ```
-### Phase 2: Create Team + Spawn Teammates
+### Phase 2: Create Team + Initialize Session
 ```javascript
 const teamName = "tech-debt"
@@ -143,8 +143,9 @@ Write(`${sessionFolder}/shared-memory.json`, JSON.stringify({
 TeamCreate({ team_name: teamName })
-// Spawn teammates (see SKILL.md Coordinator Spawn Template)
-// Scanner, Assessor, Planner, Executor, Validator
+// ⚠️ Do not spawn workers in this phase
+// Workers are spawned on demand, stage by stage, in Phase 4 (monitor) (Stop-Wait strategy)
+// This avoids the chicken-and-egg problem of workers starting before any task exists for them
 ```
 ### Phase 3: Create Task Chain
@@ -171,28 +172,33 @@ TDSCAN-001(scan) → TDEVAL-001(assess) → TDPLAN-001(plan) → TDFIX-001(
 TDPLAN-001(plan) → TDFIX-001(fix) → TDVAL-001(validate)
 ```
-### Phase 4: Coordination Loop
+### Phase 4: Sequential Stage Execution (Stop-Wait)
 ```javascript
 // Read commands/monitor.md for full implementation
 Read("commands/monitor.md")
 ```
-| Received Message | Action |
-|-----------------|--------|
-| `scan_complete` | Mark TDSCAN complete → unblock TDEVAL |
-| `assessment_complete` | Mark TDEVAL complete → unblock TDPLAN |
-| `plan_ready` | Mark TDPLAN complete → unblock TDFIX |
-| `fix_complete` | Mark TDFIX complete → unblock TDVAL |
-| `validation_complete` | Mark TDVAL complete → evaluate quality gate |
-| `regression_found` | Evaluate regression → trigger Fix-Verify loop (max 3) |
-| Worker: `error` | Assess severity → retry or report to user |
+> **Strategy**: Spawn a worker for each stage and block synchronously until it returns. A returning worker means the stage is complete; no polling is needed.
+>
+> - ❌ Forbidden: while loop + sleep + status check
+> - ✅ Use: a synchronous `Task(run_in_background: false)` call = natural callback
-**Fix-Verify loop logic**:
+**Stage transitions**:
+
+| Current stage | Worker | On completion |
+|----------|--------|--------|
+| TDSCAN-001 | scanner | → start TDEVAL |
+| TDEVAL-001 | assessor | → start TDPLAN |
+| TDPLAN-001 | planner | → start TDFIX |
+| TDFIX-001 | executor | → start TDVAL |
+| TDVAL-001 | validator | → evaluate quality gate |
+
+**Fix-Verify loop** (when the TDVAL stage finds regressions):
 ```javascript
 if (regressionFound && fixVerifyIteration < 3) {
   fixVerifyIteration++
-  // Create TDFIX-fix task → TDVAL re-validates
+  // Create TDFIX-fix + TDVAL-verify tasks and append them to the pipeline to keep executing
 } else if (fixVerifyIteration >= 3) {
   // Accept the current state and proceed to reporting
   mcp__ccw-tools__team_msg({
diff --git a/.claude/skills/team-testing/SKILL.md b/.claude/skills/team-testing/SKILL.md
index f0f827ad..b7c8ddbe 100644
--- a/.claude/skills/team-testing/SKILL.md
+++ b/.claude/skills/team-testing/SKILL.md
@@ -103,7 +103,6 @@ mcp__ccw-tools__team_msg({ summary: `[${role}] ...` })
 const TEAM_CONFIG = {
   name: "testing",
   sessionDir: ".workflow/.team/TST-{slug}-{date}/",
-  msgDir: ".workflow/.team-msg/testing/",
   sharedMemory: "shared-memory.json",
   testLayers: {
     L1: { name: "Unit Tests", coverage_target: 80 },
diff --git a/.claude/skills/team-testing/roles/coordinator.md b/.claude/skills/team-testing/roles/coordinator.md
index d721c50c..8b8da638 100644
--- a/.claude/skills/team-testing/roles/coordinator.md
+++ b/.claude/skills/team-testing/roles/coordinator.md
@@ -86,7 +86,7 @@ AskUserQuestion({
 })
 ```
-### Phase 2: Create Team + Spawn Workers
+### Phase 2: Create Team + Initialize Session
 ```javascript
 TeamCreate({ team_name: teamName })
@@ -130,7 +130,9 @@ const teamSession = {
 Write(`${sessionFolder}/team-session.json`, JSON.stringify(teamSession, null, 2))
 ```
-Spawn workers (see SKILL.md Coordinator Spawn Template).
+// ⚠️ Workers are NOT pre-spawned here.
+// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false).
+// See SKILL.md Coordinator Spawn Template for worker prompt templates.
 ### Phase 3: Create Task Chain
@@ -178,6 +180,13 @@ TaskUpdate({ taskId: anaId, owner: "analyst", addBlockedBy: [run2Id] })
 ### Phase 4: Coordination Loop + Generator-Critic Control
+> **Design principle (Stop-Wait)**: Model execution has no sense of time, so any form of polling wait is forbidden.
+> - ❌ Forbidden: `while` loop + `sleep` + status check
+> - ✅ Use: a synchronous `Task(run_in_background: false)` call; worker return = stage-complete signal
+>
+> Following the order of the task chain created in Phase 3, spawn one worker per stage and run it synchronously.
+> Worker prompts use the Coordinator Spawn Template in SKILL.md.
+
 | Received Message | Action |
 |-----------------|--------|
 | strategist: strategy_ready | Read strategy → team_msg log → TaskUpdate completed |
diff --git a/.claude/skills/team-uidesign/roles/coordinator.md b/.claude/skills/team-uidesign/roles/coordinator.md
index f9f06930..fcca5b0d 100644
--- a/.claude/skills/team-uidesign/roles/coordinator.md
+++ b/.claude/skills/team-uidesign/roles/coordinator.md
@@ -125,7 +125,7 @@ const industryConfig = {
 }[industryChoice] || { strictness: 'standard', mustHave: [] }
 ```
-### Phase 2: Create Team + Spawn Workers
+### Phase 2: Create Team + Initialize Session
 ```javascript
 TeamCreate({ team_name: teamName })
@@ -174,7 +174,9 @@ const teamSession = {
 }
 Write(`${sessionFolder}/team-session.json`, JSON.stringify(teamSession, null, 2))
-// Spawn 4 workers (see SKILL.md Coordinator Spawn Template)
+// ⚠️ Workers are NOT pre-spawned here.
+// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false).
+// See SKILL.md Coordinator Spawn Template for worker prompt templates.
 ```
 ### Phase 3: Create Task Chain
@@ -246,6 +248,13 @@ TaskUpdate({ taskId: audit3Id, owner: "reviewer", addBlockedBy: [build2Id] })
 ### Phase 4: Coordination Loop
+> **Design principle (Stop-Wait)**: Model execution has no sense of time, so any form of polling wait is forbidden.
+> - ❌ Forbidden: `while` loop + `sleep` + status check
+> - ✅ Use: a synchronous `Task(run_in_background: false)` call; worker return = stage-complete signal
+>
+> Following the order of the task chain created in Phase 3, spawn one worker per stage and run it synchronously.
+> Worker prompts use the Coordinator Spawn Template in SKILL.md.
+
 #### Message Handling
 | Received Message | Action |
diff --git a/.claude/skills/team-ultra-analyze/roles/coordinator/commands/monitor.md b/.claude/skills/team-ultra-analyze/roles/coordinator/commands/monitor.md
index 399b9d8a..6b3f445e 100644
--- a/.claude/skills/team-ultra-analyze/roles/coordinator/commands/monitor.md
+++ b/.claude/skills/team-ultra-analyze/roles/coordinator/commands/monitor.md
@@ -20,8 +20,15 @@
 ### Design principles
-> **Model execution has no sense of time**. Busy-wait while loops that check status are forbidden.
-> Use a fixed sleep interval + a maximum poll count to avoid wasting API calls.
+> **Model execution has no sense of time, so any form of polling wait is forbidden.**
+>
+> - ❌ Forbidden: `while` loop + `sleep` + status check (busy-waiting wastes API turns)
+> - ❌ Forbidden: `Bash(sleep N)` / `Bash(timeout /t N)` as a way of waiting
+> - ✅ Use: a synchronous `Task()` call (`run_in_background: false`); the call itself is the wait
+> - ✅ Use: worker return = stage-complete signal (natural callback)
+>
+> **How it works**: `Task(run_in_background: false)` is a blocking call, so the coordinator is suspended automatically until the worker returns.
+> No sleep, no polling, and no message-bus monitoring are needed. The worker's return is the callback.
 ### Decision Logic
@@ -41,14 +48,15 @@ const routingTable = {
 }
 ```
-### Wait strategy constants
+### Stage-Worker mapping table
 ```javascript
-const POLL_INTERVAL_SEC = 300 // 5 minutes between checks
-const MAX_POLLS_PER_STAGE = 6 // at most 6 polls per stage (~30 minutes)
-const SLEEP_CMD = process.platform === 'win32'
-  ? `timeout /t ${POLL_INTERVAL_SEC} /nobreak >nul 2>&1`
-  : `sleep ${POLL_INTERVAL_SEC}`
+const STAGE_WORKER_MAP = {
+  'EXPLORE': { role: 'explorer', skillArgs: '--role=explorer' },
+  'ANALYZE': { role: 'analyst', skillArgs: '--role=analyst' },
+  'DISCUSS': { role: 'discussant', skillArgs: '--role=discussant' },
+  'SYNTH': { role: 'synthesizer', skillArgs: '--role=synthesizer' }
+}
 // ★ Unified auto-mode detection
 const autoYes = /\b(-y|--yes)\b/.test(args)
@@ -72,9 +80,10 @@ const pipelineTasks = allTasks
   .sort((a, b) => Number(a.id) - Number(b.id))
 ```
-### Step 2: Stage-Driven Execution (Exploration + Analysis)
+### Step 2: Sequential Stage Execution (Stop-Wait) for Exploration + Analysis
-> Wait for each stage to complete, in pipeline order.
+> **Core idea**: Spawn a worker for each stage and block synchronously until it returns.
+> Worker return = stage complete. No sleep, no polling, no message-bus monitoring.
 ```javascript
 // Handle the EXPLORE and ANALYZE stages
@@ -83,33 +92,57 @@ const preDiscussionTasks = pipelineTasks.filter(t =>
 )
 for (const stageTask of preDiscussionTasks) {
-  let stageComplete = false
-  let pollCount = 0
+  // 1. Extract the stage prefix → determine the worker role
+  const stagePrefix = stageTask.subject.match(/^(\w+)-/)?.[1]
+  const workerConfig = STAGE_WORKER_MAP[stagePrefix]
-  while (!stageComplete && pollCount < MAX_POLLS_PER_STAGE) {
-    Bash(SLEEP_CMD)
-    pollCount++
+  if (!workerConfig) continue
-    // 1. Check the message bus
-    const messages = mcp__ccw-tools__team_msg({
-      operation: "list", team: teamName, last: 5
+  // 2. Mark the task as in progress
+  TaskUpdate({ taskId: stageTask.id, status: 'in_progress' })
+
+  mcp__ccw-tools__team_msg({
+    operation: "log", team: teamName, from: "coordinator",
+    to: workerConfig.role, type: "task_unblocked",
+    summary: `[coordinator] Starting stage: ${stageTask.subject} → ${workerConfig.role}`
+  })
+
+  // 3.
Spawn the worker synchronously; blocks until the worker returns (the core of Stop-Wait)
+  const workerResult = Task({
+    subagent_type: "general-purpose",
+    prompt: `You are the ${workerConfig.role.toUpperCase()} of team "${teamName}".
+
+## ⚠️ Primary directive (MUST)
+Skill(skill="team-ultra-analyze", args="${workerConfig.skillArgs}")
+
+## Current task
+- Task ID: ${stageTask.id}
+- Task: ${stageTask.subject}
+- Description: ${stageTask.description || taskDescription}
+- Session: ${sessionFolder}
+
+## Role rules (mandatory)
+- All output must carry the [${workerConfig.role}] prefix tag
+- Communicate only with the coordinator
+
+## Workflow
+1. Skill(skill="team-ultra-analyze", args="${workerConfig.skillArgs}") to load the role definition
+2. Execute the task → report the result
+3. TaskUpdate({ taskId: "${stageTask.id}", status: "completed" })`,
+    run_in_background: false
+  })
+
+  // 4. The worker has returned; check the result
+  const taskState = TaskGet({ taskId: stageTask.id })
+
+  if (taskState.status !== 'completed') {
+    handleStageTimeout(stageTask, 0, autoYes)
+  } else {
+    mcp__ccw-tools__team_msg({
+      operation: "log", team: teamName, from: "coordinator",
+      to: "user", type: "quality_gate",
+      summary: `[coordinator] Stage complete: ${stageTask.subject}`
     })
-
-    // 2. Route messages
-    for (const msg of messages) {
-      const handler = routingTable[msg.type]
-      if (!handler) continue
-      processMessage(msg, handler)
-    }
-
-    // 3. Confirm task status (fallback)
-    const currentTask = TaskGet({ taskId: stageTask.id })
-    stageComplete = currentTask.status === 'completed' || currentTask.status === 'deleted'
-  }
-
-  // Stage timeout handling
-  if (!stageComplete) {
-    handleStageTimeout(stageTask, pollCount, autoYes)
   }
 }
 ```
@@ -164,9 +197,21 @@ if (MAX_DISCUSSION_ROUNDS === 0) {
 // Then enter discussion loop
 while (discussionRound < MAX_DISCUSSION_ROUNDS) {
-  // Wait for the current DISCUSS task to complete
+  // Wait for the current DISCUSS task to complete (Stop-Wait: spawn discussant worker)
   const currentDiscussId = `DISCUSS-${String(discussionRound + 1).padStart(3, '0')}`
-  // ...
wait for completion (same pattern as Step 2)
+  const discussTask = pipelineTasks.find(t => t.subject.startsWith(currentDiscussId))
+  if (discussTask) {
+    TaskUpdate({ taskId: discussTask.id, status: 'in_progress' })
+    const discussResult = Task({
+      subagent_type: "general-purpose",
+      prompt: `You are the DISCUSSANT of team "${teamName}".
+Skill(skill="team-ultra-analyze", args="--role=discussant")
+Current task: ${discussTask.subject}
+Session: ${sessionFolder}
+TaskUpdate({ taskId: "${discussTask.id}", status: "completed" })`,
+      run_in_background: false
+    })
+  }
   // Collect user feedback
   const feedbackResult = AskUserQuestion({
@@ -291,14 +336,12 @@ ${data.updated_understanding || '(Updated by discussant)'}
   Write(`${sessionFolder}/discussion.md`, currentContent + roundContent)
 }
-function handleStageTimeout(stageTask, pollCount, autoYes) {
-  const elapsedMin = Math.round(pollCount * POLL_INTERVAL_SEC / 60)
-
+function handleStageTimeout(stageTask, _unused, autoYes) {
   if (autoYes) {
     mcp__ccw-tools__team_msg({
       operation: "log", team: teamName, from: "coordinator",
       to: "user", type: "error",
-      summary: `[coordinator] [auto] Stage ${stageTask.subject} timed out (${elapsedMin}min), skipping automatically`
+      summary: `[coordinator] [auto] Worker for stage ${stageTask.subject} returned without completing, skipping automatically`
     })
     TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
     return
@@ -306,18 +349,18 @@ function handleStageTimeout(stageTask, pollCount, autoYes) {
   const decision = AskUserQuestion({
     questions: [{
-      question: `Stage "${stageTask.subject}" has waited ${elapsedMin} minutes without completing. How should it be handled?`,
-      header: "Stage Wait",
+      question: `The worker for stage "${stageTask.subject}" returned without completing. How should it be handled?`,
+      header: "Stage Fail",
       multiSelect: false,
       options: [
-        { label: "Keep waiting", description: `Wait another ${MAX_POLLS_PER_STAGE} rounds` },
+        { label: "Retry", description: "Spawn a new worker to run this stage again" },
         { label: "Skip this stage", description: "Mark as skipped and continue the rest of the pipeline" },
         { label: "Abort pipeline", description: "Stop the whole analysis flow" }
       ]
     }]
   })
-  const answer = decision["Stage Wait"]
+  const answer = decision["Stage Fail"]
   if (answer === "Skip this stage") {
     TaskUpdate({ taskId: stageTask.id, status: 'deleted' })
   } else if (answer === "Abort pipeline") {
@@ -333,8 +376,20 @@ function handleStageTimeout(stageTask, pollCount, autoYes) {
 ### Step 4: Wait for Synthesis + Result Processing
 ```javascript
-// Wait for SYNTH-001 to complete
-// ... same wait pattern
+// Wait for SYNTH-001 to complete (Stop-Wait: spawn synthesizer worker)
+const synthTask = pipelineTasks.find(t => t.subject.startsWith('SYNTH-'))
+if (synthTask) {
+  TaskUpdate({ taskId: synthTask.id, status: 'in_progress' })
+  const synthResult = Task({
+    subagent_type: "general-purpose",
+    prompt: `You are the SYNTHESIZER of team "${teamName}".
+Skill(skill="team-ultra-analyze", args="--role=synthesizer")
+Current task: ${synthTask.subject}
+Session: ${sessionFolder}
+TaskUpdate({ taskId: "${synthTask.id}", status: "completed" })`,
+    run_in_background: false
+  })
+}
 // Aggregate all results
 const finalMemory = JSON.parse(Read(`${sessionFolder}/shared-memory.json`))
@@ -368,9 +423,8 @@ const summary = {
 | Scenario | Resolution |
 |----------|------------|
-| Message bus unavailable | Fall back to TaskList polling only |
-| Stage timeout (interactive mode) | AskUserQuestion: Keep waiting / Skip / Abort |
-| Stage timeout (auto mode) | Skip automatically and log it |
-| Teammate unresponsive (2x) | Respawn teammate with same task |
+| Worker returned but task not completed (interactive mode) | AskUserQuestion: Retry / Skip / Abort |
+| Worker returned but task not completed (auto mode) | Skip automatically and log it |
+| Worker spawn failed | Retry once; if it still fails, report to the user |
 | Discussion loop stuck >5 rounds | Force synthesis, offer continuation |
 | Synthesis fails | Report partial results from analyses |
diff --git a/.claude/skills/team-ultra-analyze/roles/coordinator/role.md b/.claude/skills/team-ultra-analyze/roles/coordinator/role.md
index 7f42a88b..8b566c38 100644
--- a/.claude/skills/team-ultra-analyze/roles/coordinator/role.md
+++ b/.claude/skills/team-ultra-analyze/roles/coordinator/role.md
@@ -163,7 +163,7 @@ if (!autoYes) {
 }
 ```
-### Phase 2: Create Team + Spawn Teammates
+### Phase 2: Create Team + Initialize Session
 ```javascript
 const teamName = "ultra-analyze"
@@ -210,7 +210,9 @@ Write(`${sessionFolder}/discussion.md`, `# Analysis Discussion
 TeamCreate({ team_name: teamName })
-// Spawn teammates (see SKILL.md Coordinator Spawn Template)
+// ⚠️ Workers are NOT pre-spawned here.
+// Workers are spawned per-stage in Phase 4 via Stop-Wait Task(run_in_background: false).
+// See SKILL.md Coordinator Spawn Template for worker prompt templates.
 // Quick mode: 1 explorer + 1 analyst (single agents)
 // Standard/Deep mode: N explorers + N analysts (parallel agents with distinct names)
 // explorer-1, explorer-2... / analyst-1, analyst-2... for true parallel execution
@@ -243,6 +245,13 @@ EXPLORE-001 → ANALYZE-001 → SYNTH-001
 ### Phase 4: Discussion Loop + Coordination
+> **Design principle (Stop-Wait)**: Model execution has no sense of time, so any form of polling wait is forbidden.
+> - ❌ Forbidden: `while` loop + `sleep` + status check
+> - ✅ Use: a synchronous `Task(run_in_background: false)` call; worker return = stage-complete signal
+>
+> Following the order of the task chain created in Phase 3, spawn one worker per stage and run it synchronously.
+> Worker prompts use the Coordinator Spawn Template in SKILL.md.
+
 ```javascript
 // Read commands/monitor.md for full implementation
 Read("commands/monitor.md")