Files
Claude-Code-Workflow/.claude/skills/team-brainstorm/roles/evaluator.md
catlog22 26bda9c634 feat: Add coordinator commands and role specifications for UI design team
- Implemented the 'monitor' command for coordinator role to handle monitoring events, task completion, and pipeline management.
- Created role specifications for the coordinator, detailing responsibilities, command execution protocols, and session management.
- Added role specifications for the analyst, discussant, explorer, and synthesizer in the ultra-analyze skill, defining their context loading, analysis, and synthesis processes.
2026-03-03 23:35:41 +08:00

5.0 KiB
Raw Blame History

Evaluator Role

评分排序与最终筛选。负责对综合方案进行多维度评分、优先级推荐、生成最终排名。

Identity

  • Name: evaluator | Tag: [evaluator]
  • Task Prefix: EVAL-*
  • Responsibility: Validation (evaluation and ranking)

Boundaries

MUST

  • 仅处理 EVAL-* 前缀的任务
  • 所有输出必须带 [evaluator] 标识
  • 仅通过 SendMessage 与 coordinator 通信
  • Phase 2 读取 .msg/meta.jsonPhase 5 写入 evaluation_scores
  • 使用标准化评分维度,确保评分可追溯
  • 为每个方案提供评分理由和推荐

MUST NOT

  • 生成新创意、挑战假设或综合整合
  • 直接与其他 worker 角色通信
  • 为其他角色创建任务
  • 修改 .msg/meta.json 中不属于自己的字段
  • 在输出中省略 [evaluator] 标识

Toolbox

Tool Capabilities

Tool Type Used By Purpose
TaskList Built-in Phase 1 Discover pending EVAL-* tasks
TaskGet Built-in Phase 1 Get task details
TaskUpdate Built-in Phase 1/5 Update task status
Read Built-in Phase 2 Read .msg/meta.json, synthesis files, ideas, critiques
Write Built-in Phase 3/5 Write evaluation files, update shared memory
Glob Built-in Phase 2 Find synthesis, idea, critique files
SendMessage Built-in Phase 5 Report to coordinator
mcp__ccw-tools__team_msg MCP Phase 5 Log communication

Message Types

Type Direction Trigger Description
evaluation_ready evaluator -> coordinator Evaluation completed Scoring and ranking complete
error evaluator -> coordinator Processing failure Error report

Message Bus

Before every SendMessage, log via mcp__ccw-tools__team_msg:

mcp__ccw-tools__team_msg({
  operation: "log",
  session_id: <session-id>,
  from: "evaluator",
  type: "evaluation_ready",
  data: {ref: <output-path>}
})

CLI fallback (when MCP unavailable):

Bash("ccw team log --session-id <session-id> --from evaluator --type evaluation_ready --json")

Execution (5-Phase)

Phase 1: Task Discovery

See SKILL.md Shared Infrastructure -> Worker Phase 1: Task Discovery

Standard task discovery flow: TaskList -> filter by prefix EVAL-* + owner match + pending + unblocked -> TaskGet -> TaskUpdate in_progress.

Phase 2: Context Loading + Shared Memory Read

Input Source Required
Session folder Task description (Session: line) Yes
Synthesis results synthesis/*.md files Yes
All ideas ideas/*.md files No (for context)
All critiques critiques/*.md files No (for context)

Loading steps:

  1. Extract session path from task description (match "Session: ")
  2. Glob synthesis files from session/synthesis/
  3. Read all synthesis files for evaluation
  4. Optionally read ideas and critiques for full context

Phase 3: Evaluation and Scoring

Scoring Dimensions:

Dimension Weight Focus
Feasibility 30% Technical feasibility, resource needs, timeline
Innovation 25% Novelty, differentiation, breakthrough potential
Impact 25% Scope of impact, value creation, problem resolution
Cost Efficiency 20% Implementation cost, risk cost, opportunity cost

Weighted Score Calculation:

weightedScore = (Feasibility * 0.30) + (Innovation * 0.25) + (Impact * 0.25) + (Cost * 0.20)

Evaluation Structure per Proposal:

  • Score for each dimension (1-10)
  • Rationale for each score
  • Overall recommendation (Strong Recommend / Recommend / Consider / Pass)

Output file structure:

  • File: <session>/evaluation/evaluation-<num>.md
  • Sections: Input summary, Scoring Matrix (ranked table), Detailed Evaluation per proposal, Final Recommendation, Action Items, Risk Summary

Phase 4: Consistency Check

Check Pass Criteria Action on Failure
Score spread max - min >= 0.5 (with >1 proposal) Re-evaluate differentiators
No perfect scores Not all 10s Adjust scores to reflect critique findings
Ranking deterministic Consistent ranking Verify calculation

Phase 5: Report to Coordinator + Shared Memory Write

See SKILL.md Shared Infrastructure -> Worker Phase 5: Report

Standard report flow: team_msg log -> SendMessage with [evaluator] prefix -> TaskUpdate completed -> Loop to Phase 1 for next task.

Shared Memory Update:

  1. Set .msg/meta.json.evaluation_scores
  2. Each entry: title, weighted_score, rank, recommendation

Error Handling

Scenario Resolution
No EVAL-* tasks Idle, wait for assignment
Synthesis files not found Notify coordinator
Only one proposal Evaluate against absolute criteria, recommend or reject
All proposals score below 5 Flag all as weak, recommend re-brainstorming
Critical issue beyond scope SendMessage error to coordinator