Claude-Code-Workflow/.claude/skills/team-brainstorm/role-specs/evaluator.md
catlog22 bbdd1840de Add document standards, quality gates, and templates for team lifecycle phases
- Introduced `document-standards.md` to define YAML frontmatter schema, naming conventions, and content structure for spec-generator outputs.
- Created `quality-gates.md` outlining per-phase quality gate criteria and scoring dimensions for spec-generator outputs.
- Added templates for architecture documents, epics and stories, product briefs, and requirements PRD to streamline documentation in respective phases.
2026-03-04 23:54:20 +08:00


prefix: EVAL
inner_loop: false
delegates_to: []
message_types:
  success: evaluation_ready
  error: error

Evaluator

Scoring, ranking, and final selection. Multi-dimension evaluation of synthesized proposals with weighted scoring and priority recommendations.

Phase 2: Context Loading

| Input | Source | Required |
| --- | --- | --- |
| Session folder | Task description (Session: line) | Yes |
| Synthesis results | /synthesis/*.md files | Yes |
| All ideas | /ideas/*.md files | No (for context) |
| All critiques | /critiques/*.md files | No (for context) |

  1. Extract session path from task description (match "Session: ")
  2. Glob synthesis files from /synthesis/
  3. Read all synthesis files for evaluation
  4. Optionally read ideas and critiques for full context
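
The loading steps above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names `extract_session_path` and `load_context` are hypothetical, and it assumes the task description carries the session folder on a line of the form `Session: <path>` and that the synthesis, ideas, and critiques folders sit directly under that session folder.

```python
import re
from pathlib import Path

# Assumption: the session folder appears on its own line as "Session: <path>".
SESSION_RE = re.compile(r"^[ \t]*Session:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def extract_session_path(task_description: str) -> Path:
    """Return the session folder named on the 'Session:' line."""
    match = SESSION_RE.search(task_description)
    if match is None:
        raise ValueError("task description has no 'Session:' line")
    return Path(match.group(1))


def load_context(session: Path) -> dict[str, dict[str, str]]:
    """Read synthesis files (required) plus ideas and critiques (optional)."""
    context: dict[str, dict[str, str]] = {}
    for folder in ("synthesis", "ideas", "critiques"):
        # A missing folder simply yields no files.
        files = sorted((session / folder).glob("*.md"))
        context[folder] = {f.name: f.read_text(encoding="utf-8") for f in files}
    if not context["synthesis"]:
        # Synthesis results are marked as required in the input table.
        raise FileNotFoundError(f"no synthesis files under {session / 'synthesis'}")
    return context
```

Failing fast when no synthesis file is found mirrors the "Required: Yes" column; ideas and critiques are allowed to be empty because they only provide context.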

Phase 3: Evaluation and Scoring

Scoring Dimensions:

| Dimension | Weight | Focus |
| --- | --- | --- |
| Feasibility | 30% | Technical feasibility, resource needs, timeline |
| Innovation | 25% | Novelty, differentiation, breakthrough potential |
| Impact | 25% | Scope of impact, value creation, problem resolution |
| Cost Efficiency | 20% | Implementation cost, risk cost, opportunity cost |

Weighted Score: (Feasibility * 0.30) + (Innovation * 0.25) + (Impact * 0.25) + (Cost Efficiency * 0.20)
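
As a worked example of the formula, the sketch below computes the weighted score for one proposal. The dictionary keys and the function name `weighted_score` are illustrative choices, not names defined by this spec; the weights and the 1-10 scale come from this document. Rounding to two decimals is an assumption made for readable output.

```python
# Weights taken from the Scoring Dimensions table; key names are hypothetical.
WEIGHTS = {
    "feasibility": 0.30,
    "innovation": 0.25,
    "impact": 0.25,
    "cost_efficiency": 0.20,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (each 1-10) into one weighted score."""
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 10:
            raise ValueError(f"{name} score must be between 1 and 10")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 2)  # rounding is an assumption, not part of the spec


example = {"feasibility": 8, "innovation": 6, "impact": 7, "cost_efficiency": 5}
# 8*0.30 + 6*0.25 + 7*0.25 + 5*0.20 = 2.40 + 1.50 + 1.75 + 1.00 = 6.65
print(weighted_score(example))
```

Because the weights sum to 1.0, the weighted score stays on the same 1-10 scale as the individual dimensions.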

Per-Proposal Evaluation:

  • Score each dimension (1-10) with rationale
  • Overall recommendation: Strong Recommend / Recommend / Consider / Pass

Output: Write to <session>/evaluation/evaluation-<num>.md

  • Sections: Input summary, Scoring Matrix (ranked table), Detailed Evaluation per proposal, Final Recommendation, Action Items, Risk Summary
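
One way to produce the ranked Scoring Matrix is sketched below. The entry fields match the ones this spec lists for shared state (title, weighted_score, rank, recommendation); the function names, the column layout, and the tie-break on title are assumptions added for illustration.

```python
def rank_proposals(entries: list[dict]) -> list[dict]:
    """Order proposals by weighted score, highest first, and assign ranks."""
    # Tie-break on title (an assumption) so equal scores always rank the same way.
    ordered = sorted(entries, key=lambda e: (-e["weighted_score"], e["title"]))
    return [{**entry, "rank": position} for position, entry in enumerate(ordered, start=1)]


def scoring_matrix(ranked: list[dict]) -> str:
    """Render ranked proposals as a Markdown table for the evaluation file."""
    lines = [
        "| Rank | Proposal | Weighted Score | Recommendation |",
        "| --- | --- | --- | --- |",
    ]
    for entry in ranked:
        lines.append(
            f"| {entry['rank']} | {entry['title']} "
            f"| {entry['weighted_score']:.2f} | {entry['recommendation']} |"
        )
    return "\n".join(lines)
```

Sorting on a fixed key rather than relying on input order keeps the ranking reproducible across runs.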

Phase 4: Consistency Check

| Check | Pass Criteria | Action on Failure |
| --- | --- | --- |
| Score spread | max - min >= 0.5 (with >1 proposal) | Re-evaluate differentiators |
| No perfect scores | Not all 10s | Adjust to reflect critique findings |
| Ranking deterministic | Consistent ranking | Verify calculation |
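
The three checks can be expressed as a small validation pass, sketched here. It is one reading of the table rather than a definitive implementation: "Not all 10s" is interpreted as "no single proposal scores 10 on every dimension", and "Consistent ranking" as "a better rank never has a lower weighted score". The function name and the entry layout (`title`, `scores`, `weighted_score`, `rank`) are hypothetical.

```python
def consistency_issues(scored: list[dict]) -> list[str]:
    """Return a message for each failed consistency check (empty list = pass)."""
    issues: list[str] = []

    # Score spread: only meaningful when more than one proposal exists.
    totals = [p["weighted_score"] for p in scored]
    if len(totals) > 1 and max(totals) - min(totals) < 0.5:
        issues.append("score spread below 0.5: re-evaluate differentiators")

    # No perfect scores (assumed to mean: not every dimension is 10).
    for p in scored:
        if all(value == 10 for value in p["scores"].values()):
            issues.append(f"{p['title']}: all 10s, adjust to reflect critique findings")

    # Ranking consistency: scores must not increase as rank gets worse.
    by_rank = sorted(scored, key=lambda p: p["rank"])
    for better, worse in zip(by_rank, by_rank[1:]):
        if better["weighted_score"] < worse["weighted_score"]:
            issues.append("ranking inconsistent with scores: verify calculation")
            break

    return issues
```

Returning messages instead of raising on the first failure lets the evaluator apply every corrective action in one pass before rechecking.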

After passing checks, update shared state:

  • Set .msg/meta.json evaluation_scores
  • Each entry: title, weighted_score, rank, recommendation
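
A minimal sketch of the shared-state update follows. The spec names only the file `.msg/meta.json`, the key `evaluation_scores`, and the four entry fields; everything else here is an assumption, including the function name, the caller supplying the full path to `meta.json`, and the choice to preserve any other keys already in the file.

```python
import json
from pathlib import Path

ENTRY_FIELDS = ("title", "weighted_score", "rank", "recommendation")


def update_evaluation_scores(meta_path: Path, ranked: list[dict]) -> dict:
    """Set evaluation_scores in meta.json, keeping any other existing keys."""
    if meta_path.exists():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    else:
        meta = {}  # assumption: start from an empty object if the file is missing
    # Keep only the four fields the spec lists for each entry.
    meta["evaluation_scores"] = [
        {field: entry[field] for field in ENTRY_FIELDS} for entry in ranked
    ]
    meta_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return meta
```

Filtering each entry down to the listed fields keeps per-dimension detail in the evaluation file and out of the shared state.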