Claude-Code-Workflow/.codex/skills/team-brainstorm/roles/evaluator/role.md

| role | prefix | inner_loop | message_types |
|------|--------|------------|---------------|
| evaluator | EVAL | false | state_update |

# Evaluator

Scoring, ranking, and final selection. Multi-dimensional evaluation of synthesized proposals with weighted scoring and priority recommendations.

## Phase 2: Context Loading

| Input | Source | Required |
|-------|--------|----------|
| Session folder | Task description (`Session:` line) | Yes |
| Synthesis results | `/synthesis/*.md` files | Yes |
| All ideas | `/ideas/*.md` files | No (for context) |
| All critiques | `/critiques/*.md` files | No (for context) |
  1. Extract session path from task description (match "Session: ")
  2. Glob synthesis files from /synthesis/
  3. Read all synthesis files for evaluation
  4. Optionally read ideas and critiques for full context
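The loading steps above can be sketched as follows. This is a minimal sketch, not the skill's actual implementation: the `load_context` function name, the returned dict shape, and the exact `Session:` regex are assumptions based on the table and steps above.

```python
import glob
import os
import re


def load_context(task_description: str) -> dict:
    """Sketch of Phase 2: extract the session path, then read inputs."""
    # 1. Extract session path from the task description ("Session: <path>")
    match = re.search(r"Session:\s*(\S+)", task_description)
    if not match:
        raise ValueError("No 'Session:' line found in task description")
    session = match.group(1)

    def read_all(subdir: str) -> dict:
        # 2./3./4. Glob every .md file under the session subfolder and read it
        paths = sorted(glob.glob(os.path.join(session, subdir, "*.md")))
        return {os.path.basename(p): open(p, encoding="utf-8").read() for p in paths}

    return {
        "session": session,
        "synthesis": read_all("synthesis"),  # required input
        "ideas": read_all("ideas"),          # optional, for context
        "critiques": read_all("critiques"),  # optional, for context
    }
```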

## Phase 3: Evaluation and Scoring

Scoring Dimensions:

| Dimension | Weight | Focus |
|-----------|--------|-------|
| Feasibility | 30% | Technical feasibility, resource needs, timeline |
| Innovation | 25% | Novelty, differentiation, breakthrough potential |
| Impact | 25% | Scope of impact, value creation, problem resolution |
| Cost Efficiency | 20% | Implementation cost, risk cost, opportunity cost |

Weighted Score = (Feasibility * 0.30) + (Innovation * 0.25) + (Impact * 0.25) + (Cost Efficiency * 0.20)
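As a concrete check of the formula, a minimal sketch (the `weighted_score` function and the short dimension keys are assumptions; the weights and 1-10 scale come from the table above):

```python
# Weights from the scoring-dimensions table; "cost" stands for Cost Efficiency
WEIGHTS = {"feasibility": 0.30, "innovation": 0.25, "impact": 0.25, "cost": 0.20}


def weighted_score(scores: dict) -> float:
    """Combine 1-10 dimension scores using the fixed weights."""
    assert set(scores) == set(WEIGHTS), "all four dimensions must be scored"
    return round(sum(scores[dim] * w for dim, w in WEIGHTS.items()), 2)


# Example: feasibility 8, innovation 6, impact 7, cost efficiency 9
# -> 8*0.30 + 6*0.25 + 7*0.25 + 9*0.20 = 2.4 + 1.5 + 1.75 + 1.8 = 7.45
```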

Per-Proposal Evaluation:

  • Score each dimension (1-10) with rationale
  • Overall recommendation: Strong Recommend / Recommend / Consider / Pass

Output: Write to `<session>/evaluation/evaluation-<num>.md`

  • Sections: Input summary, Scoring Matrix (ranked table), Detailed Evaluation per proposal, Final Recommendation, Action Items, Risk Summary

## Phase 4: Consistency Check

| Check | Pass Criteria | Action on Failure |
|-------|---------------|-------------------|
| Score spread | max - min >= 0.5 (with >1 proposal) | Re-evaluate differentiators |
| No perfect scores | Not all 10s | Adjust to reflect critique findings |
| Ranking deterministic | Consistent ranking | Verify calculation |
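These checks can be sketched as below. The `consistency_check` function and the per-proposal dict shape (`title`, `weighted_score`, `rank`, `dimension_scores`) are assumptions; the corrective actions themselves stay with the evaluator.

```python
def consistency_check(evaluations: list) -> list:
    """Return failed-check messages; an empty list means all checks pass."""
    failures = []
    scores = [e["weighted_score"] for e in evaluations]

    # Score spread: with more than one proposal, max - min must be >= 0.5
    if len(scores) > 1 and max(scores) - min(scores) < 0.5:
        failures.append("score spread < 0.5: re-evaluate differentiators")

    # No perfect scores: not every dimension of every proposal may be a 10
    if all(s == 10 for e in evaluations for s in e["dimension_scores"].values()):
        failures.append("all scores are 10: adjust to reflect critique findings")

    # Ranking deterministic: re-sorting by score must reproduce the stored ranks
    by_score = sorted(evaluations, key=lambda e: -e["weighted_score"])
    by_rank = sorted(evaluations, key=lambda e: e["rank"])
    if [e["title"] for e in by_score] != [e["title"] for e in by_rank]:
        failures.append("ranking inconsistent with scores: verify calculation")

    return failures
```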

After passing checks, update shared state:

  • Set `evaluation_scores` in `.msg/meta.json`
  • Each entry: `title`, `weighted_score`, `rank`, `recommendation`
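A read-modify-write sketch of that state update, assuming `.msg/meta.json` is plain JSON (the `update_meta` helper is hypothetical; the `evaluation_scores` key and entry fields are taken from the bullets above):

```python
import json
import os


def update_meta(meta_path: str, evaluations: list) -> None:
    """Merge evaluation_scores into the shared meta.json without clobbering it."""
    meta = {}
    if os.path.exists(meta_path):
        with open(meta_path, encoding="utf-8") as f:
            meta = json.load(f)

    # One entry per proposal: title, weighted_score, rank, recommendation
    meta["evaluation_scores"] = [
        {
            "title": e["title"],
            "weighted_score": e["weighted_score"],
            "rank": e["rank"],
            "recommendation": e["recommendation"],
        }
        for e in sorted(evaluations, key=lambda e: e["rank"])
    ]

    with open(meta_path, "w", encoding="utf-8") as f:
        json.dump(meta, f, indent=2)
```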