feat: Add orchestrator template and roles for executor and planner

- Created a new orchestrator template for Codex skill design, detailing structure and execution phases.
- Introduced the executor role with responsibilities for task execution, including routing to backends and handling implementation.
- Added the planner role for requirement breakdown, issue creation, and task dispatching, ensuring a structured planning process.
2026-03-01 15:03:57 +08:00 · 2026-02-16 00:17:15 +08:00
parent dc03862ca7
commit a4fff6a591
36 changed files with 4168 additions and 2589 deletions
--- a/.claude/skills/codex-skill-designer/specs/quality-standards.md
+++ b/.claude/skills/codex-skill-designer/specs/quality-standards.md
@@ -0,0 +1,163 @@
+# Quality Standards
+
+Quality criteria and validation gates for generated Codex skills.
+
+## Purpose
+
+| Phase | Usage |
+|-------|-------|
+| Phase 3 | Reference during generation |
+| Phase 4 | Apply during validation |
+
+---
+
+## 1. Quality Dimensions
+
+### 1.1 Structural Completeness (30%)
+
+| Check | Weight | Criteria |
+|-------|--------|----------|
+| Orchestrator exists | 5 | File present at expected path |
+| Frontmatter valid | 3 | Contains name, description |
+| Architecture diagram | 3 | ASCII flow showing spawn/wait/close |
+| Agent Registry | 4 | Table with all agents, role paths, responsibilities |
+| Phase Execution blocks | 5 | Code blocks for each phase with spawn/wait/close |
+| Lifecycle Management | 5 | Timeout handling + cleanup protocol |
+| Agent files complete | 5 | All new agent roles have complete role files |
+
+**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / sum of all weights.
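As a rough illustration of the pass/fail rule above, the sketch below scores the Structural Completeness checks. The `scoreDimension` helper and the shape of the check objects (`name`, `weight`, `passed`) are assumptions made for this example, not part of the skill designer itself.

```javascript
// Hypothetical helper: a dimension score is the earned weight divided by
// the total weight, so it always falls between 0 and 1.
function scoreDimension(checks) {
  const max = checks.reduce((sum, c) => sum + c.weight, 0)
  const earned = checks.reduce((sum, c) => sum + (c.passed ? c.weight : 0), 0)
  return max === 0 ? 0 : earned / max
}

// Weights copied from the Structural Completeness table; the pass/fail
// values are made up for the example.
const structuralChecks = [
  { name: "Orchestrator exists", weight: 5, passed: true },
  { name: "Frontmatter valid", weight: 3, passed: true },
  { name: "Architecture diagram", weight: 3, passed: false },
  { name: "Agent Registry", weight: 4, passed: true },
  { name: "Phase Execution blocks", weight: 5, passed: true },
  { name: "Lifecycle Management", weight: 5, passed: true },
  { name: "Agent files complete", weight: 5, passed: true },
]

// One failed check worth 3 points leaves 27 of 30.
const structural = scoreDimension(structuralChecks)
console.log(structural)
```

A failed check contributes nothing, so a single missing diagram costs the full 3 points rather than a partial amount.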
+
+### 1.2 Pattern Compliance (40%)
+
+| Check | Weight | Criteria |
+|-------|--------|----------|
+| Lifecycle balanced | 6 | Every spawn_agent has matching close_agent |
+| Role loading correct | 6 | MANDATORY FIRST STEPS pattern used (not inline content) |
+| Wait for results | 5 | wait() used for results (not close_agent) |
+| Batch wait for parallel | 5 | Parallel agents use wait({ ids: [...] }) |
+| Timeout specified | 4 | All wait() calls have timeout_ms |
+| Timeout handled | 4 | timed_out checked after every wait() |
+| Structured output | 5 | Agents produce Summary/Findings/Changes/Tests/Questions |
+| No Claude patterns | 5 | No Task(), TaskOutput(), resume: remaining |
+
+**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / sum of all weights.
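Two of the pattern checks above can be approximated with plain text matching. This is a minimal sketch under that assumption: `countCalls`, `checkLifecycleBalanced`, and `checkTimeoutSpecified` are hypothetical names, the sample orchestrator text is invented, and a real validator might parse the code blocks rather than rely on regular expressions.

```javascript
// Hypothetical text-level heuristic: count occurrences of `name(`.
function countCalls(content, name) {
  const matches = content.match(new RegExp(`\\b${name}\\(`, "g"))
  return matches ? matches.length : 0
}

// "Lifecycle balanced": treat equal spawn/close counts as balanced.
function checkLifecycleBalanced(content) {
  const spawned = countCalls(content, "spawn_agent")
  const closed = countCalls(content, "close_agent")
  return { name: "Lifecycle balanced", weight: 6, passed: spawned === closed, spawned, closed }
}

// "Timeout specified": every wait(...) argument list mentions timeout_ms.
// The argument text is captured only up to the first ")", so nested calls
// inside wait(...) would be truncated; this is a known limit of the sketch.
function checkTimeoutSpecified(content) {
  const calls = [...content.matchAll(/\bwait\(([^)]*)\)/g)]
  const missing = calls.filter(m => !/\btimeout_ms\b/.test(m[1])).length
  return { name: "Timeout specified", weight: 4, passed: missing === 0, missing }
}

// Invented orchestrator snippet, used only as input text for the checks.
const sample = [
  "const a = spawn_agent({ message })",
  "const first = wait({ ids: [a], timeout_ms: 300000 })",
  "const second = wait({ ids: [a] })",
].join("\n")

console.log(checkLifecycleBalanced(sample))
console.log(checkTimeoutSpecified(sample))
```

Counting calls cannot prove that each agent is closed on every code path, so a result like this is better read as a quick signal than as a proof of correctness.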
+
+### 1.3 Content Quality (30%)
+
+| Check | Weight | Criteria |
+|-------|--------|----------|
+| Orchestrator substantive | 4 | Content > 500 chars, not boilerplate |
+| Code blocks present | 3 | >= 4 code blocks with executable patterns |
+| Error handling | 3 | Timeout + recovery + partial results handling |
+| No placeholders | 4 | No `{{...}}` or `TODO` remaining in output |
+| Agent roles substantive | 4 | Each agent role > 300 chars with actionable steps |
+| Output format defined | 3 | Structured output template in each agent |
+| Goals/scope clear | 4 | Every spawn_agent has Goal + Scope + Deliverables |
+| Conversion faithful | 5 | Source content preserved (if converting) |
+
+**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / sum of all weights.
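The mechanical content checks (placeholders, code block count, minimum length) could look roughly like the sketch below. The function names and return shape are assumptions for illustration. The "not boilerplate" part of the substantive check needs human or model judgment and is not covered here.

```javascript
// Built at runtime so this example contains no literal fence marker.
const FENCE = "`".repeat(3)

// "No placeholders": flag unresolved {{...}} slots and TODO markers.
function checkNoPlaceholders(content) {
  const found = content.match(/\{\{[^}]*\}\}|\bTODO\b/g) || []
  return { name: "No placeholders", weight: 4, passed: found.length === 0, found }
}

// "Code blocks present": each fenced block has an opening and a closing
// fence line, so the block count is half the number of fence lines.
function checkCodeBlocks(content) {
  const fenceLines = content.split("\n").filter(line => line.startsWith(FENCE)).length
  const blocks = Math.floor(fenceLines / 2)
  return { name: "Code blocks present", weight: 3, passed: blocks >= 4, blocks }
}

// "Orchestrator substantive": only the length threshold is checked.
function checkSubstantive(content) {
  return { name: "Orchestrator substantive", weight: 4, passed: content.trim().length > 500 }
}

console.log(checkNoPlaceholders("Goal: {{goal}}\nTODO: fill in scope"))
```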
+
+## 2. Quality Gates
+
+| Verdict | Score | Action |
+|---------|-------|--------|
+| **PASS** | >= 80% | Deliver to target location |
+| **REVIEW** | >= 60% and < 80% | Report issues; the user decides |
+| **FAIL** | < 60% | Block delivery, list critical issues |
+
+### 2.1 Critical Failures (Auto-FAIL)
+
+These issues force FAIL regardless of overall score:
+
+1. **No orchestrator file** — skill has no entry point
+2. **Task() calls in output** — runtime incompatible with Codex
+3. **No agent registry** — agents cannot be identified
+4. **Missing close_agent** — resource leak risk
+5. **Inline role content** — violates Codex pattern (message bloat)
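The override described above, where any critical failure forces FAIL before the score is considered, can be sketched as follows. `findCriticalFailures` and `applyGate` are hypothetical helpers. Only three of the five critical failures are detected here, and the registry check assumes the registry sits under a Markdown heading named "Agent Registry".

```javascript
// Hypothetical detectors for three of the critical failures listed above.
// `orchestratorContent` is the orchestrator file text, or null if missing.
function findCriticalFailures(orchestratorContent) {
  if (orchestratorContent == null) return ["No orchestrator file"]
  const criticals = []
  if (/\bTask\(/.test(orchestratorContent)) {
    criticals.push("Task() calls in output")
  }
  if (!/^#+\s*Agent Registry\b/m.test(orchestratorContent)) {
    criticals.push("No agent registry")
  }
  return criticals
}

// Critical failures short-circuit the gate; otherwise the percentage
// thresholds from the Quality Gates table apply.
function applyGate(score, criticals) {
  if (criticals.length > 0) return "FAIL"
  return score >= 80 ? "PASS" : score >= 60 ? "REVIEW" : "FAIL"
}

const criticals = findCriticalFailures("## Agent Registry\nTask({ prompt })")
console.log(applyGate(95, criticals))
```

In this example a score of 95 still yields FAIL, because the text contains a `Task(` call.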
+
+### 2.2 Warnings (Non-blocking)
+
+1. **Missing timeout handling** — degraded reliability
+2. **No error handling section** — reduced robustness
+3. **Placeholder text remaining** — needs manual completion
+4. **Phase files missing** — acceptable for simple skills
+
+## 3. Validation Process
+
+### 3.1 Automated Checks
+
+```javascript
+function validateSkill(generatedFiles, codexSkillConfig) {
+  const checks = []
+
+  // Structural
+  checks.push(checkFileExists(generatedFiles.orchestrator))
+  checks.push(checkFrontmatter(generatedFiles.orchestrator))
+  checks.push(checkSection(generatedFiles.orchestrator, "Architecture"))
+  checks.push(checkSection(generatedFiles.orchestrator, "Agent Registry"))
+  // ...
+
+  // Pattern compliance
+  const content = Read(generatedFiles.orchestrator)
+  checks.push(checkBalancedLifecycle(content))
+  checks.push(checkRoleLoading(content))
+  checks.push(checkWaitPattern(content))
+  // ...
+
+  // Content quality
+  checks.push(checkNoPlaceholders(content))
+  checks.push(checkSubstantiveContent(content))
+  // ...
+
+  // Critical failures
+  const criticals = checkCriticalFailures(content, generatedFiles)
+  if (criticals.length > 0) return { verdict: "FAIL", criticals }
+
+  // Score: weighted percentage (0-100), compared against the gates in section 2
+  const score = calculateWeightedScore(checks)
+  const verdict = score >= 80 ? "PASS" : score >= 60 ? "REVIEW" : "FAIL"
+
+  return { score, verdict, checks, issues: checks.filter(c => !c.passed) }
+}
+```
+
+### 3.2 Manual Review Points
+
+For a REVIEW verdict, highlight these points for the user:
+
+1. Agent role completeness — are all capabilities covered?
+2. Interaction model appropriateness — right pattern for use case?
+3. Timeout values — appropriate for expected task duration?
+4. Scope definitions — clear boundaries for each agent?
+5. Output format — suitable for downstream consumers?
+
+## 4. Scoring Formula
+
+```
+Overall = Structural × 0.30 + PatternCompliance × 0.40 + ContentQuality × 0.30
+```
+
+Pattern compliance is weighted highest because Codex runtime correctness is critical.
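A worked instance of the formula, assuming each dimension score is a fraction between 0 and 1 and the overall score is reported as a percentage. The `overallScore` function and the example inputs are illustrative only.

```javascript
// Dimension weights from the formula above.
const DIMENSION_WEIGHTS = { structural: 0.30, pattern: 0.40, content: 0.30 }

// Takes dimension scores as fractions in [0, 1], returns a percentage.
function overallScore(dimensions) {
  const total = Object.keys(DIMENSION_WEIGHTS)
    .reduce((sum, key) => sum + dimensions[key] * DIMENSION_WEIGHTS[key], 0)
  // Round to two decimals so floating-point noise stays out of the result.
  return Math.round(total * 10000) / 100
}

// Made-up inputs: 27/30 structural, 30/40 pattern, 24/30 content.
// 0.9 * 0.30 + 0.75 * 0.40 + 0.8 * 0.30 = 0.27 + 0.30 + 0.24 = 0.81
const example = overallScore({ structural: 27 / 30, pattern: 30 / 40, content: 24 / 30 })
console.log(example) // 81
```

At 81% this example clears the PASS gate. Because pattern compliance carries the largest weight, losing 10 of its 40 points costs 10 percentage points overall, while losing 10 of the 30 structural points would cost the same 10.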
+
+## 5. Quality Improvement Guidance
+
+### Low Structural Score
+
+- Add missing sections to orchestrator
+- Create missing agent role files
+- Add frontmatter to all files
+
+### Low Pattern Score
+
+- Add MANDATORY FIRST STEPS to all spawn_agent messages
+- Replace inline role content with path references
+- Add close_agent for every spawn_agent
+- Add timeout_ms and timed_out handling to all wait calls
+- Remove any remaining Claude patterns
+
+### Low Content Score
+
+- Expand agent role definitions with more specific steps
+- Add concrete Goal/Scope/Deliverables to spawn messages
+- Replace placeholders with actual content
+- Add error handling for each phase