Claude-Code-Workflow/.claude/skills/codex-skill-designer/specs/quality-standards.md
catlog22 a4fff6a591 feat: Add orchestrator template and roles for executor and planner
- Created a new orchestrator template for Codex skill design, detailing structure and execution phases.
- Introduced the executor role with responsibilities for task execution, including routing to backends and handling implementation.
- Added the planner role for requirement breakdown, issue creation, and task dispatching, ensuring a structured planning process.
2026-02-16 00:17:15 +08:00

# Quality Standards
Quality criteria and validation gates for generated Codex skills.
## Purpose
| Phase | Usage |
|-------|-------|
| Phase 3 | Reference during generation |
| Phase 4 | Apply during validation |
---
## 1. Quality Dimensions
### 1.1 Structural Completeness (30%)
| Check | Weight | Criteria |
|-------|--------|----------|
| Orchestrator exists | 5 | File present at expected path |
| Frontmatter valid | 3 | Contains name, description |
| Architecture diagram | 3 | ASCII flow showing spawn/wait/close |
| Agent Registry | 4 | Table with all agents, role paths, responsibilities |
| Phase Execution blocks | 5 | Code blocks for each phase with spawn/wait/close |
| Lifecycle Management | 5 | Timeout handling + cleanup protocol |
| Agent files complete | 5 | All new agent roles have complete role files |
**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / maximum possible weight (30).
### 1.2 Pattern Compliance (40%)
| Check | Weight | Criteria |
|-------|--------|----------|
| Lifecycle balanced | 6 | Every spawn_agent has matching close_agent |
| Role loading correct | 6 | MANDATORY FIRST STEPS pattern used (not inline content) |
| Wait for results | 5 | wait() used for results (not close_agent) |
| Batch wait for parallel | 5 | Parallel agents use wait({ ids: [...] }) |
| Timeout specified | 4 | All wait() calls have timeout_ms |
| Timeout handled | 4 | timed_out checked after every wait() |
| Structured output | 5 | Agents produce Summary/Findings/Changes/Tests/Questions |
| No Claude patterns | 5 | No `Task()`, `TaskOutput()`, or `resume:` left in the output |
**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / maximum possible weight (40).
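Several of the pattern checks above can be approximated with plain text matching. The following is a minimal sketch, not the skill designer's actual implementation: the `{ name, weight, passed }` result shape and the `checkTimeoutSpecified` and `checkNoClaudePatterns` helper names are assumptions made for illustration, and the regexes are heuristics that count textual call sites rather than runtime behavior.

```javascript
// Count matches of a global regex in the orchestrator text.
function countMatches(content, pattern) {
  return (content.match(pattern) || []).length
}

// "Lifecycle balanced": at least as many close_agent( call sites as spawn_agent( call sites.
function checkBalancedLifecycle(content) {
  const spawns = countMatches(content, /\bspawn_agent\s*\(/g)
  const closes = countMatches(content, /\bclose_agent\s*\(/g)
  return { name: "Lifecycle balanced", weight: 6, passed: closes >= spawns }
}

// "Timeout specified": every wait(...) call mentions timeout_ms in its arguments.
// Heuristic: arguments containing nested parentheses are not parsed correctly.
function checkTimeoutSpecified(content) {
  const calls = [...content.matchAll(/\bwait\s*\(([^)]*)\)/g)]
  const passed = calls.every(match => match[1].includes("timeout_ms"))
  return { name: "Timeout specified", weight: 4, passed }
}

// "No Claude patterns": no Task(), TaskOutput(), or resume: left in the text.
function checkNoClaudePatterns(content) {
  const found = /\bTask(Output)?\s*\(|\bresume\s*:/.test(content)
  return { name: "No Claude patterns", weight: 5, passed: !found }
}
```

Because these checks look only at text, a `wait()` mentioned in prose without arguments would count as a call missing `timeout_ms`; running them on extracted code blocks rather than the whole file would reduce such false positives.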
### 1.3 Content Quality (30%)
| Check | Weight | Criteria |
|-------|--------|----------|
| Orchestrator substantive | 4 | Content > 500 chars, not boilerplate |
| Code blocks present | 3 | >= 4 code blocks with executable patterns |
| Error handling | 3 | Timeout + recovery + partial results handling |
| No placeholders | 4 | No `{{...}}` or `TODO` remaining in output |
| Agent roles substantive | 4 | Each agent role > 300 chars with actionable steps |
| Output format defined | 3 | Structured output template in each agent |
| Goals/scope clear | 4 | Every spawn_agent has Goal + Scope + Deliverables |
| Conversion faithful | 5 | Source content preserved (if converting) |
**Scoring**: Each check either passes (full weight) or fails (0). Dimension score = sum of earned weights / maximum possible weight (30).
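The mechanical content checks in the table above (length, code block count, leftover placeholders) could look roughly like the sketch below. The result shape and the `checkCodeBlocks` name are assumptions for illustration. Only measurable thresholds are covered; whether content is "not boilerplate" or a conversion is "faithful" still needs a reviewer.

```javascript
// Three backticks, built at runtime so this example can sit inside a fenced block.
const FENCE_MARK = "`".repeat(3)

// "No placeholders": no {{...}} template markers and no TODO left in the text.
function checkNoPlaceholders(content) {
  const leftover = /\{\{[\s\S]*?\}\}|\bTODO\b/.test(content)
  return { name: "No placeholders", weight: 4, passed: !leftover }
}

// "Orchestrator substantive": only the > 500 character threshold is checked here.
function checkSubstantiveContent(content) {
  return { name: "Orchestrator substantive", weight: 4, passed: content.trim().length > 500 }
}

// "Code blocks present": count fence lines and treat each pair as one block.
function checkCodeBlocks(content) {
  const fenceLines = content.split("\n").filter(line => line.startsWith(FENCE_MARK)).length
  return { name: "Code blocks present", weight: 3, passed: Math.floor(fenceLines / 2) >= 4 }
}
```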
## 2. Quality Gates
| Verdict | Score | Action |
|---------|-------|--------|
| **PASS** | >= 80% | Deliver to target location |
| **REVIEW** | >= 60% and < 80% | Report issues, user decides |
| **FAIL** | < 60% | Block delivery, list critical issues |
### 2.1 Critical Failures (Auto-FAIL)
These issues force FAIL regardless of overall score:
1. **No orchestrator file** — skill has no entry point
2. **Task() calls in output** — runtime incompatible with Codex
3. **No agent registry** — agents cannot be identified
4. **Missing close_agent** — resource leak risk
5. **Inline role content** — violates Codex pattern (message bloat)
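A minimal sketch of how the first four critical failures could be detected follows. It assumes the orchestrator text is passed in as `content` and that `generatedFiles.orchestrator` holds the orchestrator path (empty when the file was not generated). Both assumptions mirror how the validation function in section 3.1 calls `checkCriticalFailures`, but the body here is illustrative only.

```javascript
// Returns a list of critical failure labels; an empty list means none were found.
function checkCriticalFailures(content, generatedFiles) {
  const criticals = []
  const count = pattern => (content.match(pattern) || []).length

  // 1. No orchestrator file: the skill has no entry point.
  if (!generatedFiles.orchestrator) criticals.push("No orchestrator file")

  // 2. Task() calls in output: incompatible with the Codex runtime.
  if (/\bTask\s*\(/.test(content)) criticals.push("Task() calls in output")

  // 3. No agent registry: look for a Markdown heading with that title.
  if (!/^#+\s*Agent Registry\b/m.test(content)) criticals.push("No agent registry")

  // 4. Missing close_agent: more spawn call sites than close call sites.
  if (count(/\bspawn_agent\s*\(/g) > count(/\bclose_agent\s*\(/g)) {
    criticals.push("Missing close_agent")
  }

  // 5. Inline role content is not covered: detecting it requires inspecting
  //    spawn message bodies, which this text-level sketch does not attempt.
  return criticals
}
```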
### 2.2 Warnings (Non-blocking)
1. **Missing timeout handling** — degraded reliability
2. **No error handling section** — reduced robustness
3. **Placeholder text remaining** — needs manual completion
4. **Phase files missing** — acceptable for simple skills
## 3. Validation Process
### 3.1 Automated Checks
```javascript
function validateSkill(generatedFiles, codexSkillConfig) {
  const checks = []

  // Structural
  checks.push(checkFileExists(generatedFiles.orchestrator))
  checks.push(checkFrontmatter(generatedFiles.orchestrator))
  checks.push(checkSection(generatedFiles.orchestrator, "Architecture"))
  checks.push(checkSection(generatedFiles.orchestrator, "Agent Registry"))
  // ...

  // Pattern compliance
  const content = Read(generatedFiles.orchestrator)
  checks.push(checkBalancedLifecycle(content))
  checks.push(checkRoleLoading(content))
  checks.push(checkWaitPattern(content))
  // ...

  // Content quality
  checks.push(checkNoPlaceholders(content))
  checks.push(checkSubstantiveContent(content))
  // ...

  // Critical failures
  const criticals = checkCriticalFailures(content, generatedFiles)
  if (criticals.length > 0) return { verdict: "FAIL", criticals }

  // Score
  const score = calculateWeightedScore(checks)
  const verdict = score >= 80 ? "PASS" : score >= 60 ? "REVIEW" : "FAIL"
  return { score, verdict, checks, issues: checks.filter(c => !c.passed) }
}
```
### 3.2 Manual Review Points
For REVIEW verdict, highlight these for user attention:
1. Agent role completeness — are all capabilities covered?
2. Interaction model appropriateness — right pattern for use case?
3. Timeout values — appropriate for expected task duration?
4. Scope definitions — clear boundaries for each agent?
5. Output format — suitable for downstream consumers?
## 4. Scoring Formula
```
Overall = Structural × 0.30 + PatternCompliance × 0.40 + ContentQuality × 0.30
```
Pattern compliance is weighted highest because Codex runtime correctness is critical.
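The formula can be sketched in code as follows. This is one possible shape for `calculateWeightedScore`, assuming each check result carries a `dimension` tag in addition to `weight` and `passed`; that tag is a hypothetical field, not something this spec defines.

```javascript
// Share of the overall score and maximum check weight per dimension,
// taken from the tables in section 1.
const DIMENSIONS = {
  structural: { share: 0.30, max: 30 },
  pattern: { share: 0.40, max: 40 },
  content: { share: 0.30, max: 30 },
}

// Returns the overall score as a percentage, rounded to one decimal place.
function calculateWeightedScore(checks) {
  let overall = 0
  for (const [dimension, { share, max }] of Object.entries(DIMENSIONS)) {
    const earned = checks
      .filter(c => c.dimension === dimension && c.passed)
      .reduce((sum, c) => sum + c.weight, 0)
    overall += (earned / max) * 100 * share
  }
  return Math.round(overall * 10) / 10
}
```

Because each dimension's maximum weight equals its share of 100, the overall percentage is simply the sum of earned weights. For example, a skill that fails only Architecture diagram (3), Timeout handled (4), and Conversion faithful (5) earns 27 + 36 + 25 = 88, which is a PASS.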
## 5. Quality Improvement Guidance
### Low Structural Score
- Add missing sections to orchestrator
- Create missing agent role files
- Add frontmatter to all files
### Low Pattern Score
- Add MANDATORY FIRST STEPS to all spawn_agent messages
- Replace inline role content with path references
- Add close_agent for every spawn_agent
- Add timeout_ms and timed_out handling to all wait calls
- Remove any remaining Claude patterns
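The fixes above combine into one phase shape: spawn with a role path reference, batch wait with a timeout, check `timed_out`, and always close. The sketch below uses local stand-ins for `spawn_agent`, `wait`, and `close_agent` so that it runs on its own. Their argument and return shapes, the message layout, and the role paths are assumptions inferred from the criteria in this document, not the Codex runtime's documented API.

```javascript
// Local stand-ins (hypothetical shapes); the real tools come from the Codex runtime.
const openAgents = new Set()
let nextAgentNumber = 1

function spawn_agent({ message }) {
  const id = `agent-${nextAgentNumber++}`
  openAgents.add(id)
  return id
}

function wait({ ids, timeout_ms }) {
  // Stub: pretend every agent finished within timeout_ms.
  return { timed_out: false, results: ids.map(id => ({ id, output: "Summary: done" })) }
}

function close_agent({ id }) {
  openAgents.delete(id)
}

// The role is referenced by path instead of pasting its content into the message.
function buildMessage({ rolePath, goal, scope, deliverables }) {
  return [
    "## MANDATORY FIRST STEPS",
    `1. Read role definition: ${rolePath}`,
    "",
    `Goal: ${goal}`,
    `Scope: ${scope}`,
    `Deliverables: ${deliverables}`,
  ].join("\n")
}

// Spawn all agents, batch-wait with a timeout, and close every agent on every path.
function runParallelPhase(tasks, timeoutMs = 600000) {
  const ids = tasks.map(task => spawn_agent({ message: buildMessage(task) }))
  try {
    const outcome = wait({ ids, timeout_ms: timeoutMs })
    if (outcome.timed_out) {
      // Keep whatever finished so the caller can decide how to recover.
      return { ok: false, partial: outcome.results }
    }
    return { ok: true, results: outcome.results }
  } finally {
    ids.forEach(id => close_agent({ id }))
  }
}
```

Placing `close_agent` in a `finally` block is one way to keep the lifecycle balanced even when the wait times out or throws.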
### Low Content Score
- Expand agent role definitions with more specific steps
- Add concrete Goal/Scope/Deliverables to spawn messages
- Replace placeholders with actual content
- Add error handling for each phase