Files
Claude-Code-Workflow/.claude/commands/workflow/tdd-verify.md
catlog22 2f10305945 refactor(commands): replace SlashCommand with Skill tool invocation
Update all workflow command files to use Skill tool instead of SlashCommand:
- Change allowed-tools: SlashCommand(*) → Skill(*)
- Convert SlashCommand(command="/path", args="...") → Skill(skill="path", args="...")
- Update descriptive text references from SlashCommand to Skill
- Remove javascript language tags from code blocks (25 files)

Affected 25 command files across:
- workflow: plan, execute, init, lite-plan, lite-fix, etc.
- workflow/test: test-fix-gen, test-cycle-execute, tdd-plan, tdd-verify
- workflow/review: review-cycle-fix, review-module-cycle, review-session-cycle
- workflow/ui-design: codify-style, explore-auto, imitate-auto
- workflow/brainstorm: brainstorm-with-file, auto-parallel
- issue: discover, discover-by-prompt, plan
- ccw, ccw-debug

This aligns with the actual Skill tool interface which uses 'skill' and 'args' parameters.
2026-01-31 23:09:59 +08:00

18 KiB

name, description, argument-hint, allowed-tools
name description argument-hint allowed-tools
tdd-verify Verify TDD workflow compliance against Red-Green-Refactor cycles. Generates quality report with coverage analysis and quality gate recommendation. Orchestrates sub-commands for comprehensive validation. [optional: --session WFS-session-id] Skill(*), TodoWrite(*), Read(*), Write(*), Bash(*), Glob(*)

TDD Verification Command (/workflow:tdd-verify)

Goal

Verify TDD workflow execution quality by validating Red-Green-Refactor cycle compliance, test coverage completeness, and task chain structure integrity. This command orchestrates multiple analysis phases and generates a comprehensive compliance report with quality gate recommendation.

Output: A structured Markdown report saved to .workflow/active/WFS-{session}/TDD_COMPLIANCE_REPORT.md containing:

  • Executive summary with compliance score and quality gate recommendation
  • Task chain validation (TEST → IMPL → REFACTOR structure)
  • Test coverage metrics (line, branch, function)
  • Red-Green-Refactor cycle verification
  • Best practices adherence assessment
  • Actionable improvement recommendations

Operating Constraints

ORCHESTRATOR MODE:

  • This command coordinates multiple sub-commands (/workflow:tools:tdd-coverage-analysis, ccw cli)
  • MAY write output files: TDD_COMPLIANCE_REPORT.md (primary report), .process/*.json (intermediate artifacts)
  • MUST NOT modify source task files or implementation code
  • MUST NOT create or delete tasks in the workflow

Quality Gate Authority: The compliance report provides a binding recommendation (BLOCK_MERGE / REQUIRE_FIXES / PROCEED_WITH_CAVEATS / APPROVED) based on objective compliance criteria.

Coordinator Role

This command is a pure orchestrator: Execute 4 phases to verify TDD workflow compliance, test coverage, and Red-Green-Refactor cycle execution.

Core Responsibilities

  • Verify TDD task chain structure (TEST → IMPL → REFACTOR)
  • Analyze test coverage metrics
  • Validate TDD cycle execution quality
  • Generate compliance report with quality gate recommendation

Execution Process

Input Parsing:
   └─ Decision (session argument):
      ├─ --session provided → Use provided session
      └─ No session → Auto-detect active session

Phase 1: Session Discovery & Validation
   ├─ Detect or validate session directory
   ├─ Check required artifacts exist (.task/*.json, .summaries/*)
   └─ ERROR if invalid or incomplete

Phase 2: Task Chain Structure Validation
   ├─ Load all task JSONs from .task/
   ├─ Validate TDD structure: TEST-N.M → IMPL-N.M → REFACTOR-N.M
   ├─ Verify dependencies (depends_on)
   ├─ Validate meta fields (tdd_phase, agent)
   └─ Extract chain validation data

Phase 3: Coverage & Cycle Analysis
   ├─ Call: /workflow:tools:tdd-coverage-analysis
   ├─ Parse: test-results.json, coverage-report.json, tdd-cycle-report.md
   └─ Extract coverage metrics and TDD cycle verification

Phase 4: Compliance Report Generation
   ├─ Aggregate findings from Phases 1-3
   ├─ Calculate compliance score (0-100)
   ├─ Determine quality gate recommendation
   ├─ Generate TDD_COMPLIANCE_REPORT.md
   └─ Display summary to user

4-Phase Execution

Phase 1: Session Discovery & Validation

Step 1.1: Detect Session

IF --session parameter provided:
    session_id = provided session
ELSE:
    # Auto-detect active session
    active_sessions = bash(find .workflow/active/ -name "WFS-*" -type d 2>/dev/null)
    IF active_sessions is empty:
        ERROR: "No active workflow session found. Use --session <session-id>"
        EXIT
    ELSE IF active_sessions has multiple entries:
        # Use most recently modified session
        session_id = bash(ls -td .workflow/active/WFS-*/ 2>/dev/null | head -1 | xargs basename)
    ELSE:
        session_id = basename(active_sessions[0])

# Derive paths
session_dir = .workflow/active/WFS-{session_id}
task_dir = session_dir/.task
summaries_dir = session_dir/.summaries
process_dir = session_dir/.process

Step 1.2: Validate Required Artifacts

# Check task files exist
task_files = Glob(task_dir/*.json)
IF task_files.count == 0:
    ERROR: "No task JSON files found. Run /workflow:tdd-plan first"
    EXIT

# Check summaries exist (optional but recommended for full analysis)
summaries_exist = EXISTS(summaries_dir)
IF NOT summaries_exist:
    WARNING: "No .summaries/ directory found. Some analysis may be limited."

Output: session_id, session_dir, task_files list


Phase 2: Task Chain Structure Validation

Step 2.1: Load and Parse Task JSONs

# Single-pass JSON extraction using jq
validation_data = bash("""
# Load all tasks and extract structured data
cd '{session_dir}/.task'

# Extract all task IDs
task_ids=$(jq -r '.id' *.json 2>/dev/null | sort)

# Extract dependencies for IMPL tasks
impl_deps=$(jq -r 'select(.id | startswith("IMPL")) | .id + ":" + (.context.depends_on[]? // "none")' *.json 2>/dev/null)

# Extract dependencies for REFACTOR tasks
refactor_deps=$(jq -r 'select(.id | startswith("REFACTOR")) | .id + ":" + (.context.depends_on[]? // "none")' *.json 2>/dev/null)

# Extract meta fields
meta_tdd=$(jq -r '.id + ":" + (.meta.tdd_phase // "missing")' *.json 2>/dev/null)
meta_agent=$(jq -r '.id + ":" + (.meta.agent // "missing")' *.json 2>/dev/null)

# Output as JSON
jq -n --arg ids "$task_ids" \\
       --arg impl "$impl_deps" \\
       --arg refactor "$refactor_deps" \\
       --arg tdd "$meta_tdd" \\
       --arg agent "$meta_agent" \\
       '{ids: $ids, impl_deps: $impl, refactor_deps: $refactor, tdd: $tdd, agent: $agent}'
""")

Step 2.2: Validate TDD Chain Structure

Parse validation_data JSON and validate:

For each feature N (extracted from task IDs):
   1. TEST-N.M exists?
   2. IMPL-N.M exists?
   3. REFACTOR-N.M exists? (optional but recommended)
   4. IMPL-N.M.context.depends_on contains TEST-N.M?
   5. REFACTOR-N.M.context.depends_on contains IMPL-N.M?
   6. TEST-N.M.meta.tdd_phase == "red"?
   7. TEST-N.M.meta.agent == "@code-review-test-agent"?
   8. IMPL-N.M.meta.tdd_phase == "green"?
   9. IMPL-N.M.meta.agent == "@code-developer"?
   10. REFACTOR-N.M.meta.tdd_phase == "refactor"?

Calculate:
- chain_completeness_score = (complete_chains / total_chains) * 100
- dependency_accuracy = (correct_deps / total_deps) * 100
- meta_field_accuracy = (correct_meta / total_meta) * 100

Output: chain_validation_report (JSON structure with validation results)


Phase 3: Coverage & Cycle Analysis

Step 3.1: Call Coverage Analysis Sub-command

Skill(skill="workflow:tools:tdd-coverage-analysis", args="--session {session_id}")

Step 3.2: Parse Output Files

# Check required outputs exist
IF NOT EXISTS(process_dir/test-results.json):
    WARNING: "test-results.json not found. Coverage analysis incomplete."
    coverage_data = null
ELSE:
    coverage_data = Read(process_dir/test-results.json)

IF NOT EXISTS(process_dir/coverage-report.json):
    WARNING: "coverage-report.json not found. Coverage metrics incomplete."
    metrics = null
ELSE:
    metrics = Read(process_dir/coverage-report.json)

IF NOT EXISTS(process_dir/tdd-cycle-report.md):
    WARNING: "tdd-cycle-report.md not found. Cycle validation incomplete."
    cycle_data = null
ELSE:
    cycle_data = Read(process_dir/tdd-cycle-report.md)

Step 3.3: Extract Coverage Metrics

If coverage_data exists:
   - line_coverage_percent
   - branch_coverage_percent
   - function_coverage_percent
   - uncovered_files (list)
   - uncovered_lines (map: file -> line ranges)

If cycle_data exists:
   - red_phase_compliance (tests failed initially?)
   - green_phase_compliance (tests pass after impl?)
   - refactor_phase_compliance (tests stay green during refactor?)
   - minimal_implementation_score (was impl minimal?)

Output: coverage_analysis, cycle_analysis


Phase 4: Compliance Report Generation

Step 4.1: Calculate Compliance Score

Base Score: 100 points

Deductions:
Chain Structure:
   - Missing TEST task: -30 points per feature
   - Missing IMPL task: -30 points per feature
   - Missing REFACTOR task: -10 points per feature
   - Wrong dependency: -15 points per error
   - Wrong agent: -5 points per error
   - Wrong tdd_phase: -5 points per error

TDD Cycle Compliance:
   - Test didn't fail initially: -10 points per feature
   - Tests didn't pass after IMPL: -20 points per feature
   - Tests broke during REFACTOR: -15 points per feature
   - Over-engineered IMPL: -10 points per feature

Coverage Quality:
   - Line coverage < 80%: -5 points
   - Branch coverage < 70%: -5 points
   - Function coverage < 80%: -5 points
   - Critical paths uncovered: -10 points

Final Score: Max(0, Base Score - Total Deductions)

Step 4.2: Determine Quality Gate

IF score >= 90 AND no_critical_violations:
    recommendation = "APPROVED"
ELSE IF score >= 70 AND critical_violations == 0:
    recommendation = "PROCEED_WITH_CAVEATS"
ELSE IF score >= 50:
    recommendation = "REQUIRE_FIXES"
ELSE:
    recommendation = "BLOCK_MERGE"

Step 4.3: Generate Report

report_content = Generate markdown report (see structure below)
report_path = "{session_dir}/TDD_COMPLIANCE_REPORT.md"
Write(report_path, report_content)

Step 4.4: Display Summary to User

echo "=== TDD Verification Complete ==="
echo "Session: {session_id}"
echo "Report: {report_path}"
echo ""
echo "Quality Gate: {recommendation}"
echo "Compliance Score: {score}/100"
echo ""
echo "Chain Validation: {chain_completeness_score}%"
echo "Line Coverage: {line_coverage}%"
echo "Branch Coverage: {branch_coverage}%"
echo ""
echo "Next: Review full report for detailed findings"

TodoWrite Pattern (Optional)

Note: As an orchestrator command, TodoWrite tracking is optional and primarily useful for long-running verification processes. For most cases, the 4-phase execution is fast enough that progress tracking adds noise without value.

// Only use TodoWrite for complex multi-session verification
// Skip for single-session verification

Validation Logic

Chain Validation Algorithm

1. Load all task JSONs from .workflow/active/{sessionId}/.task/
2. Extract task IDs and group by feature number
3. For each feature:
   - Check TEST-N.M exists
   - Check IMPL-N.M exists
   - Check REFACTOR-N.M exists (optional but recommended)
   - Verify IMPL-N.M depends_on TEST-N.M
   - Verify REFACTOR-N.M depends_on IMPL-N.M
   - Verify meta.tdd_phase values
   - Verify meta.agent assignments
4. Calculate chain completeness score
5. Report incomplete or invalid chains

Quality Gate Criteria

Recommendation Score Range Critical Violations Action
APPROVED ≥90 0 Safe to merge
PROCEED_WITH_CAVEATS ≥70 0 Can proceed, address minor issues
REQUIRE_FIXES ≥50 Any Must fix before merge
BLOCK_MERGE <50 Any Block merge until resolved

Critical Violations:

  • Missing TEST or IMPL task for any feature
  • Tests didn't fail initially (Red phase violation)
  • Tests didn't pass after IMPL (Green phase violation)
  • Tests broke during REFACTOR (Refactor phase violation)

Output Files

.workflow/active/WFS-{session-id}/
├── TDD_COMPLIANCE_REPORT.md     # Comprehensive compliance report ⭐
└── .process/
    ├── test-results.json         # From tdd-coverage-analysis
    ├── coverage-report.json      # From tdd-coverage-analysis
    └── tdd-cycle-report.md       # From tdd-coverage-analysis

Error Handling

Session Discovery Errors

Error Cause Resolution
No active session No WFS-* directories Provide --session explicitly
Multiple active sessions Multiple WFS-* directories Provide --session explicitly
Session not found Invalid session-id Check available sessions

Validation Errors

Error Cause Resolution
Task files missing Incomplete planning Run /workflow:tdd-plan first
Invalid JSON Corrupted task files Regenerate tasks
Missing summaries Tasks not executed Execute tasks before verify

Analysis Errors

Error Cause Resolution
Coverage tool missing No test framework Configure testing first
Tests fail to run Code errors Fix errors before verify
Sub-command fails tdd-coverage-analysis error Check sub-command logs

Integration & Usage

Command Chain

  • Called After: /workflow:execute (when TDD tasks completed)
  • Calls: /workflow:tools:tdd-coverage-analysis
  • Related: /workflow:tdd-plan, /workflow:status

Basic Usage

# Auto-detect active session
/workflow:tdd-verify

# Specify session
/workflow:tdd-verify --session WFS-auth

When to Use

  • After completing all TDD tasks in a workflow
  • Before merging TDD workflow branch
  • For TDD process quality assessment
  • To identify missing TDD steps

TDD Compliance Report Structure

# TDD Compliance Report - {Session ID}

**Generated**: {timestamp}
**Session**: WFS-{sessionId}
**Workflow Type**: TDD

---

## Executive Summary

### Quality Gate Decision

| Metric | Value | Status |
|--------|-------|--------|
| Compliance Score | {score}/100 | {status_emoji} |
| Chain Completeness | {percentage}% | {status} |
| Line Coverage | {percentage}% | {status} |
| Branch Coverage | {percentage}% | {status} |
| Function Coverage | {percentage}% | {status} |

### Recommendation

**{RECOMMENDATION}**

**Decision Rationale**:
{brief explanation based on score and violations}

**Quality Gate Criteria**:
- **APPROVED**: Score ≥90, no critical violations
- **PROCEED_WITH_CAVEATS**: Score ≥70, no critical violations
- **REQUIRE_FIXES**: Score ≥50 or critical violations exist
- **BLOCK_MERGE**: Score <50

---

## Chain Analysis

### Feature 1: {Feature Name}
**Status**: ✅ Complete
**Chain**: TEST-1.1 → IMPL-1.1 → REFACTOR-1.1

| Phase | Task | Status | Details |
|-------|------|--------|---------|
| Red | TEST-1.1 | ✅ Pass | Test created and failed with clear message |
| Green | IMPL-1.1 | ✅ Pass | Minimal implementation made test pass |
| Refactor | REFACTOR-1.1 | ✅ Pass | Code improved, tests remained green |

### Feature 2: {Feature Name}
**Status**: ⚠️ Incomplete
**Chain**: TEST-2.1 → IMPL-2.1 (Missing REFACTOR-2.1)

| Phase | Task | Status | Details |
|-------|------|--------|---------|
| Red | TEST-2.1 | ✅ Pass | Test created and failed |
| Green | IMPL-2.1 | ⚠️ Warning | Implementation seems over-engineered |
| Refactor | REFACTOR-2.1 | ❌ Missing | Task not completed |

**Issues**:
- REFACTOR-2.1 task not completed (-10 points)
- IMPL-2.1 implementation exceeded minimal scope (-10 points)

### Chain Validation Summary

| Metric | Value |
|--------|-------|
| Total Features | {count} |
| Complete Chains | {count} ({percent}%) |
| Incomplete Chains | {count} |
| Missing TEST | {count} |
| Missing IMPL | {count} |
| Missing REFACTOR | {count} |
| Dependency Errors | {count} |
| Meta Field Errors | {count} |

---

## Test Coverage Analysis

### Coverage Metrics

| Metric | Coverage | Target | Status |
|--------|----------|--------|--------|
| Line Coverage | {percentage}% | ≥80% | {status} |
| Branch Coverage | {percentage}% | ≥70% | {status} |
| Function Coverage | {percentage}% | ≥80% | {status} |

### Coverage Gaps

| File | Lines | Issue | Priority |
|------|-------|-------|----------|
| src/auth/service.ts | 45-52 | Uncovered error handling | HIGH |
| src/utils/parser.ts | 78-85 | Uncovered edge case | MEDIUM |

---

## TDD Cycle Validation

### Red Phase (Write Failing Test)
- {N}/{total} features had failing tests initially ({percent}%)
- ✅ Compliant features: {list}
- ❌ Non-compliant features: {list}

**Violations**:
- Feature 3: No evidence of initial test failure (-10 points)

### Green Phase (Make Test Pass)
- {N}/{total} implementations made tests pass ({percent}%)
- ✅ Compliant features: {list}
- ❌ Non-compliant features: {list}

**Violations**:
- Feature 2: Implementation over-engineered (-10 points)

### Refactor Phase (Improve Quality)
- {N}/{total} features completed refactoring ({percent}%)
- ✅ Compliant features: {list}
- ❌ Non-compliant features: {list}

**Violations**:
- Feature 2, 4: Refactoring step skipped (-20 points total)

---

## Best Practices Assessment

### Strengths
- Clear test descriptions
- Good test coverage
- Consistent naming conventions
- Well-structured code

### Areas for Improvement
- Some implementations over-engineered in Green phase
- Missing refactoring steps
- Test failure messages could be more descriptive

---

## Detailed Findings by Severity

### Critical Issues ({count})
{List of critical issues with impact and remediation}

### High Priority Issues ({count})
{List of high priority issues with impact and remediation}

### Medium Priority Issues ({count})
{List of medium priority issues with impact and remediation}

### Low Priority Issues ({count})
{List of low priority issues with impact and remediation}

---

## Recommendations

### Required Fixes (Before Merge)
1. Complete missing REFACTOR tasks (Features 2, 4)
2. Verify initial test failures for Feature 3
3. Fix tests that broke during refactoring

### Recommended Improvements
1. Simplify over-engineered implementations
2. Add edge case tests for Features 1, 3
3. Improve test failure message clarity
4. Increase branch coverage to >85%

### Optional Enhancements
1. Add more descriptive test names
2. Consider parameterized tests for similar scenarios
3. Document TDD process learnings

---

## Metrics Summary

| Metric | Value |
|--------|-------|
| Total Features | {count} |
| Complete Chains | {count} ({percent}%) |
| Compliance Score | {score}/100 |
| Critical Issues | {count} |
| High Issues | {count} |
| Medium Issues | {count} |
| Low Issues | {count} |
| Line Coverage | {percent}% |
| Branch Coverage | {percent}% |
| Function Coverage | {percent}% |

---

**Report End**