mirror of https://github.com/catlog22/Claude-Code-Workflow.git synced 2026-02-12 02:37:45 +08:00

catlog22 a795538182 refactor(test-workflow): implement multi-layered testing strategy with quality gates

Introduce comprehensive test quality assurance framework to prevent "hollow tests"
from masking real issues. Optimize JSON data structures following minimal-but-sufficient principle.

Major Changes:
- Multi-layered test strategy (L0: Static, L1: Unit, L2: Integration, L3: E2E)
- New quality gate task (IMPL-001.5-review) validates tests before fix cycle
- Layer-aware failure diagnosis with test_type field support
- JSON simplification: removed redundant failure_context (~44% size reduction)

File Changes:
- new: cli-planning-agent.md - CLI analysis executor with layer-specific guidance
- mod: test-fix-gen.md - multi-layered test planning and quality gate generation
- mod: test-fix-agent.md - layer-aware test execution and failure classification
- mod: test-cycle-execute.md - 95% pass rate threshold with criticality assessment

Technical Details:
- test_type field tracks test layer (static/unit/integration/e2e)
- IMPL-fix-N.json simplified: removed 350 lines of redundant data
- Single source of truth: iteration-N-analysis.md contains full context
- Quality config: ~/.claude/workflows/test-quality-config.json (not in repo)

Benefits:
- Prevents symptom-level fixes through layer-specific diagnosis
- Ensures test quality with static analysis and coverage validation
- Reduces JSON file size by 44% while maintaining information completeness
- Enforces comprehensive test coverage (happy path + negative + edge cases)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-10 15:34:17 +08:00

name	description	argument-hint	allowed-tools
test-fix-gen	Create test-fix workflow session from session ID, description, or file path with test strategy generation and task planning	[--use-codex] [--cli-execute] (source-session-id | "feature description" | /path/to/file.md)	SlashCommand(), TodoWrite(), Read(), Bash()

Workflow Test-Fix Generation Command (/workflow:test-fix-gen)

Overview

What It Does

This command creates an independent test-fix workflow session for existing code. It orchestrates a 5-phase process to analyze implementation, generate test requirements, and create executable test generation and fix tasks.

CRITICAL - Command Scope:

This command ONLY generates task JSON files (IMPL-001.json, IMPL-001.5-review.json, IMPL-002.json)
Does NOT execute tests or apply fixes - all execution happens in a separate orchestrator
Must call /workflow:test-cycle-execute after this command to actually run tests and fixes
Test failure handling happens in test-cycle-execute, not here

Dual-Mode Support

Automatic mode detection based on input pattern:

Mode	Input Pattern	Context Source	Use Case
Session Mode	`WFS-xxx`	Source session summaries	Test validation for completed workflow
Prompt Mode	Text or file path	Direct codebase analysis	Test generation from description

Detection Logic:

if [[ "$input" == WFS-* ]]; then
  MODE="session"  # Use test-context-gather
else
  MODE="prompt"   # Use context-gather
fi

Core Principles

Dual Input Support: Accepts session ID (WFS-xxx) or feature description/file path
Session Isolation: Creates independent WFS-test-[slug] session
Context-First: Gathers implementation context via appropriate method
Format Reuse: Creates standard IMPL-*.json tasks (meta.type: "test-gen", "test-quality-review", or "test-fix")
Manual First: Defaults to manual fixes; use --use-codex for automation
Automatic Detection: Input pattern determines execution mode

Coordinator Role

This command is a pure planning coordinator:

Does NOT analyze code directly
Does NOT generate tests or documentation
Does NOT execute tests or apply fixes
Does NOT handle test failures or iterations
ONLY coordinates slash commands to generate task JSON files
Parses outputs to pass data between phases
Creates independent test workflow session
All execution delegated to /workflow:test-cycle-execute

Usage

Command Syntax

# Basic syntax
/workflow:test-fix-gen [FLAGS] <INPUT>

# Flags (optional)
--use-codex        # Enable Codex automated fixes in IMPL-002
--cli-execute      # Enable CLI execution in IMPL-001

# Input
<INPUT>            # Session ID, description, or file path

Usage Examples

Session Mode

# Test validation for completed implementation
/workflow:test-fix-gen WFS-user-auth-v2

# With automated fixes
/workflow:test-fix-gen --use-codex WFS-api-endpoints

# With CLI execution
/workflow:test-fix-gen --cli-execute --use-codex WFS-payment-flow

Prompt Mode - Text Description

# Generate tests from feature description
/workflow:test-fix-gen "Test the user authentication API endpoints in src/auth/api.ts"

# With automated fixes
/workflow:test-fix-gen --use-codex "Test user registration and login flows"

Prompt Mode - File Reference

# Generate tests from requirements file
/workflow:test-fix-gen ./docs/api-requirements.md

# With flags
/workflow:test-fix-gen --use-codex --cli-execute ./specs/feature.md

Mode Comparison

Aspect	Session Mode	Prompt Mode
Phase 1	Create `WFS-test-[source]` with `source_session_id`	Create `WFS-test-[slug]` without `source_session_id`
Phase 2	`/workflow:tools:test-context-gather`	`/workflow:tools:context-gather`
Phase 3-5	Identical	Identical
Context	Source session summaries + artifacts	Direct codebase analysis

Execution Flow

Core Execution Rules

Start Immediately: First action is TodoWrite, second is Phase 1 session creation
No Preliminary Analysis: Do not read files before Phase 1
Parse Every Output: Extract required data from each phase for next phase
Sequential Execution: Each phase depends on previous phase's output
Complete All Phases: Do not return until Phase 5 completes
Track Progress: Update TodoWrite after every phase
Automatic Detection: Mode auto-detected from input pattern
Parse Flags: Extract --use-codex and --cli-execute flags for Phase 4
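
The flag-extraction rule above can be sketched as follows. This is a minimal illustration, not the command's actual parser: `ParsedArgs` and `parse_arguments` are hypothetical names, and treating everything that is not a known flag as the input is an assumption.

```python
import shlex
from dataclasses import dataclass

KNOWN_FLAGS = {"--use-codex", "--cli-execute"}


@dataclass
class ParsedArgs:
    use_codex: bool
    cli_execute: bool
    user_input: str


def parse_arguments(raw: str) -> ParsedArgs:
    """Split the raw argument string into the two optional flags and the input."""
    tokens = shlex.split(raw)  # honours quoting, so a quoted description stays one token
    flags = {token for token in tokens if token in KNOWN_FLAGS}
    rest = [token for token in tokens if token not in KNOWN_FLAGS]
    if not rest:
        raise ValueError("missing input: session ID, description, or file path")
    return ParsedArgs(
        use_codex="--use-codex" in flags,
        cli_execute="--cli-execute" in flags,
        user_input=" ".join(rest),
    )
```

The two booleans are what Phase 4 needs; the remaining text feeds mode detection and Phase 1.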

5-Phase Execution

Phase 1: Create Test Session

Command:

Session Mode: SlashCommand("/workflow:session:start --new \"Test validation for [sourceSessionId]\"")
Prompt Mode: SlashCommand("/workflow:session:start --new \"Test generation for: [description]\"")

Input: User argument (session ID, description, or file path)

Expected Behavior:

Creates new session: WFS-test-[slug]
Writes workflow-session.json metadata:
- Session Mode: Includes workflow_type: "test_session", source_session_id: "[sourceId]"
- Prompt Mode: Includes workflow_type: "test_session" only
Returns new session ID

Parse Output:

Extract: testSessionId (pattern: WFS-test-[slug])
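
One way to pull the session ID out of the Phase 1 output is a regular expression. The sketch below assumes slugs consist of lowercase letters, digits, and hyphens; the source only gives the pattern WFS-test-[slug], so that character set and the function name are assumptions.

```python
import re

# Assumption: slugs are lowercase alphanumeric words joined by hyphens.
SESSION_ID_RE = re.compile(r"\bWFS-test-[a-z0-9]+(?:-[a-z0-9]+)*\b")


def extract_test_session_id(output: str) -> str:
    """Return the first WFS-test-[slug] identifier found in the command output."""
    match = SESSION_ID_RE.search(output)
    if match is None:
        raise ValueError("no WFS-test-[slug] session ID found in output")
    return match.group(0)
```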

Validation:

Session Mode: Source session exists with completed IMPL tasks
Both Modes: New test session directory created with metadata

TodoWrite: Mark phase 1 completed, phase 2 in_progress

Phase 2: Gather Test Context

Command:

Session Mode: SlashCommand("/workflow:tools:test-context-gather --session [testSessionId]")
Prompt Mode: SlashCommand("/workflow:tools:context-gather --session [testSessionId] \"[task_description]\"")

Input: testSessionId from Phase 1

Expected Behavior:

Session Mode:
- Load source session implementation context and summaries
- Analyze test coverage using MCP tools
- Identify files requiring tests
Prompt Mode:
- Analyze codebase based on description
- Identify relevant files and dependencies
Detect test framework and conventions
Generate context package JSON

Parse Output:

Extract: contextPath (pattern: .workflow/[testSessionId]/.process/[test-]context-package.json)
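
The path pattern above can be expanded per mode. In the sketch below, the test- prefix is used for Session Mode (matching test-context-package.json in the data flow section); the unprefixed name for Prompt Mode is an assumption read from the optional [test-] part, and `expected_context_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def expected_context_path(test_session_id: str, mode: str) -> PurePosixPath:
    """Build the context package path that Phase 2 is expected to report."""
    if mode not in ("session", "prompt"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: only Session Mode uses the "test-" filename prefix.
    filename = "test-context-package.json" if mode == "session" else "context-package.json"
    return PurePosixPath(".workflow") / test_session_id / ".process" / filename
```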

Validation:

Context package created with coverage analysis
Test framework detected
Test conventions documented

TodoWrite: Mark phase 2 completed, phase 3 in_progress

Phase 3: Test Generation Analysis

Command: SlashCommand("/workflow:tools:test-concept-enhanced --session [testSessionId] --context [contextPath]")

Input:

testSessionId from Phase 1
contextPath from Phase 2

Expected Behavior:

Use Gemini to analyze coverage gaps and implementation
Study existing test patterns and conventions
Generate multi-layered test requirements (L0: Static Analysis, L1: Unit, L2: Integration, L3: E2E)
Design test generation strategy with quality assurance criteria
Generate TEST_ANALYSIS_RESULTS.md with structured test layers

Enhanced Test Requirements: For each targeted file/function, Gemini MUST generate:

L0: Static Analysis Requirements:
- Linting rules to enforce (ESLint, Prettier)
- Type checking requirements (TypeScript)
- Anti-pattern detection rules
L1: Unit Test Requirements:
- Happy path scenarios (valid inputs → expected outputs)
- Negative path scenarios (invalid inputs → error handling)
- Edge cases (null, undefined, 0, empty strings/arrays)
L2: Integration Test Requirements:
- Successful component interactions
- Failure handling scenarios (service unavailable, timeout)
L3: E2E Test Requirements (if applicable):
- Key user journeys from start to finish

Parse Output:

Verify .workflow/[testSessionId]/.process/TEST_ANALYSIS_RESULTS.md created

Validation:

TEST_ANALYSIS_RESULTS.md exists with complete sections:
- Coverage Assessment
- Test Framework & Conventions
- Multi-Layered Test Plan (NEW):
  - L0: Static Analysis Plan
  - L1: Unit Test Plan
  - L2: Integration Test Plan
  - L3: E2E Test Plan (if applicable)
- Test Requirements by File (with layer annotations)
- Test Generation Strategy
- Implementation Targets
- Quality Assurance Criteria (NEW):
  - Minimum coverage thresholds
  - Required test types per function
  - Acceptance criteria for test quality
- Success Criteria
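
The section check above could be approximated as shown below. This sketch assumes each required section appears as a markdown heading whose text contains the section name; the exact heading format of TEST_ANALYSIS_RESULTS.md is not specified here, and `missing_sections` is a hypothetical helper.

```python
REQUIRED_SECTIONS = [
    "Coverage Assessment",
    "Test Framework & Conventions",
    "Multi-Layered Test Plan",
    "Test Requirements by File",
    "Test Generation Strategy",
    "Implementation Targets",
    "Quality Assurance Criteria",
    "Success Criteria",
]


def missing_sections(markdown_text: str) -> list[str]:
    """Return the required section names that no heading line mentions."""
    headings = [
        line.lstrip("#").strip().lower()
        for line in markdown_text.splitlines()
        if line.startswith("#")
    ]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(section.lower() in heading for heading in headings)
    ]
```

An empty result means every top-level section is present; the L0-L3 sub-plans would need a similar check of their own.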

TodoWrite: Mark phase 3 completed, phase 4 in_progress

Phase 4: Generate Test Tasks

Command: SlashCommand("/workflow:tools:test-task-generate [--use-codex] [--cli-execute] --session [testSessionId]")

Input:

testSessionId from Phase 1
--use-codex flag (if present) - Controls IMPL-002 fix mode
--cli-execute flag (if present) - Controls IMPL-001 generation mode

Expected Behavior:

Parse TEST_ANALYSIS_RESULTS.md from Phase 3 (multi-layered test plan)
Generate a minimum of 3 task JSON files (expandable based on complexity):
- IMPL-001.json: Test Understanding & Generation (@code-developer)
- IMPL-001.5-review.json: Test Quality Gate (@test-fix-agent) ← NEW
- IMPL-002.json: Test Execution & Fix Cycle (@test-fix-agent)
- IMPL-003+: Additional tasks if needed for complex projects
Generate IMPL_PLAN.md with multi-layered test strategy
Generate TODO_LIST.md with task checklist

Parse Output:

Verify .workflow/[testSessionId]/.task/IMPL-001.json exists
Verify .workflow/[testSessionId]/.task/IMPL-001.5-review.json exists ← NEW
Verify .workflow/[testSessionId]/.task/IMPL-002.json exists
Verify additional .task/IMPL-*.json if applicable
Verify IMPL_PLAN.md and TODO_LIST.md created
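
The verification list above amounts to a file-existence check inside the session directory. A minimal sketch, assuming IMPL_PLAN.md and TODO_LIST.md sit at the session root as shown in the review artifacts; `missing_artifacts` is a hypothetical helper.

```python
from pathlib import Path

REQUIRED_ARTIFACTS = [
    ".task/IMPL-001.json",
    ".task/IMPL-001.5-review.json",
    ".task/IMPL-002.json",
    "IMPL_PLAN.md",
    "TODO_LIST.md",
]


def missing_artifacts(session_dir: Path) -> list[str]:
    """Return the required Phase 4 outputs that do not exist under session_dir."""
    return [rel for rel in REQUIRED_ARTIFACTS if not (session_dir / rel).is_file()]
```

Here session_dir would be .workflow/[testSessionId]; an empty list means Phase 4 produced everything it should.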

TodoWrite: Mark phase 4 completed, phase 5 in_progress

Phase 5: Return Summary

Return to User:

Independent test-fix workflow created successfully!

Input: [original input]
Mode: [Session|Prompt]
Test Session: [testSessionId]

Tasks Created:
- IMPL-001: Test Understanding & Generation (@code-developer)
- IMPL-001.5: Test Quality Gate - Static Analysis & Coverage (@test-fix-agent) ← NEW
- IMPL-002: Test Execution & Fix Cycle (@test-fix-agent)
[- IMPL-003+: Additional tasks if applicable]

Test Strategy: Multi-Layered (L0: Static, L1: Unit, L2: Integration, L3: E2E)
Test Framework: [detected framework]
Test Files to Generate: [count]
Quality Thresholds:
- Minimum Coverage: 80%
- Static Analysis: Zero critical issues
Max Fix Iterations: 5
Fix Mode: [Manual|Codex Automated]

Review artifacts:
- Test plan: .workflow/[testSessionId]/IMPL_PLAN.md
- Task list: .workflow/[testSessionId]/TODO_LIST.md

CRITICAL - Next Steps:
1. Review IMPL_PLAN.md (now includes multi-layered test strategy)
2. **MUST execute: /workflow:test-cycle-execute**
   - This command only generated task JSON files
   - Test execution and fix iterations happen in test-cycle-execute
   - Do NOT attempt to run tests or fixes in main workflow
3. IMPL-001.5 will validate test quality before fix cycle begins

TodoWrite: Mark phase 5 completed

BOUNDARY NOTE:

Command completes here - only task JSON files generated
All test execution, failure detection, CLI analysis, fix generation happens in /workflow:test-cycle-execute
This command does NOT handle test failures or apply fixes

TodoWrite Progress Tracking

Track all 5 phases:

TodoWrite({todos: [
  {"content": "Create independent test session", "status": "in_progress|completed", "activeForm": "Creating test session"},
  {"content": "Gather test coverage context", "status": "pending|in_progress|completed", "activeForm": "Gathering test coverage context"},
  {"content": "Analyze test requirements with Gemini", "status": "pending|in_progress|completed", "activeForm": "Analyzing test requirements"},
  {"content": "Generate test generation and execution tasks", "status": "pending|in_progress|completed", "activeForm": "Generating test tasks"},
  {"content": "Return workflow summary", "status": "pending|in_progress|completed", "activeForm": "Returning workflow summary"}
]})

Update status to in_progress when starting each phase, completed when done.
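
The status transitions can be derived from a single phase counter. The sketch below builds the todos payload as plain data, using the content and activeForm strings from the call above; `build_todos` is a hypothetical helper, and passing a phase number of 6 to mean "all done" is an assumption.

```python
PHASES = [
    ("Create independent test session", "Creating test session"),
    ("Gather test coverage context", "Gathering test coverage context"),
    ("Analyze test requirements with Gemini", "Analyzing test requirements"),
    ("Generate test generation and execution tasks", "Generating test tasks"),
    ("Return workflow summary", "Returning workflow summary"),
]


def build_todos(current_phase: int) -> list[dict]:
    """Build the todo list for a 1-based phase number.

    Earlier phases are completed, the current one is in_progress, later ones
    are pending. A value past the last phase marks everything completed.
    """
    todos = []
    for index, (content, active_form) in enumerate(PHASES, start=1):
        if index < current_phase:
            status = "completed"
        elif index == current_phase:
            status = "in_progress"
        else:
            status = "pending"
        todos.append({"content": content, "status": status, "activeForm": active_form})
    return todos
```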

Task Specifications

Generates a minimum of 3 tasks (expandable for complex projects):

IMPL-001: Test Understanding & Generation

Agent: @code-developer

Purpose: Understand source implementation and generate test files following multi-layered test strategy

Task Configuration:

Task ID: IMPL-001
meta.type: "test-gen"
meta.agent: "@code-developer"
context.requirements: Understand source implementation and generate tests across all layers (L0-L3)
flow_control.target_files: Test files to create from TEST_ANALYSIS_RESULTS.md section 5

Execution Flow:

Understand Phase:
- Load TEST_ANALYSIS_RESULTS.md and test context
- Understand source code implementation patterns
- Analyze multi-layered test requirements (L0: Static, L1: Unit, L2: Integration, L3: E2E)
- Identify test scenarios, edge cases, and error paths
Generation Phase:
- Generate L1 unit test files following existing patterns
- Generate L2 integration test files (if applicable)
- Generate L3 E2E test files (if applicable)
- Ensure test coverage aligns with multi-layered requirements
- Include both positive and negative test cases
Verification Phase:
- Verify test completeness and correctness
- Ensure each test has meaningful assertions
- Check for test anti-patterns (tests without assertions, overly broad mocks)
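
A rough idea of what an anti-pattern check might look like for JavaScript/TypeScript test files is sketched below. These are regex heuristics chosen for illustration (empty arrow-function bodies, skipped tests, and whether the file calls expect at all); they are assumptions, not the agent's actual rules, and a linter or AST walk would be more reliable.

```python
import re

# Heuristics only: they match common single-line forms and can miss others.
EMPTY_TEST = re.compile(
    r"\b(?:it|test)\(\s*(['\"]).*?\1\s*,\s*(?:async\s*)?\(\s*\)\s*=>\s*\{\s*\}\s*\)"
)
SKIPPED_TEST = re.compile(r"\b(?:it\.skip|test\.skip|xit)\(")


def scan_test_source(source: str) -> dict:
    """Count suspicious constructs in the text of one test file."""
    return {
        "empty_tests": sum(1 for _ in EMPTY_TEST.finditer(source)),
        "skipped_tests": sum(1 for _ in SKIPPED_TEST.finditer(source)),
        # File-level check only: it cannot tell which test lacks an assertion.
        "has_expect_call": "expect(" in source,
    }
```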

IMPL-001.5: Test Quality Gate ← NEW

Agent: @test-fix-agent

Purpose: Validate test quality before entering fix cycle - prevent "hollow tests" from becoming the source of truth

Task Configuration:

Task ID: IMPL-001.5-review
meta.type: "test-quality-review"
meta.agent: "@test-fix-agent"
context.depends_on: ["IMPL-001"]
context.requirements: Validate generated tests meet quality standards
context.quality_config: Load from .claude/workflows/test-quality-config.json

Execution Flow:

L0: Static Analysis:
- Run linting on test files (ESLint, Prettier)
- Check for test anti-patterns:
  - Tests without assertions (expect() missing)
  - Empty test bodies (it('should...', () => {}))
  - Disabled tests without justification (it.skip, xit)
- Verify TypeScript type safety (if applicable)
Coverage Analysis:
- Run coverage analysis on generated tests
- Calculate coverage percentage for target source files
- Identify uncovered branches and edge cases
Test Quality Metrics:
- Verify minimum coverage threshold met (default: 80%)
- Verify all critical functions have negative test cases
- Verify integration tests cover key component interactions
Quality Gate Decision:
- PASS: Coverage ≥ 80%, zero critical anti-patterns → Proceed to IMPL-002
- FAIL: Coverage < 80% OR critical anti-patterns found → Loop back to IMPL-001 with feedback
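
The PASS/FAIL rule above reduces to two conditions. A minimal sketch, with `GateResult` and `quality_gate` as hypothetical names and the 80% default taken from the threshold stated above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    reasons: tuple[str, ...]


def quality_gate(
    coverage_percent: float,
    critical_anti_patterns: int,
    threshold: float = 80.0,
) -> GateResult:
    """PASS when coverage meets the threshold and no critical anti-patterns exist."""
    reasons = []
    if coverage_percent < threshold:
        reasons.append(f"coverage {coverage_percent:.1f}% is below {threshold:.1f}%")
    if critical_anti_patterns > 0:
        reasons.append(f"{critical_anti_patterns} critical anti-pattern(s) found")
    return GateResult(passed=not reasons, reasons=tuple(reasons))
```

On FAIL, the reasons can be passed back to IMPL-001 as feedback.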

Acceptance Criteria:

Static analysis: Zero critical issues
Test coverage: ≥ 80% for target files
Test completeness: All targeted functions have unit tests
Negative test coverage: Each public API has at least one error handling test
Integration coverage: Key component interactions have integration tests (if applicable)

Failure Handling: If quality gate fails:

Generate detailed feedback report (.process/test-quality-report.md)
Update IMPL-001 task with specific improvement requirements
Trigger IMPL-001 re-execution with enhanced context
Maximum 2 quality gate retries before escalating to user
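
The retry limit can be sketched as a bounded loop. This illustration reads "maximum 2 retries" as one initial attempt plus up to two re-executions of IMPL-001, which is an interpretation; the callables and the return values "proceed" and "escalate" are hypothetical.

```python
from typing import Callable

MAX_GATE_RETRIES = 2


def run_quality_gate(
    generate_tests: Callable[[str], None],
    review_tests: Callable[[], tuple[bool, str]],
) -> str:
    """Alternate generation and review until the gate passes or retries run out."""
    feedback = ""  # the first generation run has no feedback yet
    for _attempt in range(MAX_GATE_RETRIES + 1):  # initial attempt + retries
        generate_tests(feedback)
        passed, feedback = review_tests()
        if passed:
            return "proceed"  # continue to IMPL-002
    return "escalate"  # hand the decision back to the user
```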

IMPL-002: Test Execution & Fix Cycle

Agent: @test-fix-agent

Purpose: Execute initial tests and trigger orchestrator-managed fix cycles

Note: This task executes tests and reports results. The test-cycle-execute orchestrator manages all fix iterations, CLI analysis, and fix task generation.

Task Configuration:

Task ID: IMPL-002
meta.type: "test-fix"
meta.agent: "@test-fix-agent"
meta.use_codex: true|false (based on --use-codex flag)
context.depends_on: ["IMPL-001"]
context.requirements: Execute and fix tests

Test-Fix Cycle Specification (describes what the test-cycle-execute orchestrator does; the agent itself only executes single tasks):

Cycle Pattern (orchestrator-managed): test → gemini_diagnose → manual_fix (or codex) → retest
Tools Configuration (orchestrator-controlled):
- Gemini for analysis with bug-fix template → surgical fix suggestions
- Manual fix application (default) OR Codex if --use-codex flag (resume mechanism)
Exit Conditions (orchestrator-enforced):
- Success: All tests pass
- Failure: Max iterations reached (5)

Execution Flow:

Phase 1: Initial test execution
Phase 2: Iterative Gemini diagnosis + manual/Codex fixes
Phase 3: Final validation and certification
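
The orchestrator-managed cycle (test, diagnose, fix, retest, up to 5 iterations) can be sketched as below. The three callables stand in for test execution, Gemini diagnosis, and manual or Codex fix application; they and `run_fix_cycle` are hypothetical placeholders, not the orchestrator's real interface.

```python
from typing import Callable

MAX_ITERATIONS = 5


def run_fix_cycle(
    run_tests: Callable[[], list[str]],
    diagnose: Callable[[list[str]], str],
    apply_fix: Callable[[str], None],
) -> tuple[bool, int]:
    """Return (all_tests_pass, fix_iterations_used)."""
    failures = run_tests()  # initial test execution
    iterations = 0
    while failures and iterations < MAX_ITERATIONS:
        iterations += 1
        apply_fix(diagnose(failures))  # diagnosis feeds the fix step
        failures = run_tests()  # retest after every fix
    return (not failures, iterations)
```

The loop exits on the first fully passing run, or after the fifth fix attempt with failures remaining.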

IMPL-003+: Additional Tasks (Optional)

Scenarios for Multiple Tasks:

Large projects requiring per-module test generation
Separate integration vs unit test tasks
Specialized test types (performance, security, etc.)

Agent: @code-developer or specialized agents based on requirements

Artifacts & Output

Output Files Structure

Created in .workflow/WFS-test-[session]/:

WFS-test-[session]/
├── workflow-session.json          # Session metadata
├── IMPL_PLAN.md                   # Test generation and execution strategy
├── TODO_LIST.md                   # Task checklist
├── .task/
│   ├── IMPL-001.json              # Test understanding & generation
│   ├── IMPL-001.5-review.json     # Test quality gate
│   ├── IMPL-002.json              # Test execution & fix cycle
│   └── IMPL-*.json                # Additional tasks (if applicable)
└── .process/
    ├── [test-]context-package.json # Context and coverage analysis
    └── TEST_ANALYSIS_RESULTS.md    # Test requirements and strategy

Session Metadata

File: workflow-session.json

Session Mode includes:

workflow_type: "test_session"
source_session_id: "[sourceSessionId]" (enables automatic cross-session context)

Prompt Mode includes:

workflow_type: "test_session"
No source_session_id field
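
The difference between the two modes comes down to one optional field. The sketch below builds only the fields named above; the real workflow-session.json may well contain more, and `build_session_metadata` is a hypothetical helper.

```python
from typing import Optional


def build_session_metadata(source_session_id: Optional[str] = None) -> dict:
    """Return the mode-dependent part of workflow-session.json.

    Pass the source session ID for Session Mode; omit it for Prompt Mode.
    """
    metadata = {"workflow_type": "test_session"}
    if source_session_id is not None:
        metadata["source_session_id"] = source_session_id
    return metadata
```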

Complete Data Flow

Example Command: /workflow:test-fix-gen WFS-user-auth

Phase Execution Chain:

Phase 1: session-start → WFS-test-user-auth
Phase 2: test-context-gather → test-context-package.json
Phase 3: test-concept-enhanced → TEST_ANALYSIS_RESULTS.md
Phase 4: test-task-generate → IMPL-001.json + IMPL-001.5-review.json + IMPL-002.json (+ additional if needed)
Phase 5: Return summary

Command completes after Phase 5

Reference

Error Handling

Phase	Error Condition	Action
1	Source session not found (session mode)	Return error with source session ID
1	No completed IMPL tasks (session mode)	Return error, source incomplete
2	Context gathering failed	Return error, check source artifacts
3	Gemini analysis failed	Return error, check context package
4	Task generation failed	Retry once, then return error with details

Best Practices

Before Running:
- Ensure implementation is complete (session mode: check summaries exist)
- Commit all implementation changes
- Review source code quality
After Running:
- Review generated IMPL_PLAN.md before execution
- Check TEST_ANALYSIS_RESULTS.md for completeness
- Verify task dependencies in TODO_LIST.md
During Execution:
- Monitor iteration logs in .process/fix-iteration-*
- Track progress with /workflow:status
- Review Gemini diagnostic outputs
Mode Selection:
- Use Session Mode for completed workflow validation
- Use Prompt Mode for ad-hoc test generation
- Use --use-codex for autonomous fix application
- Use --cli-execute for enhanced generation capabilities

Related Commands

Prerequisite Commands:

/workflow:plan or /workflow:execute - Complete implementation session (for Session Mode)
None for Prompt Mode (ad-hoc test generation)

Called by This Command (5 phases):

/workflow:session:start - Phase 1: Create independent test workflow session
/workflow:tools:test-context-gather - Phase 2 (Session Mode): Gather source session context
/workflow:tools:context-gather - Phase 2 (Prompt Mode): Analyze codebase directly
/workflow:tools:test-concept-enhanced - Phase 3: Generate test requirements using Gemini
/workflow:tools:test-task-generate - Phase 4: Generate test task JSONs using action-planning-agent (autonomous, default)
/workflow:tools:test-task-generate --use-codex - Phase 4: With automated Codex fixes for IMPL-002 (when --use-codex flag used)
/workflow:tools:test-task-generate --cli-execute - Phase 4: With CLI execution mode for IMPL-001 test generation (when --cli-execute flag used)

Follow-up Commands:

/workflow:status - Review generated test tasks
/workflow:test-cycle-execute - Execute test generation and iterative fix cycles
/workflow:execute - Standard execution of generated test tasks
