mirror of https://github.com/catlog22/Claude-Code-Workflow.git synced 2026-02-11 02:33:51 +08:00

Files

catlog22 54c3234d84 fix: 为所有 skill 的 .workflow/ 路径添加 projectRoot 前缀

从子目录执行 skill 时，相对路径 .workflow/ 会导致产物落到错误位置。
通过 git rev-parse --show-toplevel || pwd 检测项目根目录，
所有 .workflow/ 路径引用统一加上 {projectRoot} 前缀确保路径正确。

涉及 72 个文件，覆盖 20+ 个 skill。

2026-02-08 13:46:48 +08:00

17 KiB

Raw Blame History

Phase 2: Test-Cycle Execution

Dynamic test-fix execution with adaptive task generation based on runtime analysis. Iterative fix cycles until test pass rate >= 95% or max iterations reached. All agent interactions use spawn_agent → wait → close_agent lifecycle.

vs Standard Execute:

Standard: Pre-defined tasks → Execute sequentially → Done
Test-Cycle: Initial tasks → Test → Analyze failures → Generate fix tasks → Fix → Re-test → Repeat until pass

Agent Roles

Agent	Responsibility
Orchestrator	Loop control, strategy selection, pass rate calculation, threshold decisions
@cli-planning-agent	CLI analysis (Gemini/Qwen/Codex), root cause extraction, task generation, affected test detection
@test-fix-agent	Test execution, code fixes, criticality assignment, result reporting

Core Responsibilities

Orchestrator

Session discovery, task queue management
Pass rate calculation: (passed / total) * 100 from test-results.json
Criticality assessment (high/medium/low)
Strategy selection based on context
Runtime Calculations (from iteration-state.json):
- Current iteration: iterations.length + 1
- Stuck tests: Tests appearing in failed_tests for 3+ consecutive iterations
- Regression: Compare consecutive pass_rate values (>10% drop)
- Max iterations: Read from task.meta.max_iterations
Iteration control (max 10, default)
CLI tool fallback chain: Gemini → Qwen → Codex
Progress tracking
Session auto-complete (pass rate >= 95%)
Explicit Lifecycle: Always close_agent after wait completes

@cli-planning-agent

Execute CLI analysis with bug diagnosis template
Parse output for root causes and fix strategy
Generate IMPL-fix-N.json task definition
Detect affected tests for progressive testing
Save: analysis.md, cli-output.txt

@test-fix-agent

Execute tests, save results to test-results.json
Apply fixes from task.context.fix_strategy
Assign criticality to failures
Update task status

Intelligent Strategy Engine

Auto-selects optimal strategy based on iteration context:

Strategy	Trigger	Behavior
Conservative	Iteration 1-2 (default)	Single targeted fix, full validation
Aggressive	Pass rate >80% + similar failures	Batch fix related issues
Surgical	Regression detected (pass rate drops >10%)	Minimal changes, rollback focus

Selection Logic (in orchestrator):

if (iteration <= 2) return "conservative";
if (passRate > 80 && failurePattern.similarity > 0.7) return "aggressive";
if (regressionDetected) return "surgical";
return "conservative";

Integration: Strategy passed to @cli-planning-agent in prompt for tailored analysis.

Progressive Testing

Runs affected tests during iterations, full suite only for final validation.

How It Works:

@cli-planning-agent analyzes fix_strategy.modification_points
Maps modified files to test files (via imports + integration patterns)
Returns affected_tests[] in task JSON
@test-fix-agent runs: npm test -- ${affected_tests.join(' ')}
Final validation: npm test (full suite)

Benefits: 70-90% iteration speed improvement, instant feedback on fix effectiveness.

Agent Invocation Template

@cli-planning-agent (failure analysis):

// Spawn agent for failure analysis
const analysisAgentId = spawn_agent({
  message: `
## TASK ASSIGNMENT

### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/cli-planning-agent.md (MUST read first)
2. Read: {projectRoot}/.workflow/project-tech.json
3. Read: {projectRoot}/.workflow/project-guidelines.json

---

    ## Task Objective
    Analyze test failures and generate fix task JSON for iteration ${N}

    ## Strategy
    ${selectedStrategy} - ${strategyDescription}

    ## MANDATORY FIRST STEPS
    1. Read test results: ${session.test_results_path}
    2. Read test output: ${session.test_output_path}
    3. Read iteration state: ${session.iteration_state_path}

    ## Context Metadata (Orchestrator-Calculated)
    - Session ID: ${sessionId} (from file path)
    - Current Iteration: ${N} (= iterations.length + 1)
    - Max Iterations: ${maxIterations} (from task.meta.max_iterations)
    - Current Pass Rate: ${passRate}%
    - Selected Strategy: ${selectedStrategy} (from iteration-state.json)
    - Stuck Tests: ${stuckTests} (calculated from iterations[].failed_tests history)

    ## CLI Configuration
    - Tool Priority: gemini & codex
    - Template: 01-diagnose-bug-root-cause.txt
    - Timeout: 2400000ms

    ## Expected Deliverables
    1. Task JSON: ${session.task_dir}/IMPL-fix-${N}.json
       - Must include: fix_strategy.test_execution.affected_tests[]
       - Must include: fix_strategy.confidence_score
    2. Analysis report: ${session.process_dir}/iteration-${N}-analysis.md
    3. CLI output: ${session.process_dir}/iteration-${N}-cli-output.txt

    ## Strategy-Specific Requirements
    - Conservative: Single targeted fix, high confidence required
    - Aggressive: Batch fix similar failures, pattern-based approach
    - Surgical: Minimal changes, focus on rollback safety

    ## Success Criteria
    - Concrete fix strategy with modification points (file:function:lines)
    - Affected tests list for progressive testing
    - Root cause analysis (not just symptoms)
`
});

// Wait for analysis completion
const analysisResult = wait({
  ids: [analysisAgentId],
  timeout_ms: 2400000  // 40 minutes (CLI analysis timeout)
});

// Clean up
close_agent({ id: analysisAgentId });

@test-fix-agent (execution):

// Spawn agent for test execution/fixing
const fixAgentId = spawn_agent({
  message: `
## TASK ASSIGNMENT

### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/test-fix-agent.md (MUST read first)
2. Read: {projectRoot}/.workflow/project-tech.json
3. Read: {projectRoot}/.workflow/project-guidelines.json

---

    ## Task Objective
    ${taskTypeObjective[task.meta.type]}

    ## MANDATORY FIRST STEPS
    1. Read task JSON: ${session.task_json_path}
    2. Read iteration state: ${session.iteration_state_path}
    3. ${taskTypeSpecificReads[task.meta.type]}

    ## CRITICAL: Syntax Check Priority
    **Before any code modification or test execution:**
    - Run project syntax checker (TypeScript: tsc --noEmit, ESLint, etc.)
    - Verify zero syntax errors before proceeding
    - If syntax errors found: Fix immediately before other work
    - Syntax validation is MANDATORY gate - no exceptions

    ## Session Paths
    - Workflow Dir: ${session.workflow_dir}
    - Task JSON: ${session.task_json_path}
    - Test Results Output: ${session.test_results_path}
    - Test Output Log: ${session.test_output_path}
    - Iteration State: ${session.iteration_state_path}

    ## Task Type: ${task.meta.type}
    ${taskTypeGuidance[task.meta.type]}

    ## Expected Deliverables
    ${taskTypeDeliverables[task.meta.type]}

    ## Success Criteria
    - ${taskTypeSuccessCriteria[task.meta.type]}
    - Update task status in task JSON
    - Save all outputs to specified paths
    - Report completion to orchestrator
`
});

// Wait for execution completion
const fixResult = wait({
  ids: [fixAgentId],
  timeout_ms: 600000  // 10 minutes
});

// Clean up
close_agent({ id: fixAgentId });

// Task Type Configurations
const taskTypeObjective = {
  "test-gen": "Generate comprehensive tests based on requirements",
  "test-fix": "Execute test suite and report results with criticality assessment",
  "test-fix-iteration": "Apply fixes from strategy and validate with tests"
};

const taskTypeSpecificReads = {
  "test-gen": "Read test context: ${session.test_context_path}",
  "test-fix": "Read previous results (if exists): ${session.test_results_path}",
  "test-fix-iteration": "Read fix strategy: ${session.analysis_path}, fix history: ${session.fix_history_path}"
};

const taskTypeGuidance = {
  "test-gen": `
    - Review task.context.requirements for test scenarios
    - Analyze codebase to understand implementation
    - Generate tests covering: happy paths, edge cases, error handling
    - Follow existing test patterns and framework conventions
  `,
  "test-fix": `
    - Run test command from task.context or project config
    - Capture: pass/fail counts, error messages, stack traces
    - Assess criticality for each failure:
      * high: core functionality broken, security issues
      * medium: feature degradation, data integrity issues
      * low: edge cases, flaky tests, env-specific issues
    - Save structured results to test-results.json
  `,
  "test-fix-iteration": `
    - Load fix_strategy from task.context.fix_strategy
    - Identify modification_points: ${task.context.fix_strategy.modification_points}
    - Apply surgical fixes (minimal changes)
    - Test execution mode: ${task.context.fix_strategy.test_execution.mode}
      * affected_only: Run ${task.context.fix_strategy.test_execution.affected_tests}
      * full_suite: Run complete test suite
    - If failures persist: Document in test-results.json, DO NOT analyze (orchestrator handles)
  `
};

const taskTypeDeliverables = {
  "test-gen": "- Test files in target directories\n    - Test coverage report\n    - Summary in .summaries/",
  "test-fix": "- test-results.json (pass_rate, criticality, failures)\n    - test-output.log (full test output)\n    - Summary in .summaries/",
  "test-fix-iteration": "- Modified source files\n    - test-results.json (updated pass_rate)\n    - test-output.log\n    - Summary in .summaries/"
};

const taskTypeSuccessCriteria = {
  "test-gen": "All test files created, executable without errors, coverage documented",
  "test-fix": "Test results saved with accurate pass_rate and criticality, all failures documented",
  "test-fix-iteration": "Fixes applied per strategy, tests executed, results reported (pass/fail to orchestrator)"
};

CLI Tool Configuration

Fallback Chain: Gemini → Qwen → Codex Template: ~/.ccw/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt Timeout: 40min (2400000ms)

Tool Details:

Gemini (primary): gemini-2.5-pro
Qwen (fallback): coder-model
Codex (fallback): gpt-5.1-codex

When to Fallback: HTTP 429, timeout, analysis quality degraded

Session File Structure

{projectRoot}/.workflow/active/WFS-test-{session}/
├── workflow-session.json           # Session metadata
├── IMPL_PLAN.md, TODO_LIST.md
├── .task/
│   ├── IMPL-{001,002}.json         # Initial tasks
│   └── IMPL-fix-{N}.json           # Generated fix tasks
├── .process/
│   ├── iteration-state.json        # Current iteration + strategy + stuck tests
│   ├── test-results.json           # Latest results (pass_rate, criticality)
│   ├── test-output.log             # Full test output
│   ├── fix-history.json            # All fix attempts
│   ├── iteration-{N}-analysis.md   # CLI analysis report
│   └── iteration-{N}-cli-output.txt
└── .summaries/iteration-summaries/

Iteration State JSON

Purpose: Persisted state machine for iteration loop - enables Resume and historical analysis.

{
  "current_task": "IMPL-002",
  "selected_strategy": "aggressive",
  "next_action": "execute_fix_task",
  "iterations": [
    {
      "iteration": 1,
      "pass_rate": 70,
      "strategy": "conservative",
      "failed_tests": ["test_auth_flow", "test_user_permissions"]
    },
    {
      "iteration": 2,
      "pass_rate": 82,
      "strategy": "conservative",
      "failed_tests": ["test_user_permissions", "test_token_expiry"]
    },
    {
      "iteration": 3,
      "pass_rate": 89,
      "strategy": "aggressive",
      "failed_tests": ["test_auth_edge_case"]
    }
  ]
}

Field Descriptions:

current_task: Pointer to active task (essential for Resume)
selected_strategy: Current iteration strategy (runtime state)
next_action: State machine next step (execute_fix_task | retest | complete)
iterations[]: Historical log of all iterations (source of truth for trends)

Completion Conditions

Full Success:

All tasks completed
Pass rate === 100%
Action: Auto-complete session

Partial Success:

All tasks completed
Pass rate >= 95% and < 100%
All failures are "low" criticality
Action: Auto-approve with review note

Failure:

Max iterations (10) reached without 95% pass rate
Pass rate < 95% after max iterations
Action: Generate failure report, mark blocked, return to user

Error Handling

Scenario	Action
Test execution error	Log, retry with error context
CLI analysis failure	Fallback: Gemini → Qwen → Codex → manual
Agent execution error	Save state, close_agent, retry with simplified context
Max iterations reached	Generate failure report, mark blocked
Regression detected	Rollback last fix, switch to surgical strategy
Stuck tests detected	Continue with alternative strategy, document in failure report

CLI Fallback Triggers (Gemini → Qwen → Codex → manual):

Fallback is triggered when any of these conditions occur:

Invalid Output:
- CLI tool fails to generate valid IMPL-fix-N.json (JSON parse error)
- Missing required fields: fix_strategy.modification_points or fix_strategy.affected_tests
Low Confidence:
- fix_strategy.confidence_score < 0.4 (indicates uncertain analysis)
Technical Failures:
- HTTP 429 (rate limit) or 5xx errors
- Timeout (exceeds 2400000ms / 40min)
- Connection errors
Quality Degradation:
- Analysis report < 100 words (too brief, likely incomplete)
- No concrete modification points provided (only general suggestions)
- Same root cause identified 3+ consecutive times (stuck analysis)

Fallback Sequence:

Try primary tool (Gemini)
If trigger detected → Try fallback (Qwen)
If trigger detected again → Try final fallback (Codex)
If all fail → Mark as degraded, use basic pattern matching from fix-history.json, notify user

Lifecycle Error Handling:

try {
  const agentId = spawn_agent({ message: "..." });
  const result = wait({ ids: [agentId], timeout_ms: 2400000 });
  // ... process result ...
  close_agent({ id: agentId });
} catch (error) {
  if (agentId) close_agent({ id: agentId });
  // Save state for resume capability
  throw error;
}

Progress Tracking Pattern

[
  {
    content: "Execute IMPL-001: Generate tests [code-developer]",
    status: "completed"
  },
  {
    content: "Execute IMPL-002: Test & Fix Cycle [ITERATION]",
    status: "in_progress"
  },
  {
    content: "  → Iteration 1: Initial test (pass: 70%, conservative)",
    status: "completed"
  },
  {
    content: "  → Iteration 2: Fix validation (pass: 82%, conservative)",
    status: "completed"
  },
  {
    content: "  → Iteration 3: Batch fix auth (pass: 89%, aggressive)",
    status: "in_progress"
  }
]

Update Rules:

Add iteration item with: strategy, pass rate
Mark completed after each iteration
Update parent task when all complete

Commit Strategy

Automatic Commits (orchestrator-managed):

The orchestrator automatically creates git commits at key checkpoints to enable safe rollback:

After Successful Iteration (pass rate increased):

git add .
git commit -m "test-cycle: iteration ${N} - ${strategy} strategy (pass: ${oldRate}% → ${newRate}%)"

Before Rollback (regression detected):

# Current state preserved, then:
git revert HEAD
git commit -m "test-cycle: rollback iteration ${N} - regression detected (pass: ${newRate}% < ${oldRate}%)"

Commit Content:

Modified source files from fix application
Updated test-results.json, iteration-state.json
Excludes: temporary files, logs

Benefits:

Rollback Safety: Each iteration is a revert point
Progress Tracking: Git history shows iteration evolution
Audit Trail: Clear record of which strategy/iteration caused issues
Resume Capability: Can resume from any checkpoint

Note: Final session completion creates additional commit with full summary.

Post-Completion Expansion

完成后询问用户是否扩展为issue(test/enhance/refactor/doc)，选中项创建新 issue: "{summary} - {dimension}"

Best Practices

Default Settings Work: 10 iterations sufficient for most cases
Automatic Commits: Orchestrator commits after each successful iteration - no manual intervention needed
Trust Strategy Engine: Auto-selection based on proven heuristics
Monitor Logs: Check .process/iteration-N-analysis.md for CLI analysis insights
Progressive Testing: Saves 70-90% iteration time automatically
Always Close Agents: Ensure close_agent is called after every wait completes, including error paths

17 KiB Raw Blame History