Refactor team edict agent and task schemas; remove deprecated files

- Deleted Zhongshu Planner agent documentation as it is no longer needed.
- Removed agent instruction documentation to streamline task assignment process.
- Eliminated tasks schema file to simplify task management.
- Updated Codex Lens installation instructions to use 'uv' for pip commands.
- Bumped version to 0.4.1 in pyproject.toml and adjusted dependencies.
- Enhanced API embedding with text truncation and automatic batch splitting on 413 errors.
- Improved indexing pipeline with metadata registration and progress reporting.
- Converted index_project and index_update functions to async for better performance.
catlog22
2026-03-19 15:17:48 +08:00
parent 00672ec8e5
commit 54071473fc
32 changed files with 511 additions and 4205 deletions


@@ -27,20 +27,19 @@ Test review and fix engine for lite-execute chain or standalone invocation.
**Trigger**: `--in-memory` flag or `testReviewContext` global variable available
**Input Source**: `testReviewContext` global variable set by lite-execute Step 4
**Behavior**: Skip session discovery, inherit reviewTool from execution chain, proceed directly to TR-Phase 1.
> **Note**: lite-execute Step 4 is the chain gate. Mode 1 invocation means execution is complete — proceed with test review.
### Mode 2: Standalone
**Trigger**: User calls with session path or `--last`
**Behavior**: Discover session → load plan + tasks → `reviewTool = 'agent'` → proceed to TR-Phase 1.
```javascript
// Session discovery
let sessionPath, plan, taskFiles, reviewTool
if (testReviewContext) {
@@ -50,21 +49,11 @@ if (testReviewContext) {
taskFiles = testReviewContext.taskFiles.map(tf => JSON.parse(Read(tf.path)))
reviewTool = testReviewContext.reviewTool || 'agent'
} else {
// Mode 2: standalone — find last session or use provided path
sessionPath = resolveSessionPath($ARGUMENTS) // Glob('.workflow/.lite-plan/*/plan.json'), take last
plan = JSON.parse(Read(`${sessionPath}/plan.json`))
taskFiles = plan.task_ids.map(id => JSON.parse(Read(`${sessionPath}/.task/${id}.json`)))
reviewTool = 'agent'
}
const skipFix = $ARGUMENTS?.includes('--skip-fix') || false
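// Sketch of the hypothetical resolveSessionPath helper referenced above (an
// assumption, not the shipped implementation). Strategy: strip flags; an
// explicit path wins; otherwise glob lite-plan sessions and take the last.
// `glob` is injected so the sketch stays testable outside the workflow runtime.
function resolveSessionPath(args, glob = Glob) {
  const cleaned = (args || '').replace(/--skip-fix|--last/g, '').trim()
  if (cleaned) return cleaned
  const sessions = glob('.workflow/.lite-plan/*/plan.json')
  if (sessions.length === 0) throw new Error('No lite-plan sessions found.')
  return sessions[sessions.length - 1].replace(/[/\\]plan\.json$/, '')
}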
@@ -74,7 +63,7 @@ const skipFix = $ARGUMENTS?.includes('--skip-fix') || false
| Phase | Core Action | Output |
|-------|-------------|--------|
| TR-Phase 1 | Detect test framework + gather changes | testConfig, changedFiles |
| TR-Phase 2 | Review implementation against convergence criteria | reviewResults[] |
| TR-Phase 3 | Run tests + generate checklist | test-checklist.json |
| TR-Phase 4 | Auto-fix failures (iterative, max 3 rounds) | Fixed code + updated checklist |
@@ -82,180 +71,68 @@ const skipFix = $ARGUMENTS?.includes('--skip-fix') || false
## TR-Phase 0: Initialize
Set `sessionId` from sessionPath. Create TodoWrite with 5 phases (Phase 1 = in_progress, rest = pending).
## TR-Phase 1: Detect Test Framework & Gather Changes
**Test framework detection** (check in order, first match wins):

| File | Framework | Command |
|------|-----------|---------|
| `package.json` with `scripts.test` | jest/vitest | `npm test` |
| `package.json` with `scripts['test:unit']` | jest/vitest | `npm run test:unit` |
| `pyproject.toml` | pytest | `python -m pytest -v --tb=short` |
| `Cargo.toml` | cargo-test | `cargo test` |
| `go.mod` | go-test | `go test ./...` |

**Gather git changes**: `git diff --name-only HEAD~5..HEAD` → `changedFiles[]`

Output: `testConfig = { command, framework, type }` + `changedFiles[]`
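The first-match-wins order can be sketched as a pure function. This is an illustrative sketch, assuming a `files` map (filename to parsed content) in place of the workflow's `Glob`/`Read` calls:

```javascript
// Sketch: first-match-wins test framework detection over a set of marker files.
// Only package.json needs parsed content; the other markers are presence checks.
function detectTestConfig(files) {
  const none = { command: null, framework: null, type: null }
  const pkg = files['package.json']
  if (pkg) {
    const scripts = pkg.scripts || {}
    if (scripts.test) return { command: 'npm test', framework: 'jest/vitest', type: 'node' }
    if (scripts['test:unit']) return { command: 'npm run test:unit', framework: 'jest/vitest', type: 'node' }
    // Mirrors the else-if chain: package.json wins the match but yields no command
    return none
  }
  if ('pyproject.toml' in files) return { command: 'python -m pytest -v --tb=short', framework: 'pytest', type: 'python' }
  if ('Cargo.toml' in files) return { command: 'cargo test', framework: 'cargo-test', type: 'rust' }
  if ('go.mod' in files) return { command: 'go test ./...', framework: 'go-test', type: 'go' }
  return none
}
```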
// TodoWrite: Phase 1 → completed, Phase 2 → in_progress
## TR-Phase 2: Review Implementation Against Plan
**Skip if**: `reviewTool === 'skip'` — set all tasks to PASS, proceed to Phase 3.
For each task, verify convergence criteria and identify test gaps.
**Agent Review** (reviewTool === 'agent', default):
For each task in taskFiles:
1. Extract `convergence.criteria[]` and `test` requirements
2. Find changed files matching `task.files[].path` against `changedFiles`
3. Read matched files, evaluate each criterion against implementation
4. Check test coverage: if `task.test.unit` exists but no test files in changedFiles → mark as test gap
5. Same for `task.test.integration`
6. Build `reviewResult = { taskId, title, criteria_met[], criteria_unmet[], test_gaps[], files_reviewed[] }`
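Steps 4 and 5 can be sketched as a pure helper. The file-name regex is an assumption mirroring the heuristic described above, and `task` follows the task-schema shape:

```javascript
// Sketch: collect missing-test gaps for one task.
const TEST_FILE_RE = /test[_\-.]|spec[_\-.]|\/__tests__\/|\/tests\//

function findTestGaps(task, changedFiles) {
  const gaps = []
  const testReqs = task.test || {}
  const hasTestFiles = changedFiles.some(f => TEST_FILE_RE.test(f))
  // Unit requirements count as gaps only when no test-like file was touched
  if (testReqs.unit?.length > 0 && !hasTestFiles) {
    for (const u of testReqs.unit) gaps.push({ type: 'unit', desc: u })
  }
  // Integration requirements are always surfaced for verification
  for (const i of testReqs.integration || []) gaps.push({ type: 'integration', desc: i })
  return gaps
}
```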
**CLI Review** (reviewTool === 'gemini' or 'codex'):
```javascript
const reviewId = `${sessionId}-tr-review`
Bash(`ccw cli -p "PURPOSE: Post-execution test review — verify convergence criteria met and identify test gaps
TASK: • Read plan.json and .task/*.json convergence criteria • For each criterion, check implementation in changed files • Identify missing unit/integration tests • List unmet criteria with file:line evidence
MODE: analysis
CONTEXT: @${sessionPath}/plan.json @${sessionPath}/.task/*.json @**/* | Memory: lite-execute completed, reviewing convergence
EXPECTED: Per-task verdict table (PASS/PARTIAL/FAIL) + unmet criteria list + test gap list
CONSTRAINTS: Read-only | Focus on convergence verification" --tool ${reviewTool} --mode analysis --id ${reviewId}`, { run_in_background: true })
// STOP - wait for hook callback, then parse CLI output into reviewResults format
```
// TodoWrite: Phase 2 → completed, Phase 3 → in_progress
## TR-Phase 3: Run Tests & Generate Checklist
**Build checklist** from reviewResults:
- Per task: status = `PASS` (all criteria met) / `PARTIAL` (some met) / `FAIL` (none met)
- Collect test_items from `task.test.unit[]`, `task.test.integration[]`, `task.test.success_metrics[]` + review test_gaps

**Run tests** if `testConfig.command` exists:
- Execute with 5min timeout
- Parse output: detect passed/failed patterns → `overall: 'PASS' | 'FAIL' | 'UNKNOWN'`

**Write** `${sessionPath}/test-checklist.json`
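The status rule and pass/fail parsing can be sketched as two pure functions (regex patterns taken from the surrounding pseudocode; the fail-count bookkeeping is omitted for brevity):

```javascript
// Sketch: per-task status from convergence counts.
function taskStatus(metCount, unmetCount) {
  if (unmetCount === 0) return 'PASS'
  return metCount > 0 ? 'PARTIAL' : 'FAIL'
}

// Sketch: classify raw test-runner output as PASS / FAIL / UNKNOWN.
function overallResult(output) {
  const passed = /(\d+) passed/.test(output) || /PASSED/.test(output) || /ok \d+/.test(output)
  const failMatch = output.match(/(\d+) failed/)
  const hasFail = Boolean(failMatch) || /FAILED/.test(output) || /FAIL/.test(output)
  return hasFail ? 'FAIL' : (passed ? 'PASS' : 'UNKNOWN')
}
```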
// TodoWrite: Phase 3 → completed, Phase 4 → in_progress
@@ -263,28 +140,16 @@ Write(`${sessionPath}/test-checklist.json`, JSON.stringify(testChecklist, null,
**Skip if**: `skipFix === true` OR `testChecklist.execution?.overall !== 'FAIL'`
**Max iterations**: 3. Each iteration:
1. Delegate to test-fix-agent:
```javascript
Agent({
subagent_type: "test-fix-agent",
run_in_background: false,
description: `Fix tests (iter ${iteration})`,
prompt: `## Test Fix Iteration ${iteration}/${MAX_ITERATIONS}
**Test Command**: ${testConfig.command}
**Framework**: ${testConfig.framework}
@@ -305,98 +170,38 @@ ${testChecklist.execution.raw_output}
3. Run \`${testConfig.command}\` to verify fix
4. If fix introduces new failures, revert and try alternative approach
5. Return: what was fixed, which files changed, test result after fix`
})
```
2. Re-run `testConfig.command` → update `testChecklist.execution`
3. Write updated `test-checklist.json`
4. Break if tests pass; continue if still failing
If still failing after 3 iterations → log "Manual investigation needed"
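The loop contract above can be sketched with injected callbacks, where hypothetical `runTests`/`fix` stand in for the Bash and Agent calls:

```javascript
// Sketch: bounded auto-fix loop: stop at first PASS or after maxIterations.
function fixLoop(runTests, fix, maxIterations = 3) {
  let overall = runTests()          // initial Phase 3 result
  let iteration = 0
  while (iteration < maxIterations && overall === 'FAIL') {
    iteration++
    fix(iteration)                  // delegate to test-fix-agent
    overall = runTests()            // re-run the detected test command
  }
  return { overall, iteration }
}
```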
// TodoWrite: Phase 4 → completed, Phase 5 → in_progress
## TR-Phase 5: Report & Sync
> **CHECKPOINT**: This step is MANDATORY. Always generate report and trigger sync.
**Generate `test-review.md`** with sections:
- Header: session, summary, timestamp, framework
- **Task Verdicts** table: task_id | status | convergence (met/total) | test_items | gaps
- **Unmet Criteria**: per-task checklist of unmet items
- **Test Gaps**: list of missing unit/integration tests
- **Test Execution**: command, result, fix iteration (if applicable)
**Write** `${sessionPath}/test-review.md`
**Chain to session:sync**:
```javascript
Skill({ skill: "workflow:session:sync", args: `-y "Test review: ${testChecklist.execution?.overall || 'no-test'} - ${plan.summary}"` })
```
// TodoWrite: Phase 5 → completed
**Display summary**: Per-task verdict with [PASS]/[PARTIAL]/[FAIL] icons, convergence ratio, overall test result.
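The per-task line can be sketched as follows (field names follow the testChecklist structure; the sample line in the comment is illustrative):

```javascript
// Sketch: one summary line per task, e.g. "[PASS] TASK-001: Add parser (3/3)"
function summaryLine(t) {
  const icon = t.status === 'PASS' ? '[PASS]' : t.status === 'PARTIAL' ? '[PARTIAL]' : '[FAIL]'
  const met = t.convergence.met.length
  const total = met + t.convergence.unmet.length
  return `${icon} ${t.task_id}: ${t.title} (${met}/${total})`
}
```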
## Data Structures
@@ -406,8 +211,8 @@ Test: ${testChecklist.execution?.overall || 'skipped'}${testChecklist.execution?
{
planObject: { /* same as executionContext.planObject */ },
taskFiles: [{ id: string, path: string }],
reviewTool: "skip" | "agent" | "gemini" | "codex",
executionResults: [...],
originalUserInput: string,
session: {
id: string,
@@ -417,6 +222,31 @@ Test: ${testChecklist.execution?.overall || 'skipped'}${testChecklist.execution?
}
```
### testChecklist (Output artifact)
```javascript
{
session: string,
plan_summary: string,
generated_at: string,
test_config: { command, framework, type },
tasks: [{
task_id: string,
title: string,
status: "PASS" | "PARTIAL" | "FAIL",
convergence: { met: string[], unmet: string[] },
test_items: [{ type: "unit"|"integration"|"metric", desc: string, status: "pending"|"missing" }]
}],
execution: {
command: string,
timestamp: string,
raw_output: string, // last 3000 chars
overall: "PASS" | "FAIL" | "UNKNOWN",
fix_iteration?: number
} | null
}
```
## Session Folder Structure (after test-review)
```
@@ -426,8 +256,8 @@ Test: ${testChecklist.execution?.overall || 'skipped'}${testChecklist.execution?
├── planning-context.md
├── plan.json
├── .task/TASK-*.json
├── test-checklist.json # structured test results
└── test-review.md # human-readable report
```
## Error Handling