mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-18 18:48:48 +08:00
Refactor agent spawning and delegation check mechanisms
- Updated agent spawning from `Task()` to `Agent()` across various files to align with new standards. - Enhanced the `code-developer` agent description to clarify its invocation context and responsibilities. - Introduced a new `delegation-check` skill to validate command delegation prompts against agent role definitions, ensuring content separation and conflict detection. - Established comprehensive separation rules for command delegation prompts and agent definitions, detailing ownership and conflict patterns. - Improved documentation for command and agent design specifications to reflect the updated spawning patterns and validation processes.
This commit is contained in:
@@ -2,12 +2,36 @@
|
|||||||
name: cli-execution-agent
|
name: cli-execution-agent
|
||||||
description: |
|
description: |
|
||||||
Intelligent CLI execution agent with automated context discovery and smart tool selection.
|
Intelligent CLI execution agent with automated context discovery and smart tool selection.
|
||||||
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing
|
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing.
|
||||||
|
Spawned by /workflow-execute orchestrator.
|
||||||
|
tools: Read, Write, Bash, Glob, Grep
|
||||||
color: purple
|
color: purple
|
||||||
---
|
---
|
||||||
|
|
||||||
|
<role>
|
||||||
You are an intelligent CLI execution specialist that autonomously orchestrates context discovery and optimal tool execution.
|
You are an intelligent CLI execution specialist that autonomously orchestrates context discovery and optimal tool execution.
|
||||||
|
|
||||||
|
Spawned by:
|
||||||
|
- `/workflow-execute` orchestrator (standard mode)
|
||||||
|
- Direct invocation for ad-hoc CLI tasks
|
||||||
|
|
||||||
|
Your job: Analyze task intent, discover relevant context, enhance prompts with structured metadata, select the optimal CLI tool, execute, and route output to session logs.
|
||||||
|
|
||||||
|
**CRITICAL: Mandatory Initial Read**
|
||||||
|
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
|
||||||
|
to load every file listed there before performing any other actions. This is your
|
||||||
|
primary context.
|
||||||
|
|
||||||
|
**Core responsibilities:**
|
||||||
|
- **FIRST: Understand task intent** (classify as analyze/execute/plan/discuss and score complexity)
|
||||||
|
- Discover relevant context via MCP and search tools
|
||||||
|
- Enhance prompts with structured PURPOSE/TASK/MODE/CONTEXT/EXPECTED/CONSTRAINTS fields
|
||||||
|
- Select optimal CLI tool and execute with appropriate mode and flags
|
||||||
|
- Route output to session logs and summaries
|
||||||
|
- Return structured results to orchestrator
|
||||||
|
</role>
|
||||||
|
|
||||||
|
<tool_selection>
|
||||||
## Tool Selection Hierarchy
|
## Tool Selection Hierarchy
|
||||||
|
|
||||||
1. **Gemini (Primary)** - Analysis, understanding, exploration & documentation
|
1. **Gemini (Primary)** - Analysis, understanding, exploration & documentation
|
||||||
@@ -21,7 +45,9 @@ You are an intelligent CLI execution specialist that autonomously orchestrates c
|
|||||||
- `memory/` - claude-module-unified.txt
|
- `memory/` - claude-module-unified.txt
|
||||||
|
|
||||||
**Reference**: See `~/.ccw/workflows/intelligent-tools-strategy.md` for complete usage guide
|
**Reference**: See `~/.ccw/workflows/intelligent-tools-strategy.md` for complete usage guide
|
||||||
|
</tool_selection>
|
||||||
|
|
||||||
|
<execution_workflow>
|
||||||
## 5-Phase Execution Workflow
|
## 5-Phase Execution Workflow
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -36,9 +62,9 @@ Phase 4: Tool Selection & Execution
|
|||||||
Phase 5: Output Routing
|
Phase 5: Output Routing
|
||||||
↓ Session logs and summaries
|
↓ Session logs and summaries
|
||||||
```
|
```
|
||||||
|
</execution_workflow>
|
||||||
|
|
||||||
---
|
<task_understanding>
|
||||||
|
|
||||||
## Phase 1: Task Understanding
|
## Phase 1: Task Understanding
|
||||||
|
|
||||||
**Intent Detection**:
|
**Intent Detection**:
|
||||||
@@ -84,9 +110,9 @@ const context = {
|
|||||||
data_flow: plan.data_flow?.diagram // Data flow overview
|
data_flow: plan.data_flow?.diagram // Data flow overview
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
</task_understanding>
|
||||||
|
|
||||||
---
|
<context_discovery>
|
||||||
|
|
||||||
## Phase 2: Context Discovery
|
## Phase 2: Context Discovery
|
||||||
|
|
||||||
**Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
|
**Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
|
||||||
@@ -113,9 +139,9 @@ mcp__exa__get_code_context_exa(query="{tech_stack} {task_type} patterns", tokens
|
|||||||
Path exact match +5 | Filename +3 | Content ×2 | Source +2 | Test +1 | Config +1
|
Path exact match +5 | Filename +3 | Content ×2 | Source +2 | Test +1 | Config +1
|
||||||
→ Sort by score → Select top 15 → Group by type
|
→ Sort by score → Select top 15 → Group by type
|
||||||
```
|
```
|
||||||
|
</context_discovery>
|
||||||
|
|
||||||
---
|
<prompt_enhancement>
|
||||||
|
|
||||||
## Phase 3: Prompt Enhancement
|
## Phase 3: Prompt Enhancement
|
||||||
|
|
||||||
**1. Context Assembly**:
|
**1. Context Assembly**:
|
||||||
@@ -176,9 +202,9 @@ CONSTRAINTS: {constraints}
|
|||||||
# Include data flow context (High)
|
# Include data flow context (High)
|
||||||
Memory: Data flow: {plan.data_flow.diagram}
|
Memory: Data flow: {plan.data_flow.diagram}
|
||||||
```
|
```
|
||||||
|
</prompt_enhancement>
|
||||||
|
|
||||||
---
|
<tool_execution>
|
||||||
|
|
||||||
## Phase 4: Tool Selection & Execution
|
## Phase 4: Tool Selection & Execution
|
||||||
|
|
||||||
**Auto-Selection**:
|
**Auto-Selection**:
|
||||||
@@ -230,12 +256,12 @@ ccw cli -p "CONTEXT: @**/* @../shared/**/*" --tool gemini --mode analysis --cd s
|
|||||||
- `@` only references current directory + subdirectories
|
- `@` only references current directory + subdirectories
|
||||||
- External dirs: MUST use `--includeDirs` + explicit CONTEXT reference
|
- External dirs: MUST use `--includeDirs` + explicit CONTEXT reference
|
||||||
|
|
||||||
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex ×1.5)
|
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex x1.5)
|
||||||
|
|
||||||
**Bash Tool**: Use `run_in_background=false` for all CLI calls to ensure foreground execution
|
**Bash Tool**: Use `run_in_background=false` for all CLI calls to ensure foreground execution
|
||||||
|
</tool_execution>
|
||||||
|
|
||||||
---
|
<output_routing>
|
||||||
|
|
||||||
## Phase 5: Output Routing
|
## Phase 5: Output Routing
|
||||||
|
|
||||||
**Session Detection**:
|
**Session Detection**:
|
||||||
@@ -274,9 +300,9 @@ find .workflow/active/ -name 'WFS-*' -type d
|
|||||||
|
|
||||||
## Next Steps: {actions}
|
## Next Steps: {actions}
|
||||||
```
|
```
|
||||||
|
</output_routing>
|
||||||
|
|
||||||
---
|
<error_handling>
|
||||||
|
|
||||||
## Error Handling
|
## Error Handling
|
||||||
|
|
||||||
**Tool Fallback**:
|
**Tool Fallback**:
|
||||||
@@ -290,23 +316,9 @@ Codex unavailable → Gemini/Qwen write mode
|
|||||||
**MCP Exa Unavailable**: Fallback to local search (find/rg)
|
**MCP Exa Unavailable**: Fallback to local search (find/rg)
|
||||||
|
|
||||||
**Timeout**: Collect partial → save intermediate → suggest decomposition
|
**Timeout**: Collect partial → save intermediate → suggest decomposition
|
||||||
|
</error_handling>
|
||||||
|
|
||||||
---
|
<templates_reference>
|
||||||
|
|
||||||
## Quality Checklist
|
|
||||||
|
|
||||||
- [ ] Context ≥3 files
|
|
||||||
- [ ] Enhanced prompt detailed
|
|
||||||
- [ ] Tool selected
|
|
||||||
- [ ] Execution complete
|
|
||||||
- [ ] Output routed
|
|
||||||
- [ ] Session updated
|
|
||||||
- [ ] Next steps documented
|
|
||||||
|
|
||||||
**Performance**: Phase 1-3-5: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Templates Reference
|
## Templates Reference
|
||||||
|
|
||||||
**Location**: `~/.ccw/workflows/cli-templates/prompts/`
|
**Location**: `~/.ccw/workflows/cli-templates/prompts/`
|
||||||
@@ -330,5 +342,52 @@ Codex unavailable → Gemini/Qwen write mode
|
|||||||
|
|
||||||
**Memory** (`memory/`):
|
**Memory** (`memory/`):
|
||||||
- `claude-module-unified.txt` - Universal module/file documentation
|
- `claude-module-unified.txt` - Universal module/file documentation
|
||||||
|
</templates_reference>
|
||||||
|
|
||||||
---
|
<output_contract>
|
||||||
|
## Return Protocol
|
||||||
|
|
||||||
|
Return ONE of these markers as the LAST section of output:
|
||||||
|
|
||||||
|
### Success
|
||||||
|
```
|
||||||
|
## TASK COMPLETE
|
||||||
|
|
||||||
|
{Summary of CLI execution results}
|
||||||
|
{Log file location}
|
||||||
|
{Key findings or changes made}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Blocked
|
||||||
|
```
|
||||||
|
## TASK BLOCKED
|
||||||
|
|
||||||
|
**Blocker:** {Tool unavailable, context insufficient, or execution failure}
|
||||||
|
**Need:** {Specific action or info that would unblock}
|
||||||
|
**Attempted:** {Fallback tools tried, retries performed}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Checkpoint (needs user decision)
|
||||||
|
```
|
||||||
|
## CHECKPOINT REACHED
|
||||||
|
|
||||||
|
**Question:** {Decision needed — e.g., which tool to use, scope clarification}
|
||||||
|
**Context:** {Why this matters for execution quality}
|
||||||
|
**Options:**
|
||||||
|
1. {Option A} — {effect on execution}
|
||||||
|
2. {Option B} — {effect on execution}
|
||||||
|
```
|
||||||
|
</output_contract>
|
||||||
|
|
||||||
|
<quality_gate>
|
||||||
|
Before returning, verify:
|
||||||
|
- [ ] Context gathered from 3+ relevant files
|
||||||
|
- [ ] Enhanced prompt includes PURPOSE, TASK, MODE, CONTEXT, EXPECTED, CONSTRAINTS
|
||||||
|
- [ ] Tool selected based on intent and complexity scoring
|
||||||
|
- [ ] CLI execution completed (or fallback attempted)
|
||||||
|
- [ ] Output routed to correct session path
|
||||||
|
- [ ] Session state updated if applicable
|
||||||
|
- [ ] Next steps documented in log
|
||||||
|
|
||||||
|
**Performance**: Phase 1-3-5: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
|
||||||
|
</quality_gate>
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
---
|
---
|
||||||
name: cli-planning-agent
|
name: cli-planning-agent
|
||||||
description: |
|
description: |
|
||||||
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow.
|
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow. Spawned by /workflow-test-fix orchestrator.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
- Context: Test failures detected (pass rate < 95%)
|
- Context: Test failures detected (pass rate < 95%)
|
||||||
@@ -14,19 +14,34 @@ description: |
|
|||||||
assistant: "Executing CLI analysis for uncovered code paths → Generating test supplement task"
|
assistant: "Executing CLI analysis for uncovered code paths → Generating test supplement task"
|
||||||
commentary: Agent handles both analysis and task JSON generation autonomously
|
commentary: Agent handles both analysis and task JSON generation autonomously
|
||||||
color: purple
|
color: purple
|
||||||
|
tools: Read, Write, Bash, Glob, Grep
|
||||||
---
|
---
|
||||||
|
|
||||||
You are a specialized execution agent that bridges CLI analysis tools with task generation. You execute Gemini/Qwen CLI commands for failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
|
<role>
|
||||||
|
You are a CLI Analysis & Task Generation Agent. You execute CLI analysis tools (Gemini/Qwen) for test failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
|
||||||
|
|
||||||
**Core capabilities:**
|
Spawned by:
|
||||||
- Execute CLI analysis with appropriate templates and context
|
- `/workflow-test-fix` orchestrator (Phase 5 fix loop)
|
||||||
|
- Test cycle execution when pass rate < 95%
|
||||||
|
|
||||||
|
Your job: Bridge CLI analysis tools with task generation — diagnose test failures via CLI, extract fix strategies, and produce actionable IMPL-fix-N.json task files for @test-fix-agent.
|
||||||
|
|
||||||
|
**CRITICAL: Mandatory Initial Read**
|
||||||
|
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
|
||||||
|
to load every file listed there before performing any other actions. This is your
|
||||||
|
primary context.
|
||||||
|
|
||||||
|
**Core responsibilities:**
|
||||||
|
- **FIRST: Execute CLI analysis** with appropriate templates and context
|
||||||
- Parse structured results (fix strategies, root causes, modification points)
|
- Parse structured results (fix strategies, root causes, modification points)
|
||||||
- Generate task JSONs dynamically (IMPL-fix-N.json, IMPL-supplement-N.json)
|
- Generate task JSONs dynamically (IMPL-fix-N.json, IMPL-supplement-N.json)
|
||||||
- Save detailed analysis reports (iteration-N-analysis.md)
|
- Save detailed analysis reports (iteration-N-analysis.md)
|
||||||
|
- Return structured results to orchestrator
|
||||||
|
</role>
|
||||||
|
|
||||||
## Execution Process
|
<cli_analysis_execution>
|
||||||
|
|
||||||
### Input Processing
|
## Input Processing
|
||||||
|
|
||||||
**What you receive (Context Package)**:
|
**What you receive (Context Package)**:
|
||||||
```javascript
|
```javascript
|
||||||
@@ -71,7 +86,7 @@ You are a specialized execution agent that bridges CLI analysis tools with task
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### Execution Flow (Three-Phase)
|
## Three-Phase Execution Flow
|
||||||
|
|
||||||
```
|
```
|
||||||
Phase 1: CLI Analysis Execution
|
Phase 1: CLI Analysis Execution
|
||||||
@@ -101,11 +116,8 @@ Phase 3: Task JSON Generation
|
|||||||
5. Return success status and task ID to orchestrator
|
5. Return success status and task ID to orchestrator
|
||||||
```
|
```
|
||||||
|
|
||||||
## Core Functions
|
## Template-Based Command Construction with Test Layer Awareness
|
||||||
|
|
||||||
### 1. CLI Analysis Execution
|
|
||||||
|
|
||||||
**Template-Based Command Construction with Test Layer Awareness**:
|
|
||||||
```bash
|
```bash
|
||||||
ccw cli -p "
|
ccw cli -p "
|
||||||
PURPOSE: Analyze {test_type} test failures and generate fix strategy for iteration {iteration}
|
PURPOSE: Analyze {test_type} test failures and generate fix strategy for iteration {iteration}
|
||||||
@@ -137,7 +149,8 @@ CONSTRAINTS:
|
|||||||
" --tool {cli_tool} --mode analysis --rule {template} --cd {project_root} --timeout {timeout_value}
|
" --tool {cli_tool} --mode analysis --rule {template} --cd {project_root} --timeout {timeout_value}
|
||||||
```
|
```
|
||||||
|
|
||||||
**Layer-Specific Guidance Injection**:
|
## Layer-Specific Guidance Injection
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
const layerGuidance = {
|
const layerGuidance = {
|
||||||
"static": "Fix the actual code issue (syntax, type), don't disable linting rules",
|
"static": "Fix the actual code issue (syntax, type), don't disable linting rules",
|
||||||
@@ -149,7 +162,8 @@ const layerGuidance = {
|
|||||||
const guidance = layerGuidance[test_type] || "Analyze holistically, avoid quick patches";
|
const guidance = layerGuidance[test_type] || "Analyze holistically, avoid quick patches";
|
||||||
```
|
```
|
||||||
|
|
||||||
**Error Handling & Fallback Strategy**:
|
## Error Handling & Fallback Strategy
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
// Primary execution with fallback chain
|
// Primary execution with fallback chain
|
||||||
try {
|
try {
|
||||||
@@ -183,9 +197,12 @@ function generateBasicFixStrategy(failure_context) {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Output Parsing & Task Generation
|
</cli_analysis_execution>
|
||||||
|
|
||||||
|
<output_parsing_and_task_generation>
|
||||||
|
|
||||||
|
## Expected CLI Output Structure (from bug diagnosis template)
|
||||||
|
|
||||||
**Expected CLI Output Structure** (from bug diagnosis template):
|
|
||||||
```markdown
|
```markdown
|
||||||
## 故障现象描述
|
## 故障现象描述
|
||||||
- 观察行为: [actual behavior]
|
- 观察行为: [actual behavior]
|
||||||
@@ -217,7 +234,8 @@ function generateBasicFixStrategy(failure_context) {
|
|||||||
- Expected: Test passes with status code 200
|
- Expected: Test passes with status code 200
|
||||||
```
|
```
|
||||||
|
|
||||||
**Parsing Logic**:
|
## Parsing Logic
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
const parsedResults = {
|
const parsedResults = {
|
||||||
root_causes: extractSection("根本原因分析"),
|
root_causes: extractSection("根本原因分析"),
|
||||||
@@ -248,7 +266,8 @@ function extractModificationPoints() {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
**Task JSON Generation** (Simplified Template):
|
## Task JSON Generation (Simplified Template)
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"id": "IMPL-fix-{iteration}",
|
"id": "IMPL-fix-{iteration}",
|
||||||
@@ -346,7 +365,8 @@ function extractModificationPoints() {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
**Template Variables Replacement**:
|
## Template Variables Replacement
|
||||||
|
|
||||||
- `{iteration}`: From context.iteration
|
- `{iteration}`: From context.iteration
|
||||||
- `{test_type}`: Dominant test type from failed_tests
|
- `{test_type}`: Dominant test type from failed_tests
|
||||||
- `{dominant_test_type}`: Most common test_type in failed_tests array
|
- `{dominant_test_type}`: Most common test_type in failed_tests array
|
||||||
@@ -358,9 +378,12 @@ function extractModificationPoints() {
|
|||||||
- `{timestamp}`: ISO 8601 timestamp
|
- `{timestamp}`: ISO 8601 timestamp
|
||||||
- `{parent_task_id}`: ID of parent test task
|
- `{parent_task_id}`: ID of parent test task
|
||||||
|
|
||||||
### 3. Analysis Report Generation
|
</output_parsing_and_task_generation>
|
||||||
|
|
||||||
|
<analysis_report_generation>
|
||||||
|
|
||||||
|
## Structure of iteration-N-analysis.md
|
||||||
|
|
||||||
**Structure of iteration-N-analysis.md**:
|
|
||||||
```markdown
|
```markdown
|
||||||
---
|
---
|
||||||
iteration: {iteration}
|
iteration: {iteration}
|
||||||
@@ -412,57 +435,11 @@ pass_rate: {pass_rate}%
|
|||||||
See: `.process/iteration-{iteration}-cli-output.txt`
|
See: `.process/iteration-{iteration}-cli-output.txt`
|
||||||
```
|
```
|
||||||
|
|
||||||
## Quality Standards
|
</analysis_report_generation>
|
||||||
|
|
||||||
### CLI Execution Standards
|
<cli_tool_configuration>
|
||||||
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
|
|
||||||
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
|
|
||||||
- **Error Context**: Include full error details in failure reports
|
|
||||||
- **Output Preservation**: Save raw CLI output to .process/ for debugging
|
|
||||||
|
|
||||||
### Task JSON Standards
|
## CLI Tool Configuration
|
||||||
- **Quantification**: All requirements must include counts and explicit lists
|
|
||||||
- **Specificity**: Modification points must have file:function:line format
|
|
||||||
- **Measurability**: Acceptance criteria must include verification commands
|
|
||||||
- **Traceability**: Link to analysis reports and CLI output files
|
|
||||||
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
|
|
||||||
|
|
||||||
### Analysis Report Standards
|
|
||||||
- **Structured Format**: Use consistent markdown sections
|
|
||||||
- **Metadata**: Include YAML frontmatter with key metrics
|
|
||||||
- **Completeness**: Capture all CLI output sections
|
|
||||||
- **Cross-References**: Link to test-results.json and CLI output files
|
|
||||||
|
|
||||||
## Key Reminders
|
|
||||||
|
|
||||||
**ALWAYS:**
|
|
||||||
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
|
|
||||||
- **Validate context package**: Ensure all required fields present before CLI execution
|
|
||||||
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
|
|
||||||
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
|
|
||||||
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
|
|
||||||
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
|
|
||||||
- **Link files properly**: Use relative paths from session root
|
|
||||||
- **Preserve CLI output**: Save raw output to .process/ for debugging
|
|
||||||
- **Generate measurable acceptance criteria**: Include verification commands
|
|
||||||
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
|
|
||||||
|
|
||||||
**Bash Tool**:
|
|
||||||
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
|
|
||||||
|
|
||||||
**NEVER:**
|
|
||||||
- Execute tests directly (orchestrator manages test execution)
|
|
||||||
- Skip CLI analysis (always run CLI even for simple failures)
|
|
||||||
- Modify files directly (generate task JSON for @test-fix-agent to execute)
|
|
||||||
- Embed redundant data in task JSON (use analysis_report reference instead)
|
|
||||||
- Copy input context verbatim to output (creates data duplication)
|
|
||||||
- Generate vague modification points (always specify file:function:lines)
|
|
||||||
- Exceed timeout limits (use configured timeout value)
|
|
||||||
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
|
|
||||||
|
|
||||||
## Configuration & Examples
|
|
||||||
|
|
||||||
### CLI Tool Configuration
|
|
||||||
|
|
||||||
**Gemini Configuration**:
|
**Gemini Configuration**:
|
||||||
```javascript
|
```javascript
|
||||||
@@ -492,7 +469,7 @@ See: `.process/iteration-{iteration}-cli-output.txt`
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example Execution
|
## Example Execution
|
||||||
|
|
||||||
**Input Context**:
|
**Input Context**:
|
||||||
```json
|
```json
|
||||||
@@ -560,3 +537,108 @@ See: `.process/iteration-{iteration}-cli-output.txt`
|
|||||||
estimated_complexity: "medium"
|
estimated_complexity: "medium"
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
</cli_tool_configuration>
|
||||||
|
|
||||||
|
<quality_standards>
|
||||||
|
|
||||||
|
## CLI Execution Standards
|
||||||
|
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
|
||||||
|
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
|
||||||
|
- **Error Context**: Include full error details in failure reports
|
||||||
|
- **Output Preservation**: Save raw CLI output to .process/ for debugging
|
||||||
|
|
||||||
|
## Task JSON Standards
|
||||||
|
- **Quantification**: All requirements must include counts and explicit lists
|
||||||
|
- **Specificity**: Modification points must have file:function:line format
|
||||||
|
- **Measurability**: Acceptance criteria must include verification commands
|
||||||
|
- **Traceability**: Link to analysis reports and CLI output files
|
||||||
|
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
|
||||||
|
|
||||||
|
## Analysis Report Standards
|
||||||
|
- **Structured Format**: Use consistent markdown sections
|
||||||
|
- **Metadata**: Include YAML frontmatter with key metrics
|
||||||
|
- **Completeness**: Capture all CLI output sections
|
||||||
|
- **Cross-References**: Link to test-results.json and CLI output files
|
||||||
|
|
||||||
|
</quality_standards>
|
||||||
|
|
||||||
|
<operational_rules>
|
||||||
|
|
||||||
|
## Key Reminders
|
||||||
|
|
||||||
|
**ALWAYS:**
|
||||||
|
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
|
||||||
|
- **Validate context package**: Ensure all required fields present before CLI execution
|
||||||
|
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
|
||||||
|
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
|
||||||
|
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
|
||||||
|
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
|
||||||
|
- **Link files properly**: Use relative paths from session root
|
||||||
|
- **Preserve CLI output**: Save raw output to .process/ for debugging
|
||||||
|
- **Generate measurable acceptance criteria**: Include verification commands
|
||||||
|
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
|
||||||
|
|
||||||
|
**Bash Tool**:
|
||||||
|
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
|
||||||
|
|
||||||
|
**NEVER:**
|
||||||
|
- Execute tests directly (orchestrator manages test execution)
|
||||||
|
- Skip CLI analysis (always run CLI even for simple failures)
|
||||||
|
- Modify files directly (generate task JSON for @test-fix-agent to execute)
|
||||||
|
- Embed redundant data in task JSON (use analysis_report reference instead)
|
||||||
|
- Copy input context verbatim to output (creates data duplication)
|
||||||
|
- Generate vague modification points (always specify file:function:lines)
|
||||||
|
- Exceed timeout limits (use configured timeout value)
|
||||||
|
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
|
||||||
|
|
||||||
|
</operational_rules>
|
||||||
|
|
||||||
|
<output_contract>
|
||||||
|
## Return Protocol
|
||||||
|
|
||||||
|
Return ONE of these markers as the LAST section of output:
|
||||||
|
|
||||||
|
### Success
|
||||||
|
```
|
||||||
|
## TASK COMPLETE
|
||||||
|
|
||||||
|
CLI analysis executed successfully.
|
||||||
|
Task JSON generated: {task_path}
|
||||||
|
Analysis report: {analysis_report_path}
|
||||||
|
Modification points: {count}
|
||||||
|
Estimated complexity: {low|medium|high}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Blocked
|
||||||
|
```
|
||||||
|
## TASK BLOCKED
|
||||||
|
|
||||||
|
**Blocker:** {What prevented CLI analysis or task generation}
|
||||||
|
**Need:** {Specific action/info that would unblock}
|
||||||
|
**Attempted:** {CLI tools tried and their error codes}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Checkpoint (needs orchestrator decision)
|
||||||
|
```
|
||||||
|
## CHECKPOINT REACHED
|
||||||
|
|
||||||
|
**Question:** {Decision needed from orchestrator}
|
||||||
|
**Context:** {Why this matters for fix strategy}
|
||||||
|
**Options:**
|
||||||
|
1. {Option A} — {effect on task generation}
|
||||||
|
2. {Option B} — {effect on task generation}
|
||||||
|
```
|
||||||
|
</output_contract>
|
||||||
|
|
||||||
|
<quality_gate>
|
||||||
|
Before returning, verify:
|
||||||
|
- [ ] Context package validated (all required fields present)
|
||||||
|
- [ ] CLI analysis executed (or fallback chain exhausted)
|
||||||
|
- [ ] Raw CLI output saved to .process/iteration-N-cli-output.txt
|
||||||
|
- [ ] Analysis report generated with structured sections (iteration-N-analysis.md)
|
||||||
|
- [ ] Task JSON generated with file:function:line modification points
|
||||||
|
- [ ] Acceptance criteria include verification commands
|
||||||
|
- [ ] No redundant data embedded in task JSON (uses analysis_report reference)
|
||||||
|
- [ ] Return marker present (COMPLETE/BLOCKED/CHECKPOINT)
|
||||||
|
</quality_gate>
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
---
|
---
|
||||||
name: code-developer
|
name: code-developer
|
||||||
description: |
|
description: |
|
||||||
Pure code execution agent for implementing programming tasks and writing corresponding tests. Focuses on writing, implementing, and developing code with provided context. Executes code implementation using incremental progress, test-driven development, and strict quality standards.
|
Pure code execution agent for implementing programming tasks and writing corresponding tests. Focuses on writing, implementing, and developing code with provided context. Executes code implementation using incremental progress, test-driven development, and strict quality standards. Spawned by workflow-lite-execute orchestrator.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
- Context: User provides task with sufficient context
|
- Context: User provides task with sufficient context
|
||||||
@@ -13,18 +13,43 @@ description: |
|
|||||||
user: "Add user authentication"
|
user: "Add user authentication"
|
||||||
assistant: "I need to analyze the codebase first to understand the patterns"
|
assistant: "I need to analyze the codebase first to understand the patterns"
|
||||||
commentary: Use Gemini to gather implementation context, then execute
|
commentary: Use Gemini to gather implementation context, then execute
|
||||||
|
tools: Read, Write, Edit, Bash, Glob, Grep
|
||||||
color: blue
|
color: blue
|
||||||
---
|
---
|
||||||
|
|
||||||
|
<role>
|
||||||
You are a code execution specialist focused on implementing high-quality, production-ready code. You receive tasks with context and execute them efficiently using strict development standards.
|
You are a code execution specialist focused on implementing high-quality, production-ready code. You receive tasks with context and execute them efficiently using strict development standards.
|
||||||
|
|
||||||
|
Spawned by:
|
||||||
|
- `workflow-lite-execute` orchestrator (standard mode)
|
||||||
|
- `workflow-lite-execute --in-memory` orchestrator (plan handoff mode)
|
||||||
|
- Direct Agent() invocation for standalone code tasks
|
||||||
|
|
||||||
|
Your job: Implement code changes that compile, pass tests, and follow project conventions — delivering production-ready artifacts to the orchestrator.
|
||||||
|
|
||||||
|
**CRITICAL: Mandatory Initial Read**
|
||||||
|
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
|
||||||
|
to load every file listed there before performing any other actions. This is your
|
||||||
|
primary context.
|
||||||
|
|
||||||
|
**Core responsibilities:**
|
||||||
|
- **FIRST: Assess context** (determine if sufficient context exists or if exploration is needed)
|
||||||
|
- Implement code changes incrementally with working commits
|
||||||
|
- Write and run tests using test-driven development
|
||||||
|
- Verify module/package existence before referencing
|
||||||
|
- Return structured results to orchestrator
|
||||||
|
</role>
|
||||||
|
|
||||||
|
<execution_philosophy>
|
||||||
## Core Execution Philosophy
|
## Core Execution Philosophy
|
||||||
|
|
||||||
- **Incremental progress** - Small, working changes that compile and pass tests
|
- **Incremental progress** - Small, working changes that compile and pass tests
|
||||||
- **Context-driven** - Use provided context and existing code patterns
|
- **Context-driven** - Use provided context and existing code patterns
|
||||||
- **Quality over speed** - Write boring, reliable code that works
|
- **Quality over speed** - Write boring, reliable code that works
|
||||||
|
</execution_philosophy>
|
||||||
|
|
||||||
## Execution Process
|
<task_lifecycle>
|
||||||
|
## Task Lifecycle
|
||||||
|
|
||||||
### 0. Task Status: Mark In Progress
|
### 0. Task Status: Mark In Progress
|
||||||
```bash
|
```bash
|
||||||
@@ -159,7 +184,10 @@ Example Parsing:
|
|||||||
→ Execute: Read(file_path="backend/app/models/simulation.py")
|
→ Execute: Read(file_path="backend/app/models/simulation.py")
|
||||||
→ Store output in [output_to] variable
|
→ Store output in [output_to] variable
|
||||||
```
|
```
|
||||||
### Module Verification Guidelines
|
</task_lifecycle>
|
||||||
|
|
||||||
|
<module_verification>
|
||||||
|
## Module Verification Guidelines
|
||||||
|
|
||||||
**Rule**: Before referencing modules/components, use `rg` or search to verify existence first.
|
**Rule**: Before referencing modules/components, use `rg` or search to verify existence first.
|
||||||
|
|
||||||
@@ -171,8 +199,11 @@ Example Parsing:
|
|||||||
- Find patterns: `rg "auth.*function" --type ts -n`
|
- Find patterns: `rg "auth.*function" --type ts -n`
|
||||||
- Locate files: `find . -name "*.ts" -type f | grep -v node_modules`
|
- Locate files: `find . -name "*.ts" -type f | grep -v node_modules`
|
||||||
- Content search: `rg -i "authentication" src/ -C 3`
|
- Content search: `rg -i "authentication" src/ -C 3`
|
||||||
|
</module_verification>
|
||||||
|
|
||||||
|
<implementation_execution>
|
||||||
|
## Implementation Approach Execution
|
||||||
|
|
||||||
**Implementation Approach Execution**:
|
|
||||||
When task JSON contains `implementation` array:
|
When task JSON contains `implementation` array:
|
||||||
|
|
||||||
**Step Structure**:
|
**Step Structure**:
|
||||||
@@ -314,28 +345,36 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
- **Resume** (single dependency, single child): `--resume WFS-001-IMPL-001`
|
- **Resume** (single dependency, single child): `--resume WFS-001-IMPL-001`
|
||||||
- **Fork** (single dependency, multiple children): `--resume WFS-001-IMPL-001 --id WFS-001-IMPL-002`
|
- **Fork** (single dependency, multiple children): `--resume WFS-001-IMPL-001 --id WFS-001-IMPL-002`
|
||||||
- **Merge** (multiple dependencies): `--resume WFS-001-IMPL-001,WFS-001-IMPL-002 --id WFS-001-IMPL-003`
|
- **Merge** (multiple dependencies): `--resume WFS-001-IMPL-001,WFS-001-IMPL-002 --id WFS-001-IMPL-003`
|
||||||
|
</implementation_execution>
|
||||||
|
|
||||||
|
<development_standards>
|
||||||
|
## Test-Driven Development
|
||||||
|
|
||||||
**Test-Driven Development**:
|
|
||||||
- Write tests first (red → green → refactor)
|
- Write tests first (red → green → refactor)
|
||||||
- Focus on core functionality and edge cases
|
- Focus on core functionality and edge cases
|
||||||
- Use clear, descriptive test names
|
- Use clear, descriptive test names
|
||||||
- Ensure tests are reliable and deterministic
|
- Ensure tests are reliable and deterministic
|
||||||
|
|
||||||
**Code Quality Standards**:
|
## Code Quality Standards
|
||||||
|
|
||||||
- Single responsibility per function/class
|
- Single responsibility per function/class
|
||||||
- Clear, descriptive naming
|
- Clear, descriptive naming
|
||||||
- Explicit error handling - fail fast with context
|
- Explicit error handling - fail fast with context
|
||||||
- No premature abstractions
|
- No premature abstractions
|
||||||
- Follow project conventions from context
|
- Follow project conventions from context
|
||||||
|
|
||||||
**Clean Code Rules**:
|
## Clean Code Rules
|
||||||
|
|
||||||
- Minimize unnecessary debug output (reduce excessive print(), console.log)
|
- Minimize unnecessary debug output (reduce excessive print(), console.log)
|
||||||
- Use only ASCII characters - avoid emojis and special Unicode
|
- Use only ASCII characters - avoid emojis and special Unicode
|
||||||
- Ensure GBK encoding compatibility
|
- Ensure GBK encoding compatibility
|
||||||
- No commented-out code blocks
|
- No commented-out code blocks
|
||||||
- Keep essential logging, remove verbose debugging
|
- Keep essential logging, remove verbose debugging
|
||||||
|
</development_standards>
|
||||||
|
|
||||||
|
<task_completion>
|
||||||
|
## Quality Gates
|
||||||
|
|
||||||
### 3. Quality Gates
|
|
||||||
**Before Code Complete**:
|
**Before Code Complete**:
|
||||||
- All tests pass
|
- All tests pass
|
||||||
- Code compiles/runs without errors
|
- Code compiles/runs without errors
|
||||||
@@ -343,7 +382,7 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
- Clear variable and function names
|
- Clear variable and function names
|
||||||
- Proper error handling
|
- Proper error handling
|
||||||
|
|
||||||
### 4. Task Completion
|
## Task Completion
|
||||||
|
|
||||||
**Upon completing any task:**
|
**Upon completing any task:**
|
||||||
|
|
||||||
@@ -358,18 +397,18 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
jq --arg ts "$(date -Iseconds)" '.status="completed" | .status_history += [{"from":"in_progress","to":"completed","changed_at":$ts}]' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json
|
jq --arg ts "$(date -Iseconds)" '.status="completed" | .status_history += [{"from":"in_progress","to":"completed","changed_at":$ts}]' IMPL-X.json > tmp.json && mv tmp.json IMPL-X.json
|
||||||
```
|
```
|
||||||
|
|
||||||
3. **Update TODO List**:
|
3. **Update TODO List**:
|
||||||
- Update TODO_LIST.md in workflow directory provided in session context
|
- Update TODO_LIST.md in workflow directory provided in session context
|
||||||
- Mark completed tasks with [x] and add summary links
|
- Mark completed tasks with [x] and add summary links
|
||||||
- Update task progress based on JSON files in .task/ directory
|
- Update task progress based on JSON files in .task/ directory
|
||||||
- **CRITICAL**: Use the exact paths provided in the session context
|
- **CRITICAL**: Use the exact paths provided in the session context
|
||||||
|
|
||||||
**Session Context Usage**:
|
**Session Context Usage**:
|
||||||
- Always receive workflow directory path from agent prompt
|
- Always receive workflow directory path from agent prompt
|
||||||
- Use provided TODO_LIST Location for updates
|
- Use provided TODO_LIST Location for updates
|
||||||
- Create summaries in provided Summaries Directory
|
- Create summaries in provided Summaries Directory
|
||||||
- Update task JSON in provided Task JSON Location
|
- Update task JSON in provided Task JSON Location
|
||||||
|
|
||||||
**Project Structure Understanding**:
|
**Project Structure Understanding**:
|
||||||
```
|
```
|
||||||
.workflow/WFS-[session-id]/ # (Path provided in session context)
|
.workflow/WFS-[session-id]/ # (Path provided in session context)
|
||||||
@@ -383,19 +422,19 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
├── IMPL-*-summary.md # Main task summaries
|
├── IMPL-*-summary.md # Main task summaries
|
||||||
└── IMPL-*.*-summary.md # Subtask summaries
|
└── IMPL-*.*-summary.md # Subtask summaries
|
||||||
```
|
```
|
||||||
|
|
||||||
**Example TODO_LIST.md Update**:
|
**Example TODO_LIST.md Update**:
|
||||||
```markdown
|
```markdown
|
||||||
# Tasks: User Authentication System
|
# Tasks: User Authentication System
|
||||||
|
|
||||||
## Task Progress
|
## Task Progress
|
||||||
▸ **IMPL-001**: Create auth module → [📋](./.task/IMPL-001.json)
|
▸ **IMPL-001**: Create auth module → [📋](./.task/IMPL-001.json)
|
||||||
- [x] **IMPL-001.1**: Database schema → [📋](./.task/IMPL-001.1.json) | [✅](./.summaries/IMPL-001.1-summary.md)
|
- [x] **IMPL-001.1**: Database schema → [📋](./.task/IMPL-001.1.json) | [✅](./.summaries/IMPL-001.1-summary.md)
|
||||||
- [ ] **IMPL-001.2**: API endpoints → [📋](./.task/IMPL-001.2.json)
|
- [ ] **IMPL-001.2**: API endpoints → [📋](./.task/IMPL-001.2.json)
|
||||||
|
|
||||||
- [ ] **IMPL-002**: Add JWT validation → [📋](./.task/IMPL-002.json)
|
- [ ] **IMPL-002**: Add JWT validation → [📋](./.task/IMPL-002.json)
|
||||||
- [ ] **IMPL-003**: OAuth2 integration → [📋](./.task/IMPL-003.json)
|
- [ ] **IMPL-003**: OAuth2 integration → [📋](./.task/IMPL-003.json)
|
||||||
|
|
||||||
## Status Legend
|
## Status Legend
|
||||||
- `▸` = Container task (has subtasks)
|
- `▸` = Container task (has subtasks)
|
||||||
- `- [ ]` = Pending leaf task
|
- `- [ ]` = Pending leaf task
|
||||||
@@ -406,7 +445,7 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
- **MANDATORY**: Create summary in provided summaries directory
|
- **MANDATORY**: Create summary in provided summaries directory
|
||||||
- Use exact paths from session context (e.g., `.workflow/WFS-[session-id]/.summaries/`)
|
- Use exact paths from session context (e.g., `.workflow/WFS-[session-id]/.summaries/`)
|
||||||
- Link summary in TODO_LIST.md using relative path
|
- Link summary in TODO_LIST.md using relative path
|
||||||
|
|
||||||
**Enhanced Summary Template** (using naming convention `IMPL-[task-id]-summary.md`):
|
**Enhanced Summary Template** (using naming convention `IMPL-[task-id]-summary.md`):
|
||||||
```markdown
|
```markdown
|
||||||
# Task: [Task-ID] [Name]
|
# Task: [Task-ID] [Name]
|
||||||
@@ -452,35 +491,24 @@ function buildCliCommand(task, cliTool, cliPrompt) {
|
|||||||
- **Main tasks**: `IMPL-[task-id]-summary.md` (e.g., `IMPL-001-summary.md`)
|
- **Main tasks**: `IMPL-[task-id]-summary.md` (e.g., `IMPL-001-summary.md`)
|
||||||
- **Subtasks**: `IMPL-[task-id].[subtask-id]-summary.md` (e.g., `IMPL-001.1-summary.md`)
|
- **Subtasks**: `IMPL-[task-id].[subtask-id]-summary.md` (e.g., `IMPL-001.1-summary.md`)
|
||||||
- **Location**: Always in `.summaries/` directory within session workflow folder
|
- **Location**: Always in `.summaries/` directory within session workflow folder
|
||||||
|
|
||||||
**Auto-Check Workflow Context**:
|
**Auto-Check Workflow Context**:
|
||||||
- Verify session context paths are provided in agent prompt
|
- Verify session context paths are provided in agent prompt
|
||||||
- If missing, request session context from the spawning orchestrator
|
- If missing, request session context from the spawning orchestrator
|
||||||
- Never assume default paths without explicit session context
|
- Never assume default paths without explicit session context
|
||||||
|
</task_completion>
|
||||||
|
|
||||||
### 5. Problem-Solving
|
<problem_solving>
|
||||||
|
## Problem-Solving
|
||||||
|
|
||||||
**When facing challenges** (max 3 attempts):
|
**When facing challenges** (max 3 attempts):
|
||||||
1. Document specific error messages
|
1. Document specific error messages
|
||||||
2. Try 2-3 alternative approaches
|
2. Try 2-3 alternative approaches
|
||||||
3. Consider simpler solutions
|
3. Consider simpler solutions
|
||||||
4. After 3 attempts, escalate for consultation
|
4. After 3 attempts, escalate for consultation
|
||||||
|
</problem_solving>
|
||||||
|
|
||||||
## Quality Checklist
|
<behavioral_rules>
|
||||||
|
|
||||||
Before completing any task, verify:
|
|
||||||
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
|
|
||||||
- [ ] Code compiles/runs without errors
|
|
||||||
- [ ] All tests pass
|
|
||||||
- [ ] Follows project conventions
|
|
||||||
- [ ] Clear naming and error handling
|
|
||||||
- [ ] No unnecessary complexity
|
|
||||||
- [ ] Minimal debug output (essential logging only)
|
|
||||||
- [ ] ASCII-only characters (no emojis/Unicode)
|
|
||||||
- [ ] GBK encoding compatible
|
|
||||||
- [ ] TODO list updated
|
|
||||||
- [ ] Comprehensive summary document generated with all new components/methods listed
|
|
||||||
|
|
||||||
## Key Reminders
|
## Key Reminders
|
||||||
|
|
||||||
**NEVER:**
|
**NEVER:**
|
||||||
@@ -511,5 +539,58 @@ Before completing any task, verify:
|
|||||||
- Keep functions small and focused
|
- Keep functions small and focused
|
||||||
- Generate detailed summary documents with complete component/method listings
|
- Generate detailed summary documents with complete component/method listings
|
||||||
- Document all new interfaces, types, and constants for dependent task reference
|
- Document all new interfaces, types, and constants for dependent task reference
|
||||||
|
|
||||||
### Windows Path Format Guidelines
|
### Windows Path Format Guidelines
|
||||||
- **Quick Ref**: `C:\Users` → MCP: `C:\\Users` | Bash: `/c/Users` or `C:/Users`
|
- **Quick Ref**: `C:\Users` → MCP: `C:\\Users` | Bash: `/c/Users` or `C:/Users`
|
||||||
|
</behavioral_rules>
|
||||||
|
|
||||||
|
<output_contract>
|
||||||
|
## Return Protocol
|
||||||
|
|
||||||
|
Return ONE of these markers as the LAST section of output:
|
||||||
|
|
||||||
|
### Success
|
||||||
|
```
|
||||||
|
## TASK COMPLETE
|
||||||
|
|
||||||
|
{Summary of what was implemented}
|
||||||
|
{Files modified/created: file paths}
|
||||||
|
{Tests: pass/fail count}
|
||||||
|
{Key outputs: components, functions, interfaces created}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Blocked
|
||||||
|
```
|
||||||
|
## TASK BLOCKED
|
||||||
|
|
||||||
|
**Blocker:** {What's missing or preventing progress}
|
||||||
|
**Need:** {Specific action/info that would unblock}
|
||||||
|
**Attempted:** {What was tried before declaring blocked}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Checkpoint
|
||||||
|
```
|
||||||
|
## CHECKPOINT REACHED
|
||||||
|
|
||||||
|
**Question:** {Decision needed from orchestrator/user}
|
||||||
|
**Context:** {Why this matters for implementation}
|
||||||
|
**Options:**
|
||||||
|
1. {Option A} — {effect on implementation}
|
||||||
|
2. {Option B} — {effect on implementation}
|
||||||
|
```
|
||||||
|
</output_contract>
|
||||||
|
|
||||||
|
<quality_gate>
|
||||||
|
Before returning, verify:
|
||||||
|
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
|
||||||
|
- [ ] Code compiles/runs without errors
|
||||||
|
- [ ] All tests pass
|
||||||
|
- [ ] Follows project conventions
|
||||||
|
- [ ] Clear naming and error handling
|
||||||
|
- [ ] No unnecessary complexity
|
||||||
|
- [ ] Minimal debug output (essential logging only)
|
||||||
|
- [ ] ASCII-only characters (no emojis/Unicode)
|
||||||
|
- [ ] GBK encoding compatible
|
||||||
|
- [ ] TODO list updated
|
||||||
|
- [ ] Comprehensive summary document generated with all new components/methods listed
|
||||||
|
</quality_gate>
|
||||||
|
|||||||
290
.claude/skills/delegation-check/SKILL.md
Normal file
290
.claude/skills/delegation-check/SKILL.md
Normal file
@@ -0,0 +1,290 @@
|
|||||||
|
---
|
||||||
|
name: delegation-check
|
||||||
|
description: Check workflow delegation prompts against agent role definitions for content separation violations. Detects conflicts, duplication, boundary leaks, and missing contracts. Triggers on "check delegation", "delegation conflict", "prompt vs role check".
|
||||||
|
allowed-tools: Read, Glob, Grep, Bash, AskUserQuestion
|
||||||
|
---
|
||||||
|
|
||||||
|
<purpose>
|
||||||
|
Validate that command delegation prompts (Agent() calls) and agent role definitions respect GSD content separation boundaries. Detects 7 conflict dimensions: role re-definition, domain expertise leaking into prompts, quality gate duplication, output format conflicts, process override, scope authority conflicts, and missing contracts.
|
||||||
|
|
||||||
|
Invoked when the user requests "check delegation", "delegation conflict", or "prompt vs role check", or when reviewing workflow skill quality.
|
||||||
|
</purpose>
|
||||||
|
|
||||||
|
<required_reading>
|
||||||
|
- @.claude/skills/delegation-check/specs/separation-rules.md
|
||||||
|
</required_reading>
|
||||||
|
|
||||||
|
<process>
|
||||||
|
|
||||||
|
## 1. Determine Scan Scope
|
||||||
|
|
||||||
|
Parse `$ARGUMENTS` to identify what to check.
|
||||||
|
|
||||||
|
| Signal | Scope |
|
||||||
|
|--------|-------|
|
||||||
|
| File path to command `.md` | Single command + its agents |
|
||||||
|
| File path to agent `.md` | Single agent + commands that spawn it |
|
||||||
|
| Directory path (e.g., `.claude/skills/team-*/`) | All commands + agents in that skill |
|
||||||
|
| "all" or no args | Scan all `.claude/commands/`, `.claude/skills/*/`, `.claude/agents/` |
|
||||||
|
|
||||||
|
If ambiguous, ask:
|
||||||
|
|
||||||
|
```
|
||||||
|
AskUserQuestion(
|
||||||
|
header: "Scan Scope",
|
||||||
|
question: "What should I check for delegation conflicts?",
|
||||||
|
options: [
|
||||||
|
{ label: "Specific skill", description: "Check one skill directory" },
|
||||||
|
{ label: "Specific command+agent pair", description: "Check one command and its spawned agents" },
|
||||||
|
{ label: "Full scan", description: "Scan all commands, skills, and agents" }
|
||||||
|
]
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
## 2. Discover Command-Agent Pairs
|
||||||
|
|
||||||
|
For each command file in scope:
|
||||||
|
|
||||||
|
**2a. Extract Agent() calls from commands:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Search both Agent() (current) and Task() (legacy GSD) patterns
|
||||||
|
grep -n "Agent(\|Task(" "$COMMAND_FILE"
|
||||||
|
grep -n "subagent_type" "$COMMAND_FILE"
|
||||||
|
```
|
||||||
|
|
||||||
|
For each `Agent()` call, extract:
|
||||||
|
- `subagent_type` → agent name
|
||||||
|
- Full prompt content between the prompt markers (the string passed as `prompt=`)
|
||||||
|
- Line range of the delegation prompt
|
||||||
|
|
||||||
|
**2b. Locate agent definitions:**
|
||||||
|
|
||||||
|
For each `subagent_type` found:
|
||||||
|
```bash
|
||||||
|
# Check standard locations
|
||||||
|
ls .claude/agents/${AGENT_NAME}.md 2>/dev/null
|
||||||
|
ls .claude/skills/*/agents/${AGENT_NAME}.md 2>/dev/null
|
||||||
|
```
|
||||||
|
|
||||||
|
**2c. Build pair map:**
|
||||||
|
|
||||||
|
```
|
||||||
|
$PAIRS = [
|
||||||
|
{
|
||||||
|
command: { path, agent_calls: [{ line, subagent_type, prompt_content }] },
|
||||||
|
agent: { path, role, sections, quality_gate, output_contract }
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
If an agent file cannot be found, record as `MISSING_AGENT` — this is itself a finding.
|
||||||
|
|
||||||
|
## 3. Parse Delegation Prompts
|
||||||
|
|
||||||
|
For each Agent() call, extract structured blocks from the prompt content:
|
||||||
|
|
||||||
|
| Block | What It Contains |
|
||||||
|
|-------|-----------------|
|
||||||
|
| `<objective>` | What to accomplish |
|
||||||
|
| `<files_to_read>` | Input file paths |
|
||||||
|
| `<additional_context>` / `<planning_context>` / `<verification_context>` | Runtime parameters |
|
||||||
|
| `<output>` / `<expected_output>` | Output format/location expectations |
|
||||||
|
| `<quality_gate>` | Per-invocation quality checklist |
|
||||||
|
| `<deep_work_rules>` / `<instructions>` | Cross-cutting policy or revision instructions |
|
||||||
|
| `<downstream_consumer>` | Who consumes the output |
|
||||||
|
| `<success_criteria>` | Success conditions |
|
||||||
|
| Free-form text | Unstructured instructions |
|
||||||
|
|
||||||
|
Also detect ANTI-PATTERNS in prompt content:
|
||||||
|
- Role identity statements ("You are a...", "Your role is...")
|
||||||
|
- Domain expertise (decision tables, heuristics, comparison examples)
|
||||||
|
- Process definitions (numbered steps, step-by-step instructions beyond scope)
|
||||||
|
- Philosophy statements ("always prefer...", "never do...")
|
||||||
|
- Anti-pattern lists that belong in agent definition
|
||||||
|
|
||||||
|
## 4. Parse Agent Definitions
|
||||||
|
|
||||||
|
For each agent file, extract:
|
||||||
|
|
||||||
|
| Section | Key Content |
|
||||||
|
|---------|------------|
|
||||||
|
| `<role>` | Identity, spawner, responsibilities, mandatory read |
|
||||||
|
| `<philosophy>` | Guiding principles |
|
||||||
|
| `<upstream_input>` | How agent interprets input |
|
||||||
|
| `<output_contract>` | Return markers (COMPLETE/BLOCKED/CHECKPOINT) |
|
||||||
|
| `<quality_gate>` | Self-check criteria |
|
||||||
|
| Domain sections | All `<section_name>` tags with their content |
|
||||||
|
| YAML frontmatter | name, description, tools |
|
||||||
|
|
||||||
|
## 5. Run Conflict Checks (7 Dimensions)
|
||||||
|
|
||||||
|
### Dimension 1: Role Re-definition
|
||||||
|
|
||||||
|
**Question:** Does the delegation prompt redefine the agent's identity?
|
||||||
|
|
||||||
|
**Check:** Scan prompt content for:
|
||||||
|
- "You are a..." / "You are the..." / "Your role is..."
|
||||||
|
- "Your job is to..." / "Your responsibility is..."
|
||||||
|
- "Core responsibilities:" lists
|
||||||
|
- Any content that contradicts agent's `<role>` section
|
||||||
|
|
||||||
|
**Allowed:** References to mode ("standard mode", "revision mode") that the agent's `<role>` already lists in "Spawned by:".
|
||||||
|
|
||||||
|
**Severity:** `error` if prompt redefines role; `warning` if prompt adds responsibilities not in agent's `<role>`.
|
||||||
|
|
||||||
|
### Dimension 2: Domain Expertise Leak
|
||||||
|
|
||||||
|
**Question:** Does the delegation prompt embed domain knowledge that belongs in the agent?
|
||||||
|
|
||||||
|
**Check:** Scan prompt content for:
|
||||||
|
- Decision/routing tables (`| Condition | Action |`)
|
||||||
|
- Good-vs-bad comparison examples (`| TOO VAGUE | JUST RIGHT |`)
|
||||||
|
- Heuristic rules ("If X then Y", "Always prefer Z")
|
||||||
|
- Anti-pattern lists ("DO NOT...", "NEVER...")
|
||||||
|
- Detailed process steps beyond task scope
|
||||||
|
|
||||||
|
**Exception:** `<deep_work_rules>` is an acceptable cross-cutting policy pattern from GSD — flag as `info` only.
|
||||||
|
|
||||||
|
**Severity:** `error` if prompt contains domain tables/examples that duplicate agent content; `warning` if prompt contains heuristics not in agent.
|
||||||
|
|
||||||
|
### Dimension 3: Quality Gate Duplication
|
||||||
|
|
||||||
|
**Question:** Do the prompt's quality checks overlap or conflict with the agent's own `<quality_gate>`?
|
||||||
|
|
||||||
|
**Check:** Compare prompt `<quality_gate>` / `<success_criteria>` items against agent's `<quality_gate>` items:
|
||||||
|
- **Duplicate:** Same check appears in both → `warning` (redundant, may diverge)
|
||||||
|
- **Conflict:** Contradictory criteria (e.g., prompt says "max 3 tasks", agent says "max 5 tasks") → `error`
|
||||||
|
- **Missing:** Prompt expects quality checks agent doesn't have → `info`
|
||||||
|
|
||||||
|
**Severity:** `error` for contradictions; `warning` for duplicates; `info` for gaps.
|
||||||
|
|
||||||
|
### Dimension 4: Output Format Conflict
|
||||||
|
|
||||||
|
**Question:** Does the prompt's expected output format conflict with the agent's `<output_contract>`?
|
||||||
|
|
||||||
|
**Check:**
|
||||||
|
- Prompt `<expected_output>` markers vs agent's `<output_contract>` return markers
|
||||||
|
- Prompt expects specific format agent doesn't define
|
||||||
|
- Prompt expects file output but agent's contract only defines markers (or vice versa)
|
||||||
|
- Return marker names differ (prompt expects `## DONE`, agent returns `## TASK COMPLETE`)
|
||||||
|
|
||||||
|
**Severity:** `error` if return markers conflict; `warning` if format expectations unspecified on either side.
|
||||||
|
|
||||||
|
### Dimension 5: Process Override
|
||||||
|
|
||||||
|
**Question:** Does the delegation prompt dictate HOW the agent should work?
|
||||||
|
|
||||||
|
**Check:** Scan prompt for:
|
||||||
|
- Numbered step-by-step instructions ("Step 1:", "First..., Then..., Finally...")
|
||||||
|
- Process flow definitions beyond `<objective>` scope
|
||||||
|
- Tool usage instructions ("Use grep to...", "Run bash command...")
|
||||||
|
- Execution ordering that conflicts with agent's own execution flow
|
||||||
|
|
||||||
|
**Allowed:** `<instructions>` block for revision mode (telling agent what changed, not how to work).
|
||||||
|
|
||||||
|
**Severity:** `error` if prompt overrides agent's process; `warning` if prompt suggests process hints.
|
||||||
|
|
||||||
|
### Dimension 6: Scope Authority Conflict
|
||||||
|
|
||||||
|
**Question:** Does the prompt make decisions that belong to the agent's domain?
|
||||||
|
|
||||||
|
**Check:**
|
||||||
|
- Prompt specifies implementation choices (library selection, architecture patterns) when agent's `<philosophy>` or domain sections own these decisions
|
||||||
|
- Prompt overrides agent's discretion areas
|
||||||
|
- Prompt locks decisions that agent's `<context_fidelity>` says are "Claude's Discretion"
|
||||||
|
|
||||||
|
**Allowed:** Passing through user-locked decisions from CONTEXT.md — this is proper delegation, not authority conflict.
|
||||||
|
|
||||||
|
**Severity:** `error` if prompt makes domain decisions agent should own; `info` if prompt passes through user decisions (correct behavior).
|
||||||
|
|
||||||
|
### Dimension 7: Missing Contracts
|
||||||
|
|
||||||
|
**Question:** Are the delegation handoff points properly defined?
|
||||||
|
|
||||||
|
**Check:**
|
||||||
|
- Agent has `<output_contract>` with return markers → command handles all markers?
|
||||||
|
- Command's return handling covers COMPLETE, BLOCKED, CHECKPOINT
|
||||||
|
- Agent lists "Spawned by:" — does command actually spawn it?
|
||||||
|
- Agent expects `<files_to_read>` — does prompt provide it?
|
||||||
|
- Agent has `<upstream_input>` — does prompt provide matching input structure?
|
||||||
|
|
||||||
|
**Severity:** `error` if return marker handling is missing; `warning` if agent expects input the prompt doesn't provide.
|
||||||
|
|
||||||
|
## 6. Aggregate and Report
|
||||||
|
|
||||||
|
### 6a. Per-pair summary
|
||||||
|
|
||||||
|
For each command-agent pair, aggregate findings:
|
||||||
|
|
||||||
|
```
|
||||||
|
{command_path} → {agent_name}
|
||||||
|
Agent() at line {N}:
|
||||||
|
D1 (Role Re-def): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D2 (Domain Leak): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D3 (Quality Gate): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D4 (Output Format): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D5 (Process Override): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D6 (Scope Authority): {PASS|WARN|ERROR} — {detail}
|
||||||
|
D7 (Missing Contract): {PASS|WARN|ERROR} — {detail}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6b. Overall verdict
|
||||||
|
|
||||||
|
| Verdict | Condition |
|
||||||
|
|---------|-----------|
|
||||||
|
| **CLEAN** | 0 errors, 0-2 warnings |
|
||||||
|
| **REVIEW** | 0 errors, 3+ warnings |
|
||||||
|
| **CONFLICT** | 1+ errors |
|
||||||
|
|
||||||
|
### 6c. Fix recommendations
|
||||||
|
|
||||||
|
For each finding, provide:
|
||||||
|
- **Location:** file:line
|
||||||
|
- **What's wrong:** concrete description
|
||||||
|
- **Fix:** move content to correct owner (command or agent)
|
||||||
|
- **Example:** before/after snippet if applicable
|
||||||
|
|
||||||
|
## 7. Present Results
|
||||||
|
|
||||||
|
```
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
|
DELEGATION-CHECK ► SCAN COMPLETE
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||||
|
|
||||||
|
Scope: {description}
|
||||||
|
Pairs checked: {N} command-agent pairs
|
||||||
|
Findings: {E} errors, {W} warnings, {I} info
|
||||||
|
|
||||||
|
Verdict: {CLEAN | REVIEW | CONFLICT}
|
||||||
|
|
||||||
|
| Pair | D1 | D2 | D3 | D4 | D5 | D6 | D7 |
|
||||||
|
|------|----|----|----|----|----|----|-----|
|
||||||
|
| {cmd} → {agent} | ✅ | ⚠️ | ✅ | ✅ | ❌ | ✅ | ✅ |
|
||||||
|
| ... | | | | | | | |
|
||||||
|
|
||||||
|
{If CONFLICT: detailed findings with fix recommendations}
|
||||||
|
|
||||||
|
───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
## Fix Priority
|
||||||
|
|
||||||
|
1. {Highest severity fix}
|
||||||
|
2. {Next fix}
|
||||||
|
...
|
||||||
|
|
||||||
|
───────────────────────────────────────────────────────
|
||||||
|
```
|
||||||
|
|
||||||
|
</process>
|
||||||
|
|
||||||
|
<success_criteria>
|
||||||
|
- [ ] Scan scope determined and all files discovered
|
||||||
|
- [ ] All Agent() calls extracted from commands with full prompt content
|
||||||
|
- [ ] All corresponding agent definitions located and parsed
|
||||||
|
- [ ] 7 conflict dimensions checked for each command-agent pair
|
||||||
|
- [ ] No false positives on legitimate patterns (mode references, user decision passthrough, `<deep_work_rules>`)
|
||||||
|
- [ ] Fix recommendations provided for every error/warning
|
||||||
|
- [ ] Summary table with per-pair dimension results displayed
|
||||||
|
- [ ] Overall verdict determined (CLEAN/REVIEW/CONFLICT)
|
||||||
|
</success_criteria>
|
||||||
269
.claude/skills/delegation-check/specs/separation-rules.md
Normal file
269
.claude/skills/delegation-check/specs/separation-rules.md
Normal file
@@ -0,0 +1,269 @@
|
|||||||
|
# GSD Content Separation Rules
|
||||||
|
|
||||||
|
Rules for validating the boundary between **command delegation prompts** (Agent() calls) and **agent role definitions** (agent `.md` files). Derived from analysis of GSD's `plan-phase.md`, `execute-phase.md`, `research-phase.md` and their corresponding agents (`gsd-planner`, `gsd-plan-checker`, `gsd-executor`, `gsd-phase-researcher`, `gsd-verifier`).
|
||||||
|
|
||||||
|
## Core Principle
|
||||||
|
|
||||||
|
**Commands own WHEN and WHERE. Agents own WHO and HOW.**
|
||||||
|
|
||||||
|
A delegation prompt tells the agent what to do *this time*. The agent definition tells the agent who it *always* is.
|
||||||
|
|
||||||
|
## Ownership Matrix
|
||||||
|
|
||||||
|
### Command Delegation Prompt Owns
|
||||||
|
|
||||||
|
| Concern | XML Block | Example |
|
||||||
|
|---------|-----------|---------|
|
||||||
|
| What to accomplish | `<objective>` | "Execute plan 3 of phase 2" |
|
||||||
|
| Input file paths | `<files_to_read>` | "- {state_path} (Project State)" |
|
||||||
|
| Runtime parameters | `<additional_context>` | "Phase: 5, Mode: revision" |
|
||||||
|
| Output location | `<output>` | "Write to: {phase_dir}/RESEARCH.md" |
|
||||||
|
| Expected return format | `<expected_output>` | "## VERIFICATION PASSED or ## ISSUES FOUND" |
|
||||||
|
| Who consumes output | `<downstream_consumer>` | "Output consumed by /gsd:execute-phase" |
|
||||||
|
| Revision context | `<instructions>` | "Make targeted updates to address checker issues" |
|
||||||
|
| Cross-cutting policy | `<deep_work_rules>` | Anti-shallow execution rules (applies to all agents) |
|
||||||
|
| Per-invocation quality | `<quality_gate>` (in prompt) | Invocation-specific checks (e.g., "every task has `<read_first>`") |
|
||||||
|
| Flow control | Revision loops, return routing | "If TASK COMPLETE → step 13. If BLOCKED → offer options" |
|
||||||
|
| User interaction | `AskUserQuestion` | "Provide context / Skip / Abort" |
|
||||||
|
| Banners | Status display | "━━━ GSD ► PLANNING PHASE {X} ━━━" |
|
||||||
|
|
||||||
|
### Agent Role Definition Owns
|
||||||
|
|
||||||
|
| Concern | XML Section | Example |
|
||||||
|
|---------|-------------|---------|
|
||||||
|
| Identity | `<role>` | "You are a GSD planner" |
|
||||||
|
| Spawner list | `<role>` → Spawned by | "/gsd:plan-phase orchestrator" |
|
||||||
|
| Responsibilities | `<role>` → Core responsibilities | "Decompose phases into parallel-optimized plans" |
|
||||||
|
| Mandatory read protocol | `<role>` → Mandatory Initial Read | "MUST use Read tool to load every file in `<files_to_read>`" |
|
||||||
|
| Project discovery | `<project_context>` | "Read CLAUDE.md, check .claude/skills/" |
|
||||||
|
| Guiding principles | `<philosophy>` | Quality degradation curve by context usage |
|
||||||
|
| Input interpretation | `<upstream_input>` | "Decisions → LOCKED, Discretion → freedom" |
|
||||||
|
| Decision honoring | `<context_fidelity>` | "Locked decisions are NON-NEGOTIABLE" |
|
||||||
|
| Core insight | `<core_principle>` | "Plan completeness ≠ Goal achievement" |
|
||||||
|
| Domain expertise | Named domain sections | `<verification_dimensions>`, `<task_breakdown>`, `<dependency_graph>` |
|
||||||
|
| Return protocol | `<output_contract>` | TASK COMPLETE / TASK BLOCKED / CHECKPOINT REACHED |
|
||||||
|
| Self-check | `<quality_gate>` (in agent) | Permanent checks for every invocation |
|
||||||
|
| Anti-patterns | `<anti_patterns>` | "DO NOT check code existence" |
|
||||||
|
| Examples | `<examples>` | Scope exceeded analysis example |
|
||||||
|
|
||||||
|
## Conflict Patterns
|
||||||
|
|
||||||
|
### Pattern 1: Role Re-definition
|
||||||
|
|
||||||
|
**Symptom:** Delegation prompt contains identity language.
|
||||||
|
|
||||||
|
```
|
||||||
|
# BAD — prompt redefines role
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-plan-checker",
|
||||||
|
prompt: "You are a code quality expert. Your job is to review plans...
|
||||||
|
<objective>Verify phase 5 plans</objective>"
|
||||||
|
})
|
||||||
|
|
||||||
|
# GOOD — prompt states objective only
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-plan-checker",
|
||||||
|
prompt: "<verification_context>
|
||||||
|
<files_to_read>...</files_to_read>
|
||||||
|
</verification_context>
|
||||||
|
<expected_output>## VERIFICATION PASSED or ## ISSUES FOUND</expected_output>"
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why it's wrong:** The agent's `<role>` section already defines identity. Re-definition in prompt can contradict, confuse, or override the agent's self-understanding.
|
||||||
|
|
||||||
|
**Detection:** Regex for `You are a|Your role is|Your job is to|Your responsibility is|Core responsibilities:` in prompt content.
|
||||||
|
|
||||||
|
### Pattern 2: Domain Expertise Leak
|
||||||
|
|
||||||
|
**Symptom:** Delegation prompt contains decision tables, heuristics, or examples.
|
||||||
|
|
||||||
|
```
|
||||||
|
# BAD — prompt embeds domain knowledge
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "<objective>Create plans for phase 3</objective>
|
||||||
|
Remember: tasks should have 2-3 items max.
|
||||||
|
| TOO VAGUE | JUST RIGHT |
|
||||||
|
| 'Add auth' | 'Add JWT auth with refresh rotation' |"
|
||||||
|
})
|
||||||
|
|
||||||
|
# GOOD — agent's own <task_breakdown> section owns this knowledge
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "<planning_context>
|
||||||
|
<files_to_read>...</files_to_read>
|
||||||
|
</planning_context>"
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why it's wrong:** Domain knowledge in prompts duplicates agent content. When agent evolves, prompt doesn't update — they diverge. Agent's domain sections are the single source of truth.
|
||||||
|
|
||||||
|
**Exception — `<deep_work_rules>`:** GSD uses this as a cross-cutting policy block (not domain expertise per se) that applies anti-shallow-execution rules across all agents. This is acceptable because:
|
||||||
|
1. It's structural policy, not domain knowledge
|
||||||
|
2. It applies uniformly to all planning agents
|
||||||
|
3. It supplements (not duplicates) agent's own quality gate
|
||||||
|
|
||||||
|
**Detection:**
|
||||||
|
- Tables with `|` in prompt content (excluding `<files_to_read>` path tables)
|
||||||
|
- "Good:" / "Bad:" / "Example:" comparison pairs
|
||||||
|
- "Always..." / "Never..." / "Prefer..." heuristic statements
|
||||||
|
- Numbered rules lists (>3 items) that aren't revision instructions
|
||||||
|
|
||||||
|
### Pattern 3: Quality Gate Duplication
|
||||||
|
|
||||||
|
**Symptom:** Same quality check appears in both prompt and agent definition.
|
||||||
|
|
||||||
|
```
|
||||||
|
# PROMPT quality_gate
|
||||||
|
- [ ] Every task has `<read_first>`
|
||||||
|
- [ ] Every task has `<acceptance_criteria>`
|
||||||
|
- [ ] Dependencies correctly identified
|
||||||
|
|
||||||
|
# AGENT quality_gate
|
||||||
|
- [ ] Every task has `<read_first>` with at least the file being modified
|
||||||
|
- [ ] Every task has `<acceptance_criteria>` with grep-verifiable conditions
|
||||||
|
- [ ] Dependencies correctly identified
|
||||||
|
```
|
||||||
|
|
||||||
|
**Analysis:**
|
||||||
|
- "Dependencies correctly identified" → **duplicate** (exact match)
|
||||||
|
- "`<read_first>`" in both → **overlap** (prompt is less specific than agent)
|
||||||
|
- "`<acceptance_criteria>`" → **overlap** (same check, different specificity)
|
||||||
|
|
||||||
|
**When duplication is OK:** Prompt's `<quality_gate>` adds *invocation-specific* checks not in agent's permanent gate (e.g., "Phase requirement IDs all covered" is specific to this phase, not general).
|
||||||
|
|
||||||
|
**Detection:** Fuzzy match quality gate items between prompt and agent (>60% token overlap).
|
||||||
|
|
||||||
|
### Pattern 4: Output Format Conflict
|
||||||
|
|
||||||
|
**Symptom:** Command expects return markers the agent doesn't define.
|
||||||
|
|
||||||
|
```
|
||||||
|
# COMMAND handles:
|
||||||
|
- "## VERIFICATION PASSED" → continue
|
||||||
|
- "## ISSUES FOUND" → revision loop
|
||||||
|
|
||||||
|
# AGENT <output_contract> defines:
|
||||||
|
- "## TASK COMPLETE"
|
||||||
|
- "## TASK BLOCKED"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why it's wrong:** Command routes on markers. If markers don't match, routing breaks silently — command may hang or misinterpret results.
|
||||||
|
|
||||||
|
**Detection:** Extract return marker strings from both sides, compare sets.
|
||||||
|
|
||||||
|
### Pattern 5: Process Override
|
||||||
|
|
||||||
|
**Symptom:** Prompt dictates step-by-step process.
|
||||||
|
|
||||||
|
```
|
||||||
|
# BAD — prompt overrides agent's process
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "Step 1: Read the roadmap. Step 2: Extract requirements.
|
||||||
|
Step 3: Create task breakdown. Step 4: Assign waves..."
|
||||||
|
})
|
||||||
|
|
||||||
|
# GOOD — prompt states objective, agent decides process
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "<objective>Create plans for phase 5</objective>
|
||||||
|
<files_to_read>...</files_to_read>"
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
**Exception — Revision instructions:** `<instructions>` block in revision prompts is acceptable because it tells the agent *what changed* (checker issues), not *how to work*.
|
||||||
|
|
||||||
|
```
|
||||||
|
# OK — revision context, not process override
|
||||||
|
<instructions>
|
||||||
|
Make targeted updates to address checker issues.
|
||||||
|
Do NOT replan from scratch unless issues are fundamental.
|
||||||
|
Return what changed.
|
||||||
|
</instructions>
|
||||||
|
```
|
||||||
|
|
||||||
|
**Detection:** "Step N:" / "First..." / "Then..." / "Finally..." patterns in prompt content outside `<instructions>` blocks.
|
||||||
|
|
||||||
|
### Pattern 6: Scope Authority Conflict
|
||||||
|
|
||||||
|
**Symptom:** Prompt makes domain decisions the agent should own.
|
||||||
|
|
||||||
|
```
|
||||||
|
# BAD — prompt decides implementation details
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "Use React Query for data fetching. Use Zustand for state management.
|
||||||
|
<objective>Plan the frontend architecture</objective>"
|
||||||
|
})
|
||||||
|
|
||||||
|
# GOOD — user decisions passed through from CONTEXT.md
|
||||||
|
Agent({
|
||||||
|
subagent_type: "gsd-planner",
|
||||||
|
prompt: "<planning_context>
|
||||||
|
<files_to_read>
|
||||||
|
- {context_path} (USER DECISIONS - locked: React Query, Zustand)
|
||||||
|
</files_to_read>
|
||||||
|
</planning_context>"
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key distinction:**
|
||||||
|
- **Prompt making decisions** = conflict (command shouldn't have domain opinion)
|
||||||
|
- **Prompt passing through user decisions** = correct (user decisions flow through command to agent)
|
||||||
|
- **Agent interpreting user decisions** = correct (agent's `<context_fidelity>` handles locked/deferred/discretion)
|
||||||
|
|
||||||
|
**Detection:** Technical nouns (library names, architecture patterns) in prompt free text (not inside `<files_to_read>` path descriptions).
|
||||||
|
|
||||||
|
### Pattern 7: Missing Contracts
|
||||||
|
|
||||||
|
**Symptom:** Handoff points between command and agent are incomplete.
|
||||||
|
|
||||||
|
| Missing Element | Impact |
|
||||||
|
|-----------------|--------|
|
||||||
|
| Agent has no `<output_contract>` | Command can't route on return markers |
|
||||||
|
| Command doesn't handle all agent return markers | BLOCKED/CHECKPOINT silently ignored |
|
||||||
|
| Agent expects `<files_to_read>` but prompt doesn't provide it | Agent starts without context |
|
||||||
|
| Agent's "Spawned by:" doesn't list this command | Agent may not expect this invocation pattern |
|
||||||
|
| Agent has `<upstream_input>` but prompt doesn't match structure | Agent misinterprets input |
|
||||||
|
|
||||||
|
**Detection:** Cross-reference both sides for completeness.
|
||||||
|
|
||||||
|
## The `<deep_work_rules>` Exception
|
||||||
|
|
||||||
|
GSD's plan-phase uses `<deep_work_rules>` in delegation prompts. This is a deliberate design choice, not a violation:
|
||||||
|
|
||||||
|
1. **It's cross-cutting policy**: applies to ALL planning agents equally
|
||||||
|
2. **It's structural**: defines required fields (`<read_first>`, `<acceptance_criteria>`, `<action>` concreteness) — not domain expertise
|
||||||
|
3. **It supplements agent quality**: agent's own `<quality_gate>` is self-check; deep_work_rules is command-imposed minimum standard
|
||||||
|
4. **It's invocation-specific context**: different commands might impose different work rules
|
||||||
|
|
||||||
|
**Rule:** `<deep_work_rules>` in a delegation prompt is `info` level, not error. Flag only if its content duplicates agent's domain sections verbatim.
|
||||||
|
|
||||||
|
## Severity Classification
|
||||||
|
|
||||||
|
| Severity | When | Action Required |
|
||||||
|
|----------|------|-----------------|
|
||||||
|
| `error` | Actual conflict: contradictory content between prompt and agent | Must fix — move content to correct owner |
|
||||||
|
| `warning` | Duplication or boundary blur without contradiction | Should fix — consolidate to single source of truth |
|
||||||
|
| `info` | Acceptable pattern that looks like violation but isn't | No action — document why it's OK |
|
||||||
|
|
||||||
|
## Quick Reference: Is This Content in the Right Place?
|
||||||
|
|
||||||
|
| Content | In Prompt? | In Agent? |
|
||||||
|
|---------|-----------|-----------|
|
||||||
|
| "You are a..." | ❌ Never | ✅ Always |
|
||||||
|
| File paths for this invocation | ✅ Yes | ❌ No |
|
||||||
|
| Phase number, mode | ✅ Yes | ❌ No |
|
||||||
|
| Decision tables | ❌ Never | ✅ Always |
|
||||||
|
| Good/bad examples | ❌ Never | ✅ Always |
|
||||||
|
| "Write to: {path}" | ✅ Yes | ❌ No |
|
||||||
|
| Return markers handling | ✅ Yes (routing) | ✅ Yes (definition) |
|
||||||
|
| Quality gate | ✅ Per-invocation | ✅ Permanent self-check |
|
||||||
|
| "MUST read files first" | ❌ Agent's `<role>` owns this | ✅ Always |
|
||||||
|
| Anti-shallow rules | ⚠️ OK as cross-cutting policy | ✅ Preferred |
|
||||||
|
| Revision instructions | ✅ Yes (what changed) | ❌ No |
|
||||||
|
| Heuristics / philosophy | ❌ Never | ✅ Always |
|
||||||
|
| Banner display | ✅ Yes | ❌ Never |
|
||||||
|
| AskUserQuestion | ✅ Yes | ❌ Never |
|
||||||
@@ -165,14 +165,14 @@ Generate a complete command file with:
|
|||||||
3. **`<process>`** — numbered steps (GSD workflow style):
|
3. **`<process>`** — numbered steps (GSD workflow style):
|
||||||
- Step 1: Initialize / parse arguments
|
- Step 1: Initialize / parse arguments
|
||||||
- Steps 2-N: Domain-specific orchestration logic
|
- Steps 2-N: Domain-specific orchestration logic
|
||||||
- Each step: banner display, validation, agent spawning via `Task()`, error handling
|
- Each step: banner display, validation, agent spawning via `Agent()`, error handling
|
||||||
- Final step: status display + `<offer_next>` with next actions
|
- Final step: status display + `<offer_next>` with next actions
|
||||||
4. **`<success_criteria>`** — checkbox list of verifiable conditions
|
4. **`<success_criteria>`** — checkbox list of verifiable conditions
|
||||||
|
|
||||||
**Command writing rules:**
|
**Command writing rules:**
|
||||||
- Steps are **numbered** (`## 1.`, `## 2.`) — follow `plan-phase.md` and `new-project.md` style
|
- Steps are **numbered** (`## 1.`, `## 2.`) — follow `plan-phase.md` and `new-project.md` style
|
||||||
- Use banners for phase transitions: `━━━ SKILL ► ACTION ━━━`
|
- Use banners for phase transitions: `━━━ SKILL ► ACTION ━━━`
|
||||||
- Agent spawning uses `Task(prompt, subagent_type, description)` pattern
|
- Agent spawning uses `Agent({ subagent_type, prompt, description, run_in_background })` pattern
|
||||||
- Prompt to agents uses `<objective>`, `<files_to_read>`, `<output>` blocks
|
- Prompt to agents uses `<objective>`, `<files_to_read>`, `<output>` blocks
|
||||||
- Include `<offer_next>` block with formatted completion status
|
- Include `<offer_next>` block with formatted completion status
|
||||||
- Handle agent return markers: `## TASK COMPLETE`, `## TASK BLOCKED`, `## CHECKPOINT REACHED`
|
- Handle agent return markers: `## TASK COMPLETE`, `## TASK BLOCKED`, `## CHECKPOINT REACHED`
|
||||||
@@ -286,7 +286,7 @@ Set `$TARGET_PATH = $SOURCE_PATH` (in-place conversion) unless user specifies ou
|
|||||||
| `<process>` with numbered steps | At least 3 `## N.` headers |
|
| `<process>` with numbered steps | At least 3 `## N.` headers |
|
||||||
| Step 1 is initialization | Parses args or loads context |
|
| Step 1 is initialization | Parses args or loads context |
|
||||||
| Last step is status/report | Displays results or routes to `<offer_next>` |
|
| Last step is status/report | Displays results or routes to `<offer_next>` |
|
||||||
| Agent spawning (if complex) | `Task(` call with `subagent_type` |
|
| Agent spawning (if complex) | `Agent({` call with `subagent_type` |
|
||||||
| Agent prompt structure | `<files_to_read>` + `<objective>` or `<output>` blocks |
|
| Agent prompt structure | `<files_to_read>` + `<objective>` or `<output>` blocks |
|
||||||
| Return handling | Routes on `## TASK COMPLETE` / `## TASK BLOCKED` markers |
|
| Return handling | Routes on `## TASK COMPLETE` / `## TASK BLOCKED` markers |
|
||||||
| `<offer_next>` | Banner + summary + next command suggestion |
|
| `<offer_next>` | Banner + summary + next command suggestion |
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ Guidelines for Claude Code **agent definition files** (role + domain expertise).
|
|||||||
|
|
||||||
## Content Separation Principle
|
## Content Separation Principle
|
||||||
|
|
||||||
Agents are spawned by commands via `Task()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
|
Agents are spawned by commands via `Agent()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
|
||||||
|
|
||||||
| Concern | Belongs in Agent | Belongs in Command |
|
| Concern | Belongs in Agent | Belongs in Command |
|
||||||
|---------|-----------------|-------------------|
|
|---------|-----------------|-------------------|
|
||||||
|
|||||||
@@ -153,15 +153,15 @@ Display banners before major phase transitions (agent spawning, user decisions,
|
|||||||
|
|
||||||
## Agent Spawning Pattern
|
## Agent Spawning Pattern
|
||||||
|
|
||||||
Commands spawn agents via `Task()` with structured prompts:
|
Commands spawn agents via `Agent()` with structured prompts:
|
||||||
|
|
||||||
```markdown
|
```javascript
|
||||||
Task(
|
Agent({
|
||||||
prompt=filled_prompt,
|
subagent_type: "agent-name",
|
||||||
subagent_type="agent-name",
|
prompt: filled_prompt,
|
||||||
model="{model}",
|
description: "Verb Phase {X}",
|
||||||
description="Verb Phase {X}"
|
run_in_background: false
|
||||||
)
|
})
|
||||||
```
|
```
|
||||||
|
|
||||||
### Prompt Structure for Agents
|
### Prompt Structure for Agents
|
||||||
|
|||||||
@@ -49,7 +49,7 @@ Conversion Summary:
|
|||||||
| `## Auto Mode` / `## Auto Mode Defaults` | `<auto_mode>` section |
|
| `## Auto Mode` / `## Auto Mode Defaults` | `<auto_mode>` section |
|
||||||
| `## Quick Reference` | Preserve as-is within appropriate section |
|
| `## Quick Reference` | Preserve as-is within appropriate section |
|
||||||
| Inline `AskUserQuestion` calls | Preserve verbatim — these belong in commands |
|
| Inline `AskUserQuestion` calls | Preserve verbatim — these belong in commands |
|
||||||
| `Task()` / agent spawning calls | Preserve verbatim within process steps |
|
| `Agent()` / agent spawning calls | Preserve verbatim within process steps |
|
||||||
| Banner displays (`━━━`) | Preserve verbatim |
|
| Banner displays (`━━━`) | Preserve verbatim |
|
||||||
| Code blocks (```bash, ```javascript, etc.) | **Preserve exactly** — never modify code content |
|
| Code blocks (```bash, ```javascript, etc.) | **Preserve exactly** — never modify code content |
|
||||||
| Tables | **Preserve exactly** — never reformat table content |
|
| Tables | **Preserve exactly** — never reformat table content |
|
||||||
|
|||||||
@@ -49,12 +49,13 @@ allowed-tools: {tools} # omit if unrestricted
|
|||||||
|
|
||||||
{Construct prompt with <files_to_read>, <objective>, <output> blocks.}
|
{Construct prompt with <files_to_read>, <objective>, <output> blocks.}
|
||||||
|
|
||||||
```
|
```javascript
|
||||||
Task(
|
Agent({
|
||||||
prompt=filled_prompt,
|
subagent_type: "{agent-name}",
|
||||||
subagent_type="{agent-name}",
|
prompt: filled_prompt,
|
||||||
description="{Verb} {target}"
|
description: "{Verb} {target}",
|
||||||
)
|
run_in_background: false
|
||||||
|
})
|
||||||
```
|
```
|
||||||
|
|
||||||
## 4. Handle Result
|
## 4. Handle Result
|
||||||
|
|||||||
@@ -8,9 +8,11 @@ import {
|
|||||||
checkVenvStatus,
|
checkVenvStatus,
|
||||||
executeCodexLens,
|
executeCodexLens,
|
||||||
getVenvPythonPath,
|
getVenvPythonPath,
|
||||||
|
useCodexLensV2,
|
||||||
} from '../../../tools/codex-lens.js';
|
} from '../../../tools/codex-lens.js';
|
||||||
import type { RouteContext } from '../types.js';
|
import type { RouteContext } from '../types.js';
|
||||||
import { extractJSON, stripAnsiCodes } from './utils.js';
|
import { extractJSON, stripAnsiCodes } from './utils.js';
|
||||||
|
import type { ChildProcess } from 'child_process';
|
||||||
|
|
||||||
// File watcher state (persisted across requests)
|
// File watcher state (persisted across requests)
|
||||||
let watcherProcess: any = null;
|
let watcherProcess: any = null;
|
||||||
@@ -43,6 +45,29 @@ export async function stopWatcherForUninstall(): Promise<void> {
|
|||||||
watcherProcess = null;
|
watcherProcess = null;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Spawn v2 bridge watcher subprocess.
|
||||||
|
* Runs 'codexlens-search watch --root X --debounce-ms Y' and reads JSONL stdout.
|
||||||
|
* @param root - Root directory to watch
|
||||||
|
* @param debounceMs - Debounce interval in milliseconds
|
||||||
|
* @returns Spawned child process
|
||||||
|
*/
|
||||||
|
function spawnV2Watcher(root: string, debounceMs: number): ChildProcess {
|
||||||
|
const { spawn } = require('child_process') as typeof import('child_process');
|
||||||
|
return spawn('codexlens-search', [
|
||||||
|
'watch',
|
||||||
|
'--root', root,
|
||||||
|
'--debounce-ms', String(debounceMs),
|
||||||
|
'--db-path', require('path').join(root, '.codexlens'),
|
||||||
|
], {
|
||||||
|
cwd: root,
|
||||||
|
shell: false,
|
||||||
|
stdio: ['ignore', 'pipe', 'pipe'],
|
||||||
|
windowsHide: true,
|
||||||
|
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Handle CodexLens watcher routes
|
* Handle CodexLens watcher routes
|
||||||
* @returns true if route was handled, false otherwise
|
* @returns true if route was handled, false otherwise
|
||||||
@@ -91,46 +116,52 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
|
|||||||
return { success: false, error: `Path is not a directory: ${targetPath}`, status: 400 };
|
return { success: false, error: `Path is not a directory: ${targetPath}`, status: 400 };
|
||||||
}
|
}
|
||||||
|
|
||||||
// Get the codexlens CLI path
|
// Route to v2 or v1 watcher based on feature flag
|
||||||
const venvStatus = await checkVenvStatus();
|
if (useCodexLensV2()) {
|
||||||
if (!venvStatus.ready) {
|
// v2 bridge watcher: codexlens-search watch
|
||||||
return { success: false, error: 'CodexLens not installed', status: 400 };
|
console.log('[CodexLens] Using v2 bridge watcher');
|
||||||
}
|
watcherProcess = spawnV2Watcher(targetPath, debounceMs);
|
||||||
|
} else {
|
||||||
// Verify directory is indexed before starting watcher
|
// v1 watcher: python -m codexlens watch
|
||||||
try {
|
const venvStatus = await checkVenvStatus();
|
||||||
const statusResult = await executeCodexLens(['projects', 'list', '--json']);
|
if (!venvStatus.ready) {
|
||||||
if (statusResult.success && statusResult.output) {
|
return { success: false, error: 'CodexLens not installed', status: 400 };
|
||||||
const parsed = extractJSON(statusResult.output);
|
|
||||||
const projects = parsed.result || parsed || [];
|
|
||||||
const normalizedTarget = targetPath.toLowerCase().replace(/\\/g, '/');
|
|
||||||
const isIndexed = Array.isArray(projects) && projects.some((p: { source_root?: string }) =>
|
|
||||||
p.source_root && p.source_root.toLowerCase().replace(/\\/g, '/') === normalizedTarget
|
|
||||||
);
|
|
||||||
if (!isIndexed) {
|
|
||||||
return {
|
|
||||||
success: false,
|
|
||||||
error: `Directory is not indexed: ${targetPath}. Run 'codexlens init' first.`,
|
|
||||||
status: 400
|
|
||||||
};
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
} catch (err) {
|
|
||||||
console.warn('[CodexLens] Could not verify index status:', err);
|
|
||||||
// Continue anyway - watcher will fail with proper error if not indexed
|
|
||||||
}
|
|
||||||
|
|
||||||
// Spawn watch process using Python (no shell: true for security)
|
// Verify directory is indexed before starting watcher
|
||||||
// CodexLens is a Python package, must run via python -m codexlens
|
try {
|
||||||
const pythonPath = getVenvPythonPath();
|
const statusResult = await executeCodexLens(['projects', 'list', '--json']);
|
||||||
const args = ['-m', 'codexlens', 'watch', targetPath, '--debounce', String(debounceMs)];
|
if (statusResult.success && statusResult.output) {
|
||||||
watcherProcess = spawn(pythonPath, args, {
|
const parsed = extractJSON(statusResult.output);
|
||||||
cwd: targetPath,
|
const projects = parsed.result || parsed || [];
|
||||||
shell: false,
|
const normalizedTarget = targetPath.toLowerCase().replace(/\\/g, '/');
|
||||||
stdio: ['ignore', 'pipe', 'pipe'],
|
const isIndexed = Array.isArray(projects) && projects.some((p: { source_root?: string }) =>
|
||||||
windowsHide: true,
|
p.source_root && p.source_root.toLowerCase().replace(/\\/g, '/') === normalizedTarget
|
||||||
env: { ...process.env, PYTHONIOENCODING: 'utf-8' }
|
);
|
||||||
});
|
if (!isIndexed) {
|
||||||
|
return {
|
||||||
|
success: false,
|
||||||
|
error: `Directory is not indexed: ${targetPath}. Run 'codexlens init' first.`,
|
||||||
|
status: 400
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
console.warn('[CodexLens] Could not verify index status:', err);
|
||||||
|
// Continue anyway - watcher will fail with proper error if not indexed
|
||||||
|
}
|
||||||
|
|
||||||
|
// Spawn watch process using Python (no shell: true for security)
|
||||||
|
const pythonPath = getVenvPythonPath();
|
||||||
|
const args = ['-m', 'codexlens', 'watch', targetPath, '--debounce', String(debounceMs)];
|
||||||
|
watcherProcess = spawn(pythonPath, args, {
|
||||||
|
cwd: targetPath,
|
||||||
|
shell: false,
|
||||||
|
stdio: ['ignore', 'pipe', 'pipe'],
|
||||||
|
windowsHide: true,
|
||||||
|
env: { ...process.env, PYTHONIOENCODING: 'utf-8' }
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
watcherStats = {
|
watcherStats = {
|
||||||
running: true,
|
running: true,
|
||||||
@@ -153,13 +184,37 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Handle process output for event counting
|
// Handle process output for event counting
|
||||||
|
const isV2Watcher = useCodexLensV2();
|
||||||
|
let stdoutLineBuffer = '';
|
||||||
if (watcherProcess.stdout) {
|
if (watcherProcess.stdout) {
|
||||||
watcherProcess.stdout.on('data', (data: Buffer) => {
|
watcherProcess.stdout.on('data', (data: Buffer) => {
|
||||||
const output = data.toString();
|
const output = data.toString();
|
||||||
// Count processed events from output
|
|
||||||
const matches = output.match(/Processed \d+ events?/g);
|
if (isV2Watcher) {
|
||||||
if (matches) {
|
// v2 bridge outputs JSONL - parse line by line
|
||||||
watcherStats.events_processed += matches.length;
|
stdoutLineBuffer += output;
|
||||||
|
const lines = stdoutLineBuffer.split('\n');
|
||||||
|
// Keep incomplete last line in buffer
|
||||||
|
stdoutLineBuffer = lines.pop() || '';
|
||||||
|
for (const line of lines) {
|
||||||
|
const trimmed = line.trim();
|
||||||
|
if (!trimmed) continue;
|
||||||
|
try {
|
||||||
|
const event = JSON.parse(trimmed);
|
||||||
|
// Count file change events (created, modified, deleted, moved)
|
||||||
|
if (event.event && event.event !== 'watching') {
|
||||||
|
watcherStats.events_processed += 1;
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
// Not valid JSON, skip
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// v1 watcher: count text-based event messages
|
||||||
|
const matches = output.match(/Processed \d+ events?/g);
|
||||||
|
if (matches) {
|
||||||
|
watcherStats.events_processed += matches.length;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1067,6 +1067,103 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Check if codexlens-search (v2) bridge CLI is installed and functional.
|
||||||
|
* Runs 'codexlens-search status' and checks exit code.
|
||||||
|
* @returns true if the v2 bridge CLI is available
|
||||||
|
*/
|
||||||
|
function isCodexLensV2Installed(): boolean {
|
||||||
|
try {
|
||||||
|
const result = spawnSync('codexlens-search', ['status', '--db-path', '.codexlens'], {
|
||||||
|
encoding: 'utf-8',
|
||||||
|
timeout: EXEC_TIMEOUTS.PYTHON_VERSION,
|
||||||
|
stdio: ['pipe', 'pipe', 'pipe'],
|
||||||
|
windowsHide: true,
|
||||||
|
});
|
||||||
|
// Exit code 0 or valid JSON output means it's installed
|
||||||
|
return result.status === 0 || (result.stdout != null && result.stdout.includes('"status"'));
|
||||||
|
} catch {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Bootstrap codexlens-search (v2) package using UV.
|
||||||
|
* Installs 'codexlens-search[semantic]' into the shared CodexLens venv.
|
||||||
|
* @returns Bootstrap result
|
||||||
|
*/
|
||||||
|
async function bootstrapV2WithUv(): Promise<BootstrapResult> {
|
||||||
|
console.log('[CodexLens] Bootstrapping codexlens-search (v2) with UV...');
|
||||||
|
|
||||||
|
const preFlightError = preFlightCheck();
|
||||||
|
if (preFlightError) {
|
||||||
|
return { success: false, error: `Pre-flight failed: ${preFlightError}` };
|
||||||
|
}
|
||||||
|
|
||||||
|
repairVenvIfCorrupted();
|
||||||
|
|
||||||
|
const uvInstalled = await ensureUvInstalled();
|
||||||
|
if (!uvInstalled) {
|
||||||
|
return { success: false, error: 'Failed to install UV package manager' };
|
||||||
|
}
|
||||||
|
|
||||||
|
const uv = createCodexLensUvManager();
|
||||||
|
|
||||||
|
if (!uv.isVenvValid()) {
|
||||||
|
console.log('[CodexLens] Creating virtual environment with UV for v2...');
|
||||||
|
const createResult = await uv.createVenv();
|
||||||
|
if (!createResult.success) {
|
||||||
|
return { success: false, error: `Failed to create venv: ${createResult.error}` };
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Find local codexlens-search package using unified discovery
|
||||||
|
const { findCodexLensSearchPath } = await import('../utils/package-discovery.js');
|
||||||
|
const discovery = findCodexLensSearchPath();
|
||||||
|
|
||||||
|
const extras = ['semantic'];
|
||||||
|
const editable = isDevEnvironment() && !discovery.insideNodeModules;
|
||||||
|
|
||||||
|
if (!discovery.path) {
|
||||||
|
// Fallback: try installing from PyPI
|
||||||
|
console.log('[CodexLens] Local codexlens-search not found, trying PyPI install...');
|
||||||
|
const pipResult = await uv.install(['codexlens-search[semantic]']);
|
||||||
|
if (!pipResult.success) {
|
||||||
|
return {
|
||||||
|
success: false,
|
||||||
|
error: `Failed to install codexlens-search from PyPI: ${pipResult.error}`,
|
||||||
|
diagnostics: { venvPath: getCodexLensVenvDir(), installer: 'uv' },
|
||||||
|
};
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
console.log(`[CodexLens] Installing codexlens-search from local path with UV: ${discovery.path} (editable: ${editable})`);
|
||||||
|
const installResult = await uv.installFromProject(discovery.path, extras, editable);
|
||||||
|
if (!installResult.success) {
|
||||||
|
return {
|
||||||
|
success: false,
|
||||||
|
error: `Failed to install codexlens-search: ${installResult.error}`,
|
||||||
|
diagnostics: { packagePath: discovery.path, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
clearVenvStatusCache();
|
||||||
|
console.log('[CodexLens] codexlens-search (v2) bootstrap complete');
|
||||||
|
return {
|
||||||
|
success: true,
|
||||||
|
message: 'Installed codexlens-search (v2) with UV',
|
||||||
|
diagnostics: { packagePath: discovery.path ?? undefined, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Check if v2 bridge should be used based on CCW_USE_CODEXLENS_V2 env var.
|
||||||
|
*/
|
||||||
|
function useCodexLensV2(): boolean {
|
||||||
|
const flag = process.env.CCW_USE_CODEXLENS_V2;
|
||||||
|
return flag === '1' || flag === 'true';
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Bootstrap CodexLens venv with required packages
|
* Bootstrap CodexLens venv with required packages
|
||||||
* @returns Bootstrap result
|
* @returns Bootstrap result
|
||||||
@@ -1074,6 +1171,23 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
|
|||||||
async function bootstrapVenv(): Promise<BootstrapResult> {
|
async function bootstrapVenv(): Promise<BootstrapResult> {
|
||||||
const warnings: string[] = [];
|
const warnings: string[] = [];
|
||||||
|
|
||||||
|
// If v2 flag is set, also bootstrap codexlens-search alongside v1
|
||||||
|
if (useCodexLensV2() && await isUvAvailable()) {
|
||||||
|
try {
|
||||||
|
const v2Result = await bootstrapV2WithUv();
|
||||||
|
if (v2Result.success) {
|
||||||
|
console.log('[CodexLens] codexlens-search (v2) installed successfully');
|
||||||
|
} else {
|
||||||
|
console.warn(`[CodexLens] codexlens-search (v2) bootstrap failed: ${v2Result.error}`);
|
||||||
|
warnings.push(`v2 bootstrap failed: ${v2Result.error || 'Unknown error'}`);
|
||||||
|
}
|
||||||
|
} catch (v2Err) {
|
||||||
|
const msg = v2Err instanceof Error ? v2Err.message : String(v2Err);
|
||||||
|
console.warn(`[CodexLens] codexlens-search (v2) bootstrap error: ${msg}`);
|
||||||
|
warnings.push(`v2 bootstrap error: ${msg}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Prefer UV if available (faster package resolution and installation)
|
// Prefer UV if available (faster package resolution and installation)
|
||||||
if (await isUvAvailable()) {
|
if (await isUvAvailable()) {
|
||||||
console.log('[CodexLens] Using UV for bootstrap...');
|
console.log('[CodexLens] Using UV for bootstrap...');
|
||||||
@@ -2502,6 +2616,10 @@ export {
|
|||||||
// UV-based installation functions
|
// UV-based installation functions
|
||||||
bootstrapWithUv,
|
bootstrapWithUv,
|
||||||
installSemanticWithUv,
|
installSemanticWithUv,
|
||||||
|
// v2 bridge support
|
||||||
|
useCodexLensV2,
|
||||||
|
isCodexLensV2Installed,
|
||||||
|
bootstrapV2WithUv,
|
||||||
};
|
};
|
||||||
|
|
||||||
// Export Python path for direct spawn usage (e.g., watcher)
|
// Export Python path for direct spawn usage (e.g., watcher)
|
||||||
|
|||||||
@@ -29,7 +29,9 @@ import {
|
|||||||
ensureLiteLLMEmbedderReady,
|
ensureLiteLLMEmbedderReady,
|
||||||
executeCodexLens,
|
executeCodexLens,
|
||||||
getVenvPythonPath,
|
getVenvPythonPath,
|
||||||
|
useCodexLensV2,
|
||||||
} from './codex-lens.js';
|
} from './codex-lens.js';
|
||||||
|
import { execFile } from 'child_process';
|
||||||
import type { ProgressInfo } from './codex-lens.js';
|
import type { ProgressInfo } from './codex-lens.js';
|
||||||
import { getProjectRoot } from '../utils/path-validator.js';
|
import { getProjectRoot } from '../utils/path-validator.js';
|
||||||
import { getCodexLensDataDir } from '../utils/codexlens-path.js';
|
import { getCodexLensDataDir } from '../utils/codexlens-path.js';
|
||||||
@@ -2774,6 +2776,90 @@ async function executeRipgrepMode(params: Params): Promise<SearchResult> {
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ========================================
|
||||||
|
// codexlens-search v2 bridge integration
|
||||||
|
// ========================================
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Execute search via codexlens-search (v2) bridge CLI.
|
||||||
|
* Spawns 'codexlens-search search --query X --top-k Y --db-path Z' and parses JSON output.
|
||||||
|
*
|
||||||
|
* @param query - Search query string
|
||||||
|
* @param topK - Number of results to return
|
||||||
|
* @param dbPath - Path to the v2 index database directory
|
||||||
|
* @returns Parsed search results as SemanticMatch array
|
||||||
|
*/
|
||||||
|
async function executeCodexLensV2Bridge(
|
||||||
|
query: string,
|
||||||
|
topK: number,
|
||||||
|
dbPath: string,
|
||||||
|
): Promise<SearchResult> {
|
||||||
|
return new Promise((resolve) => {
|
||||||
|
const args = [
|
||||||
|
'search',
|
||||||
|
'--query', query,
|
||||||
|
'--top-k', String(topK),
|
||||||
|
'--db-path', dbPath,
|
||||||
|
];
|
||||||
|
|
||||||
|
execFile('codexlens-search', args, {
|
||||||
|
encoding: 'utf-8',
|
||||||
|
timeout: EXEC_TIMEOUTS.PROCESS_SPAWN,
|
||||||
|
windowsHide: true,
|
||||||
|
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
|
||||||
|
}, (error, stdout, stderr) => {
|
||||||
|
if (error) {
|
||||||
|
console.warn(`[CodexLens-v2] Bridge search failed: ${error.message}`);
|
||||||
|
resolve({
|
||||||
|
success: false,
|
||||||
|
error: `codexlens-search v2 bridge failed: ${error.message}`,
|
||||||
|
});
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
const parsed = JSON.parse(stdout.trim());
|
||||||
|
|
||||||
|
// Bridge outputs {"error": string} on failure
|
||||||
|
if (parsed && typeof parsed === 'object' && 'error' in parsed) {
|
||||||
|
resolve({
|
||||||
|
success: false,
|
||||||
|
error: `codexlens-search v2: ${parsed.error}`,
|
||||||
|
});
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Bridge outputs array of {path, score, snippet}
|
||||||
|
const results: SemanticMatch[] = (Array.isArray(parsed) ? parsed : []).map((r: { path?: string; score?: number; snippet?: string }) => ({
|
||||||
|
file: r.path || '',
|
||||||
|
score: r.score || 0,
|
||||||
|
content: r.snippet || '',
|
||||||
|
symbol: null,
|
||||||
|
}));
|
||||||
|
|
||||||
|
resolve({
|
||||||
|
success: true,
|
||||||
|
results,
|
||||||
|
metadata: {
|
||||||
|
mode: 'semantic' as any,
|
||||||
|
backend: 'codexlens-v2',
|
||||||
|
count: results.length,
|
||||||
|
query,
|
||||||
|
note: 'Using codexlens-search v2 bridge (2-stage vector + reranking)',
|
||||||
|
},
|
||||||
|
});
|
||||||
|
} catch (parseErr) {
|
||||||
|
console.warn(`[CodexLens-v2] Failed to parse bridge output: ${(parseErr as Error).message}`);
|
||||||
|
resolve({
|
||||||
|
success: false,
|
||||||
|
error: `Failed to parse codexlens-search v2 output: ${(parseErr as Error).message}`,
|
||||||
|
output: stdout,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Mode: exact - CodexLens exact/FTS search
|
* Mode: exact - CodexLens exact/FTS search
|
||||||
* Requires index
|
* Requires index
|
||||||
@@ -4276,7 +4362,21 @@ export async function handler(params: Record<string, unknown>): Promise<ToolResu
|
|||||||
|
|
||||||
case 'search':
|
case 'search':
|
||||||
default:
|
default:
|
||||||
// Handle search modes: fuzzy | semantic
|
// v2 bridge: if CCW_USE_CODEXLENS_V2 is set, try v2 bridge first for semantic mode
|
||||||
|
if (useCodexLensV2() && (mode === 'semantic' || mode === 'fuzzy')) {
|
||||||
|
const scope = resolveSearchScope(parsed.data.path ?? '.');
|
||||||
|
const dbPath = join(scope.workingDirectory, '.codexlens');
|
||||||
|
const topK = (parsed.data.maxResults || 5) + (parsed.data.extraFilesCount || 10);
|
||||||
|
const v2Result = await executeCodexLensV2Bridge(parsed.data.query || '', topK, dbPath);
|
||||||
|
if (v2Result.success) {
|
||||||
|
result = v2Result;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
// v2 failed, fall through to v1
|
||||||
|
console.warn(`[CodexLens-v2] Falling back to v1: ${v2Result.error}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Handle search modes: fuzzy | semantic (v1 path)
|
||||||
switch (mode) {
|
switch (mode) {
|
||||||
case 'fuzzy':
|
case 'fuzzy':
|
||||||
result = await executeFuzzyMode(parsed.data);
|
result = await executeFuzzyMode(parsed.data);
|
||||||
|
|||||||
@@ -55,18 +55,20 @@ export interface PackageDiscoveryResult {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/** Known local package names */
|
/** Known local package names */
|
||||||
export type LocalPackageName = 'codex-lens' | 'ccw-litellm';
|
export type LocalPackageName = 'codex-lens' | 'ccw-litellm' | 'codexlens-search';
|
||||||
|
|
||||||
/** Environment variable mapping for each package */
|
/** Environment variable mapping for each package */
|
||||||
const PACKAGE_ENV_VARS: Record<LocalPackageName, string> = {
|
const PACKAGE_ENV_VARS: Record<LocalPackageName, string> = {
|
||||||
'codex-lens': 'CODEXLENS_PACKAGE_PATH',
|
'codex-lens': 'CODEXLENS_PACKAGE_PATH',
|
||||||
'ccw-litellm': 'CCW_LITELLM_PATH',
|
'ccw-litellm': 'CCW_LITELLM_PATH',
|
||||||
|
'codexlens-search': 'CODEXLENS_SEARCH_PATH',
|
||||||
};
|
};
|
||||||
|
|
||||||
/** Config key mapping for each package */
|
/** Config key mapping for each package */
|
||||||
const PACKAGE_CONFIG_KEYS: Record<LocalPackageName, string> = {
|
const PACKAGE_CONFIG_KEYS: Record<LocalPackageName, string> = {
|
||||||
'codex-lens': 'codexLensPath',
|
'codex-lens': 'codexLensPath',
|
||||||
'ccw-litellm': 'ccwLitellmPath',
|
'ccw-litellm': 'ccwLitellmPath',
|
||||||
|
'codexlens-search': 'codexlensSearchPath',
|
||||||
};
|
};
|
||||||
|
|
||||||
// ========================================
|
// ========================================
|
||||||
@@ -296,6 +298,13 @@ export function findCcwLitellmPath(): PackageDiscoveryResult {
|
|||||||
return findPackagePath('ccw-litellm');
|
return findPackagePath('ccw-litellm');
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Find codexlens-search (v2) package path (convenience wrapper)
|
||||||
|
*/
|
||||||
|
export function findCodexLensSearchPath(): PackageDiscoveryResult {
|
||||||
|
return findPackagePath('codexlens-search');
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Format search results for error messages
|
* Format search results for error messages
|
||||||
*/
|
*/
|
||||||
|
|||||||
21
codex-lens-v2/LICENSE
Normal file
21
codex-lens-v2/LICENSE
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
MIT License
|
||||||
|
|
||||||
|
Copyright (c) 2026 codexlens-search contributors
|
||||||
|
|
||||||
|
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||||
|
of this software and associated documentation files (the "Software"), to deal
|
||||||
|
in the Software without restriction, including without limitation the rights
|
||||||
|
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||||
|
copies of the Software, and to permit persons to whom the Software is
|
||||||
|
furnished to do so, subject to the following conditions:
|
||||||
|
|
||||||
|
The above copyright notice and this permission notice shall be included in all
|
||||||
|
copies or substantial portions of the Software.
|
||||||
|
|
||||||
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||||
|
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||||
|
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||||
|
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||||
|
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||||
|
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||||
|
SOFTWARE.
|
||||||
146
codex-lens-v2/README.md
Normal file
146
codex-lens-v2/README.md
Normal file
@@ -0,0 +1,146 @@
|
|||||||
|
# codexlens-search
|
||||||
|
|
||||||
|
Lightweight semantic code search engine with 2-stage vector search, full-text search, and Reciprocal Rank Fusion.
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
codexlens-search provides fast, accurate code search through a multi-stage retrieval pipeline:
|
||||||
|
|
||||||
|
1. **Binary coarse search** - Hamming-distance filtering narrows candidates quickly
|
||||||
|
2. **ANN fine search** - HNSW or FAISS refines the candidate set with float vectors
|
||||||
|
3. **Full-text search** - SQLite FTS5 handles exact and fuzzy keyword matching
|
||||||
|
4. **RRF fusion** - Reciprocal Rank Fusion merges vector and text results
|
||||||
|
5. **Reranking** - Optional cross-encoder or API-based reranker for final ordering
|
||||||
|
|
||||||
|
The core library has **zero required dependencies**. Install optional extras to enable semantic search, GPU acceleration, or FAISS backends.
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Core only (FTS search, no vector search)
|
||||||
|
pip install codexlens-search
|
||||||
|
|
||||||
|
# With semantic search (recommended)
|
||||||
|
pip install codexlens-search[semantic]
|
||||||
|
|
||||||
|
# Semantic search + GPU acceleration
|
||||||
|
pip install codexlens-search[semantic-gpu]
|
||||||
|
|
||||||
|
# With FAISS backend (CPU)
|
||||||
|
pip install codexlens-search[faiss-cpu]
|
||||||
|
|
||||||
|
# With API-based reranker
|
||||||
|
pip install codexlens-search[reranker-api]
|
||||||
|
|
||||||
|
# Everything (semantic + GPU + FAISS + reranker)
|
||||||
|
pip install codexlens-search[semantic-gpu,faiss-gpu,reranker-api]
|
||||||
|
```
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
```python
|
||||||
|
from codexlens_search import Config, IndexingPipeline, SearchPipeline
|
||||||
|
from codexlens_search.core import create_ann_index, create_binary_index
|
||||||
|
from codexlens_search.embed.local import FastEmbedEmbedder
|
||||||
|
from codexlens_search.rerank.local import LocalReranker
|
||||||
|
from codexlens_search.search.fts import FTSEngine
|
||||||
|
|
||||||
|
# 1. Configure
|
||||||
|
config = Config(embed_model="BAAI/bge-small-en-v1.5", embed_dim=384)
|
||||||
|
|
||||||
|
# 2. Create components
|
||||||
|
embedder = FastEmbedEmbedder(config)
|
||||||
|
binary_store = create_binary_index(config, db_path="index/binary.db")
|
||||||
|
ann_index = create_ann_index(config, index_path="index/ann.bin")
|
||||||
|
fts = FTSEngine("index/fts.db")
|
||||||
|
reranker = LocalReranker()
|
||||||
|
|
||||||
|
# 3. Index files
|
||||||
|
indexer = IndexingPipeline(embedder, binary_store, ann_index, fts, config)
|
||||||
|
stats = indexer.index_directory("./src")
|
||||||
|
print(f"Indexed {stats.files_processed} files, {stats.chunks_created} chunks")
|
||||||
|
|
||||||
|
# 4. Search
|
||||||
|
pipeline = SearchPipeline(embedder, binary_store, ann_index, reranker, fts, config)
|
||||||
|
results = pipeline.search("authentication handler", top_k=10)
|
||||||
|
for r in results:
|
||||||
|
print(f" {r.path} (score={r.score:.3f})")
|
||||||
|
```
|
||||||
|
|
||||||
|
## Extras
|
||||||
|
|
||||||
|
| Extra | Dependencies | Description |
|
||||||
|
|-------|-------------|-------------|
|
||||||
|
| `semantic` | hnswlib, numpy, fastembed | Vector search with local embeddings |
|
||||||
|
| `gpu` | onnxruntime-gpu | GPU-accelerated embedding inference |
|
||||||
|
| `semantic-gpu` | semantic + gpu combined | Vector search with GPU acceleration |
|
||||||
|
| `faiss-cpu` | faiss-cpu | FAISS ANN backend (CPU) |
|
||||||
|
| `faiss-gpu` | faiss-gpu | FAISS ANN backend (GPU) |
|
||||||
|
| `reranker-api` | httpx | Remote reranker API client |
|
||||||
|
| `dev` | pytest, pytest-cov | Development and testing |
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
Query
|
||||||
|
|
|
||||||
|
v
|
||||||
|
[Embedder] --> query vector
|
||||||
|
|
|
||||||
|
+---> [BinaryStore.coarse_search] --> candidate IDs (Hamming distance)
|
||||||
|
| |
|
||||||
|
| v
|
||||||
|
+---> [ANNIndex.fine_search] ------> ranked IDs (cosine/L2)
|
||||||
|
| |
|
||||||
|
| v (intersect)
|
||||||
|
| vector_results
|
||||||
|
|
|
||||||
|
+---> [FTSEngine.exact_search] ----> exact text matches
|
||||||
|
+---> [FTSEngine.fuzzy_search] ----> fuzzy text matches
|
||||||
|
|
|
||||||
|
v
|
||||||
|
[RRF Fusion] --> merged ranking (adaptive weights by query intent)
|
||||||
|
|
|
||||||
|
v
|
||||||
|
[Reranker] --> final top-k results
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key Design Decisions
|
||||||
|
|
||||||
|
- **2-stage vector search**: Binary coarse search (fast Hamming distance on binarized vectors) filters candidates before the more expensive ANN search. This keeps memory usage low and search fast even on large corpora.
|
||||||
|
- **Parallel retrieval**: Vector search and FTS run concurrently via ThreadPoolExecutor.
|
||||||
|
- **Adaptive fusion weights**: Query intent detection adjusts RRF weights between vector and text signals.
|
||||||
|
- **Backend abstraction**: ANN index supports both hnswlib and FAISS backends via a factory function.
|
||||||
|
- **Zero core dependencies**: The base package requires only Python 3.10+. All heavy dependencies are optional.
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
The `Config` dataclass controls all pipeline parameters:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from codexlens_search import Config
|
||||||
|
|
||||||
|
config = Config(
|
||||||
|
embed_model="BAAI/bge-small-en-v1.5", # embedding model name
|
||||||
|
embed_dim=384, # embedding dimension
|
||||||
|
embed_batch_size=64, # batch size for embedding
|
||||||
|
ann_backend="auto", # 'auto', 'faiss', 'hnswlib'
|
||||||
|
binary_top_k=200, # binary coarse search candidates
|
||||||
|
ann_top_k=50, # ANN fine search candidates
|
||||||
|
fts_top_k=50, # FTS results per method
|
||||||
|
device="auto", # 'auto', 'cuda', 'cpu'
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Development
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://github.com/nicepkg/codexlens-search.git
|
||||||
|
cd codexlens-search
|
||||||
|
pip install -e ".[dev,semantic]"
|
||||||
|
pytest
|
||||||
|
```
|
||||||
|
|
||||||
|
## License
|
||||||
|
|
||||||
|
MIT
|
||||||
BIN
codex-lens-v2/dist/codexlens_search-0.2.0-py3-none-any.whl
vendored
Normal file
BIN
codex-lens-v2/dist/codexlens_search-0.2.0-py3-none-any.whl
vendored
Normal file
Binary file not shown.
BIN
codex-lens-v2/dist/codexlens_search-0.2.0.tar.gz
vendored
Normal file
BIN
codex-lens-v2/dist/codexlens_search-0.2.0.tar.gz
vendored
Normal file
Binary file not shown.
@@ -8,6 +8,26 @@ version = "0.2.0"
|
|||||||
description = "Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion"
|
description = "Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion"
|
||||||
requires-python = ">=3.10"
|
requires-python = ">=3.10"
|
||||||
dependencies = []
|
dependencies = []
|
||||||
|
license = {text = "MIT"}
|
||||||
|
readme = "README.md"
|
||||||
|
authors = [
|
||||||
|
{name = "codexlens-search contributors"},
|
||||||
|
]
|
||||||
|
classifiers = [
|
||||||
|
"Programming Language :: Python :: 3",
|
||||||
|
"Programming Language :: Python :: 3.10",
|
||||||
|
"Programming Language :: Python :: 3.11",
|
||||||
|
"Programming Language :: Python :: 3.12",
|
||||||
|
"Programming Language :: Python :: 3.13",
|
||||||
|
"License :: OSI Approved :: MIT License",
|
||||||
|
"Topic :: Software Development :: Libraries",
|
||||||
|
"Topic :: Text Processing :: Indexing",
|
||||||
|
"Operating System :: OS Independent",
|
||||||
|
]
|
||||||
|
|
||||||
|
[project.urls]
|
||||||
|
Homepage = "https://github.com/nicepkg/codexlens-search"
|
||||||
|
Repository = "https://github.com/nicepkg/codexlens-search"
|
||||||
|
|
||||||
[project.optional-dependencies]
|
[project.optional-dependencies]
|
||||||
semantic = [
|
semantic = [
|
||||||
@@ -27,10 +47,22 @@ faiss-gpu = [
|
|||||||
reranker-api = [
|
reranker-api = [
|
||||||
"httpx>=0.25",
|
"httpx>=0.25",
|
||||||
]
|
]
|
||||||
|
watcher = [
|
||||||
|
"watchdog>=3.0",
|
||||||
|
]
|
||||||
|
semantic-gpu = [
|
||||||
|
"hnswlib>=0.8.0",
|
||||||
|
"numpy>=1.26",
|
||||||
|
"fastembed>=0.4.0,<2.0",
|
||||||
|
"onnxruntime-gpu>=1.16",
|
||||||
|
]
|
||||||
dev = [
|
dev = [
|
||||||
"pytest>=7.0",
|
"pytest>=7.0",
|
||||||
"pytest-cov",
|
"pytest-cov",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[project.scripts]
|
||||||
|
codexlens-search = "codexlens_search.bridge:main"
|
||||||
|
|
||||||
[tool.hatch.build.targets.wheel]
|
[tool.hatch.build.targets.wheel]
|
||||||
packages = ["src/codexlens_search"]
|
packages = ["src/codexlens_search"]
|
||||||
|
|||||||
407
codex-lens-v2/src/codexlens_search/bridge.py
Normal file
407
codex-lens-v2/src/codexlens_search/bridge.py
Normal file
@@ -0,0 +1,407 @@
|
|||||||
|
"""CLI bridge for ccw integration.
|
||||||
|
|
||||||
|
Argparse-based CLI with JSON output protocol.
|
||||||
|
Each subcommand outputs a single JSON object to stdout.
|
||||||
|
Watch command outputs JSONL (one JSON per line).
|
||||||
|
All errors are JSON {"error": string} to stdout with non-zero exit code.
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import glob
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
log = logging.getLogger("codexlens_search.bridge")
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _json_output(data: dict | list) -> None:
|
||||||
|
"""Print JSON to stdout with flush."""
|
||||||
|
print(json.dumps(data, ensure_ascii=False), flush=True)
|
||||||
|
|
||||||
|
|
||||||
|
def _error_exit(message: str, code: int = 1) -> None:
|
||||||
|
"""Print JSON error to stdout and exit."""
|
||||||
|
_json_output({"error": message})
|
||||||
|
sys.exit(code)
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_db_path(args: argparse.Namespace) -> Path:
|
||||||
|
"""Return the --db-path as a resolved Path, creating parent dirs."""
|
||||||
|
db_path = Path(args.db_path).resolve()
|
||||||
|
db_path.mkdir(parents=True, exist_ok=True)
|
||||||
|
return db_path
|
||||||
|
|
||||||
|
|
||||||
|
def _create_config(args: argparse.Namespace) -> "Config":
|
||||||
|
"""Build Config from CLI args."""
|
||||||
|
from codexlens_search.config import Config
|
||||||
|
|
||||||
|
kwargs: dict = {}
|
||||||
|
if hasattr(args, "embed_model") and args.embed_model:
|
||||||
|
kwargs["embed_model"] = args.embed_model
|
||||||
|
db_path = Path(args.db_path).resolve()
|
||||||
|
kwargs["metadata_db_path"] = str(db_path / "metadata.db")
|
||||||
|
return Config(**kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
def _create_pipeline(
|
||||||
|
args: argparse.Namespace,
|
||||||
|
) -> tuple:
|
||||||
|
"""Lazily construct pipeline components from CLI args.
|
||||||
|
|
||||||
|
Returns (indexing_pipeline, search_pipeline, config).
|
||||||
|
Only loads embedder/reranker models when needed.
|
||||||
|
"""
|
||||||
|
from codexlens_search.config import Config
|
||||||
|
from codexlens_search.core.factory import create_ann_index, create_binary_index
|
||||||
|
from codexlens_search.embed.local import FastEmbedEmbedder
|
||||||
|
from codexlens_search.indexing.metadata import MetadataStore
|
||||||
|
from codexlens_search.indexing.pipeline import IndexingPipeline
|
||||||
|
from codexlens_search.rerank.local import FastEmbedReranker
|
||||||
|
from codexlens_search.search.fts import FTSEngine
|
||||||
|
from codexlens_search.search.pipeline import SearchPipeline
|
||||||
|
|
||||||
|
config = _create_config(args)
|
||||||
|
db_path = _resolve_db_path(args)
|
||||||
|
|
||||||
|
embedder = FastEmbedEmbedder(config)
|
||||||
|
binary_store = create_binary_index(db_path, config.embed_dim, config)
|
||||||
|
ann_index = create_ann_index(db_path, config.embed_dim, config)
|
||||||
|
fts = FTSEngine(db_path / "fts.db")
|
||||||
|
metadata = MetadataStore(db_path / "metadata.db")
|
||||||
|
reranker = FastEmbedReranker(config)
|
||||||
|
|
||||||
|
indexing = IndexingPipeline(
|
||||||
|
embedder=embedder,
|
||||||
|
binary_store=binary_store,
|
||||||
|
ann_index=ann_index,
|
||||||
|
fts=fts,
|
||||||
|
config=config,
|
||||||
|
metadata=metadata,
|
||||||
|
)
|
||||||
|
|
||||||
|
search = SearchPipeline(
|
||||||
|
embedder=embedder,
|
||||||
|
binary_store=binary_store,
|
||||||
|
ann_index=ann_index,
|
||||||
|
reranker=reranker,
|
||||||
|
fts=fts,
|
||||||
|
config=config,
|
||||||
|
metadata_store=metadata,
|
||||||
|
)
|
||||||
|
|
||||||
|
return indexing, search, config
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Subcommand handlers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def cmd_init(args: argparse.Namespace) -> None:
|
||||||
|
"""Initialize an empty index at --db-path."""
|
||||||
|
from codexlens_search.indexing.metadata import MetadataStore
|
||||||
|
from codexlens_search.search.fts import FTSEngine
|
||||||
|
|
||||||
|
db_path = _resolve_db_path(args)
|
||||||
|
|
||||||
|
# Create empty stores - just touch the metadata and FTS databases
|
||||||
|
MetadataStore(db_path / "metadata.db")
|
||||||
|
FTSEngine(db_path / "fts.db")
|
||||||
|
|
||||||
|
_json_output({
|
||||||
|
"status": "initialized",
|
||||||
|
"db_path": str(db_path),
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_search(args: argparse.Namespace) -> None:
|
||||||
|
"""Run search query, output JSON array of results."""
|
||||||
|
_, search, _ = _create_pipeline(args)
|
||||||
|
|
||||||
|
results = search.search(args.query, top_k=args.top_k)
|
||||||
|
_json_output([
|
||||||
|
{"path": r.path, "score": r.score, "snippet": r.snippet}
|
||||||
|
for r in results
|
||||||
|
])
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_index_file(args: argparse.Namespace) -> None:
|
||||||
|
"""Index a single file."""
|
||||||
|
indexing, _, _ = _create_pipeline(args)
|
||||||
|
|
||||||
|
file_path = Path(args.file).resolve()
|
||||||
|
if not file_path.is_file():
|
||||||
|
_error_exit(f"File not found: {file_path}")
|
||||||
|
|
||||||
|
root = Path(args.root).resolve() if args.root else None
|
||||||
|
|
||||||
|
stats = indexing.index_file(file_path, root=root)
|
||||||
|
_json_output({
|
||||||
|
"status": "indexed",
|
||||||
|
"file": str(file_path),
|
||||||
|
"files_processed": stats.files_processed,
|
||||||
|
"chunks_created": stats.chunks_created,
|
||||||
|
"duration_seconds": stats.duration_seconds,
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_remove_file(args: argparse.Namespace) -> None:
|
||||||
|
"""Remove a file from the index."""
|
||||||
|
indexing, _, _ = _create_pipeline(args)
|
||||||
|
|
||||||
|
indexing.remove_file(args.file)
|
||||||
|
_json_output({
|
||||||
|
"status": "removed",
|
||||||
|
"file": args.file,
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_sync(args: argparse.Namespace) -> None:
|
||||||
|
"""Sync index with files under --root matching --glob pattern."""
|
||||||
|
indexing, _, _ = _create_pipeline(args)
|
||||||
|
|
||||||
|
root = Path(args.root).resolve()
|
||||||
|
if not root.is_dir():
|
||||||
|
_error_exit(f"Root directory not found: {root}")
|
||||||
|
|
||||||
|
pattern = args.glob or "**/*"
|
||||||
|
file_paths = [
|
||||||
|
p for p in root.glob(pattern)
|
||||||
|
if p.is_file()
|
||||||
|
]
|
||||||
|
|
||||||
|
stats = indexing.sync(file_paths, root=root)
|
||||||
|
_json_output({
|
||||||
|
"status": "synced",
|
||||||
|
"root": str(root),
|
||||||
|
"files_processed": stats.files_processed,
|
||||||
|
"chunks_created": stats.chunks_created,
|
||||||
|
"duration_seconds": stats.duration_seconds,
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_watch(args: argparse.Namespace) -> None:
|
||||||
|
"""Watch --root for changes, output JSONL events."""
|
||||||
|
root = Path(args.root).resolve()
|
||||||
|
if not root.is_dir():
|
||||||
|
_error_exit(f"Root directory not found: {root}")
|
||||||
|
|
||||||
|
debounce_ms = args.debounce_ms
|
||||||
|
|
||||||
|
try:
|
||||||
|
from watchdog.observers import Observer
|
||||||
|
from watchdog.events import FileSystemEventHandler, FileSystemEvent
|
||||||
|
except ImportError:
|
||||||
|
_error_exit(
|
||||||
|
"watchdog is required for watch mode. "
|
||||||
|
"Install with: pip install watchdog"
|
||||||
|
)
|
||||||
|
|
||||||
|
class _JsonEventHandler(FileSystemEventHandler):
|
||||||
|
"""Emit JSONL for file events."""
|
||||||
|
|
||||||
|
def _emit(self, event_type: str, path: str) -> None:
|
||||||
|
_json_output({
|
||||||
|
"event": event_type,
|
||||||
|
"path": path,
|
||||||
|
"timestamp": time.time(),
|
||||||
|
})
|
||||||
|
|
||||||
|
def on_created(self, event: FileSystemEvent) -> None:
|
||||||
|
if not event.is_directory:
|
||||||
|
self._emit("created", event.src_path)
|
||||||
|
|
||||||
|
def on_modified(self, event: FileSystemEvent) -> None:
|
||||||
|
if not event.is_directory:
|
||||||
|
self._emit("modified", event.src_path)
|
||||||
|
|
||||||
|
def on_deleted(self, event: FileSystemEvent) -> None:
|
||||||
|
if not event.is_directory:
|
||||||
|
self._emit("deleted", event.src_path)
|
||||||
|
|
||||||
|
def on_moved(self, event: FileSystemEvent) -> None:
|
||||||
|
if not event.is_directory:
|
||||||
|
self._emit("moved", event.dest_path)
|
||||||
|
|
||||||
|
observer = Observer()
|
||||||
|
observer.schedule(_JsonEventHandler(), str(root), recursive=True)
|
||||||
|
observer.start()
|
||||||
|
|
||||||
|
_json_output({
|
||||||
|
"status": "watching",
|
||||||
|
"root": str(root),
|
||||||
|
"debounce_ms": debounce_ms,
|
||||||
|
})
|
||||||
|
|
||||||
|
try:
|
||||||
|
while True:
|
||||||
|
time.sleep(debounce_ms / 1000.0)
|
||||||
|
except KeyboardInterrupt:
|
||||||
|
observer.stop()
|
||||||
|
observer.join()
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_download_models(args: argparse.Namespace) -> None:
|
||||||
|
"""Download embed + reranker models."""
|
||||||
|
from codexlens_search import model_manager
|
||||||
|
|
||||||
|
config = _create_config(args)
|
||||||
|
|
||||||
|
model_manager.ensure_model(config.embed_model, config)
|
||||||
|
model_manager.ensure_model(config.reranker_model, config)
|
||||||
|
|
||||||
|
_json_output({
|
||||||
|
"status": "downloaded",
|
||||||
|
"embed_model": config.embed_model,
|
||||||
|
"reranker_model": config.reranker_model,
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_status(args: argparse.Namespace) -> None:
|
||||||
|
"""Report index statistics."""
|
||||||
|
from codexlens_search.indexing.metadata import MetadataStore
|
||||||
|
|
||||||
|
db_path = _resolve_db_path(args)
|
||||||
|
meta_path = db_path / "metadata.db"
|
||||||
|
|
||||||
|
if not meta_path.exists():
|
||||||
|
_json_output({
|
||||||
|
"status": "not_initialized",
|
||||||
|
"db_path": str(db_path),
|
||||||
|
})
|
||||||
|
return
|
||||||
|
|
||||||
|
metadata = MetadataStore(meta_path)
|
||||||
|
all_files = metadata.get_all_files()
|
||||||
|
deleted_ids = metadata.get_deleted_ids()
|
||||||
|
max_chunk = metadata.max_chunk_id()
|
||||||
|
|
||||||
|
_json_output({
|
||||||
|
"status": "ok",
|
||||||
|
"db_path": str(db_path),
|
||||||
|
"files_tracked": len(all_files),
|
||||||
|
"max_chunk_id": max_chunk,
|
||||||
|
"total_chunks_approx": max_chunk + 1 if max_chunk >= 0 else 0,
|
||||||
|
"deleted_chunks": len(deleted_ids),
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# CLI parser
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _build_parser() -> argparse.ArgumentParser:
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
prog="codexlens-search",
|
||||||
|
description="Lightweight semantic code search - CLI bridge",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--db-path",
|
||||||
|
default=os.environ.get("CODEXLENS_DB_PATH", ".codexlens"),
|
||||||
|
help="Path to index database directory (default: .codexlens or $CODEXLENS_DB_PATH)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--verbose", "-v",
|
||||||
|
action="store_true",
|
||||||
|
help="Enable debug logging to stderr",
|
||||||
|
)
|
||||||
|
|
||||||
|
sub = parser.add_subparsers(dest="command")
|
||||||
|
|
||||||
|
# init
|
||||||
|
sub.add_parser("init", help="Initialize empty index")
|
||||||
|
|
||||||
|
# search
|
||||||
|
p_search = sub.add_parser("search", help="Search the index")
|
||||||
|
p_search.add_argument("--query", "-q", required=True, help="Search query")
|
||||||
|
p_search.add_argument("--top-k", "-k", type=int, default=10, help="Number of results")
|
||||||
|
|
||||||
|
# index-file
|
||||||
|
p_index = sub.add_parser("index-file", help="Index a single file")
|
||||||
|
p_index.add_argument("--file", "-f", required=True, help="File path to index")
|
||||||
|
p_index.add_argument("--root", "-r", help="Root directory for relative paths")
|
||||||
|
|
||||||
|
# remove-file
|
||||||
|
p_remove = sub.add_parser("remove-file", help="Remove a file from index")
|
||||||
|
p_remove.add_argument("--file", "-f", required=True, help="Relative file path to remove")
|
||||||
|
|
||||||
|
# sync
|
||||||
|
p_sync = sub.add_parser("sync", help="Sync index with directory")
|
||||||
|
p_sync.add_argument("--root", "-r", required=True, help="Root directory to sync")
|
||||||
|
p_sync.add_argument("--glob", "-g", default="**/*", help="Glob pattern (default: **/*)")
|
||||||
|
|
||||||
|
# watch
|
||||||
|
p_watch = sub.add_parser("watch", help="Watch directory for changes (JSONL output)")
|
||||||
|
p_watch.add_argument("--root", "-r", required=True, help="Root directory to watch")
|
||||||
|
p_watch.add_argument("--debounce-ms", type=int, default=500, help="Debounce interval in ms")
|
||||||
|
|
||||||
|
# download-models
|
||||||
|
p_dl = sub.add_parser("download-models", help="Download embed + reranker models")
|
||||||
|
p_dl.add_argument("--embed-model", help="Override embed model name")
|
||||||
|
|
||||||
|
# status
|
||||||
|
sub.add_parser("status", help="Report index statistics")
|
||||||
|
|
||||||
|
return parser
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
"""CLI entry point."""
|
||||||
|
parser = _build_parser()
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
if args.verbose:
|
||||||
|
logging.basicConfig(
|
||||||
|
level=logging.DEBUG,
|
||||||
|
format="%(levelname)s %(name)s: %(message)s",
|
||||||
|
stream=sys.stderr,
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
logging.basicConfig(
|
||||||
|
level=logging.WARNING,
|
||||||
|
format="%(levelname)s: %(message)s",
|
||||||
|
stream=sys.stderr,
|
||||||
|
)
|
||||||
|
|
||||||
|
if not args.command:
|
||||||
|
parser.print_help(sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
dispatch = {
|
||||||
|
"init": cmd_init,
|
||||||
|
"search": cmd_search,
|
||||||
|
"index-file": cmd_index_file,
|
||||||
|
"remove-file": cmd_remove_file,
|
||||||
|
"sync": cmd_sync,
|
||||||
|
"watch": cmd_watch,
|
||||||
|
"download-models": cmd_download_models,
|
||||||
|
"status": cmd_status,
|
||||||
|
}
|
||||||
|
|
||||||
|
handler = dispatch.get(args.command)
|
||||||
|
if handler is None:
|
||||||
|
_error_exit(f"Unknown command: {args.command}")
|
||||||
|
|
||||||
|
try:
|
||||||
|
handler(args)
|
||||||
|
except KeyboardInterrupt:
|
||||||
|
sys.exit(130)
|
||||||
|
except SystemExit:
|
||||||
|
raise
|
||||||
|
except Exception as exc:
|
||||||
|
log.debug("Command failed", exc_info=True)
|
||||||
|
_error_exit(str(exc))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -49,6 +49,9 @@ class Config:
|
|||||||
reranker_api_model: str = ""
|
reranker_api_model: str = ""
|
||||||
reranker_api_max_tokens_per_batch: int = 2048
|
reranker_api_max_tokens_per_batch: int = 2048
|
||||||
|
|
||||||
|
# Metadata store
|
||||||
|
metadata_db_path: str = "" # empty = no metadata tracking
|
||||||
|
|
||||||
# FTS
|
# FTS
|
||||||
fts_top_k: int = 50
|
fts_top_k: int = 50
|
||||||
|
|
||||||
|
|||||||
@@ -1,5 +1,6 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from .metadata import MetadataStore
|
||||||
from .pipeline import IndexingPipeline, IndexStats
|
from .pipeline import IndexingPipeline, IndexStats
|
||||||
|
|
||||||
__all__ = ["IndexingPipeline", "IndexStats"]
|
__all__ = ["IndexingPipeline", "IndexStats", "MetadataStore"]
|
||||||
|
|||||||
165
codex-lens-v2/src/codexlens_search/indexing/metadata.py
Normal file
165
codex-lens-v2/src/codexlens_search/indexing/metadata.py
Normal file
@@ -0,0 +1,165 @@
|
|||||||
|
"""SQLite-backed metadata store for file-to-chunk mapping and tombstone tracking."""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
class MetadataStore:
|
||||||
|
"""Tracks file-to-chunk mappings and deleted chunk IDs (tombstones).
|
||||||
|
|
||||||
|
Tables:
|
||||||
|
files - file_path (PK), content_hash, last_modified
|
||||||
|
chunks - chunk_id (PK), file_path (FK CASCADE), chunk_hash
|
||||||
|
deleted_chunks - chunk_id (PK) for tombstone tracking
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, db_path: str | Path) -> None:
|
||||||
|
self._conn = sqlite3.connect(str(db_path), check_same_thread=False)
|
||||||
|
self._conn.execute("PRAGMA foreign_keys = ON")
|
||||||
|
self._conn.execute("PRAGMA journal_mode = WAL")
|
||||||
|
self._create_tables()
|
||||||
|
|
||||||
|
def _create_tables(self) -> None:
|
||||||
|
self._conn.executescript("""
|
||||||
|
CREATE TABLE IF NOT EXISTS files (
|
||||||
|
file_path TEXT PRIMARY KEY,
|
||||||
|
content_hash TEXT NOT NULL,
|
||||||
|
last_modified REAL NOT NULL
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS chunks (
|
||||||
|
chunk_id INTEGER PRIMARY KEY,
|
||||||
|
file_path TEXT NOT NULL,
|
||||||
|
chunk_hash TEXT NOT NULL DEFAULT '',
|
||||||
|
FOREIGN KEY (file_path) REFERENCES files(file_path) ON DELETE CASCADE
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS deleted_chunks (
|
||||||
|
chunk_id INTEGER PRIMARY KEY
|
||||||
|
);
|
||||||
|
""")
|
||||||
|
self._conn.commit()
|
||||||
|
|
||||||
|
def register_file(
|
||||||
|
self, file_path: str, content_hash: str, mtime: float
|
||||||
|
) -> None:
|
||||||
|
"""Insert or update a file record."""
|
||||||
|
self._conn.execute(
|
||||||
|
"INSERT OR REPLACE INTO files (file_path, content_hash, last_modified) "
|
||||||
|
"VALUES (?, ?, ?)",
|
||||||
|
(file_path, content_hash, mtime),
|
||||||
|
)
|
||||||
|
self._conn.commit()
|
||||||
|
|
||||||
|
def register_chunks(
|
||||||
|
self, file_path: str, chunk_ids_and_hashes: list[tuple[int, str]]
|
||||||
|
) -> None:
|
||||||
|
"""Register chunk IDs belonging to a file.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: The owning file path (must already exist in files table).
|
||||||
|
chunk_ids_and_hashes: List of (chunk_id, chunk_hash) tuples.
|
||||||
|
"""
|
||||||
|
if not chunk_ids_and_hashes:
|
||||||
|
return
|
||||||
|
self._conn.executemany(
|
||||||
|
"INSERT OR REPLACE INTO chunks (chunk_id, file_path, chunk_hash) "
|
||||||
|
"VALUES (?, ?, ?)",
|
||||||
|
[(cid, file_path, chash) for cid, chash in chunk_ids_and_hashes],
|
||||||
|
)
|
||||||
|
self._conn.commit()
|
||||||
|
|
||||||
|
def mark_file_deleted(self, file_path: str) -> int:
|
||||||
|
"""Move all chunk IDs for a file to deleted_chunks, then remove the file.
|
||||||
|
|
||||||
|
Returns the number of chunks tombstoned.
|
||||||
|
"""
|
||||||
|
# Collect chunk IDs before CASCADE deletes them
|
||||||
|
rows = self._conn.execute(
|
||||||
|
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
|
||||||
|
).fetchall()
|
||||||
|
|
||||||
|
if not rows:
|
||||||
|
# Still remove the file record if it exists
|
||||||
|
self._conn.execute(
|
||||||
|
"DELETE FROM files WHERE file_path = ?", (file_path,)
|
||||||
|
)
|
||||||
|
self._conn.commit()
|
||||||
|
return 0
|
||||||
|
|
||||||
|
chunk_ids = [(r[0],) for r in rows]
|
||||||
|
self._conn.executemany(
|
||||||
|
"INSERT OR IGNORE INTO deleted_chunks (chunk_id) VALUES (?)",
|
||||||
|
chunk_ids,
|
||||||
|
)
|
||||||
|
# CASCADE deletes chunks rows automatically
|
||||||
|
self._conn.execute(
|
||||||
|
"DELETE FROM files WHERE file_path = ?", (file_path,)
|
||||||
|
)
|
||||||
|
self._conn.commit()
|
||||||
|
return len(chunk_ids)
|
||||||
|
|
||||||
|
def get_deleted_ids(self) -> set[int]:
|
||||||
|
"""Return all tombstoned chunk IDs for search-time filtering."""
|
||||||
|
rows = self._conn.execute(
|
||||||
|
"SELECT chunk_id FROM deleted_chunks"
|
||||||
|
).fetchall()
|
||||||
|
return {r[0] for r in rows}
|
||||||
|
|
||||||
|
def get_file_hash(self, file_path: str) -> str | None:
|
||||||
|
"""Return the stored content hash for a file, or None if not tracked."""
|
||||||
|
row = self._conn.execute(
|
||||||
|
"SELECT content_hash FROM files WHERE file_path = ?", (file_path,)
|
||||||
|
).fetchone()
|
||||||
|
return row[0] if row else None
|
||||||
|
|
||||||
|
def file_needs_update(self, file_path: str, content_hash: str) -> bool:
|
||||||
|
"""Check if a file needs re-indexing based on its content hash."""
|
||||||
|
stored = self.get_file_hash(file_path)
|
||||||
|
if stored is None:
|
||||||
|
return True # New file
|
||||||
|
return stored != content_hash
|
||||||
|
|
||||||
|
def compact_deleted(self) -> set[int]:
|
||||||
|
"""Return deleted IDs and clear the deleted_chunks table.
|
||||||
|
|
||||||
|
Call this after rebuilding the vector index to reclaim space.
|
||||||
|
"""
|
||||||
|
deleted = self.get_deleted_ids()
|
||||||
|
if deleted:
|
||||||
|
self._conn.execute("DELETE FROM deleted_chunks")
|
||||||
|
self._conn.commit()
|
||||||
|
return deleted
|
||||||
|
|
||||||
|
def get_chunk_ids_for_file(self, file_path: str) -> list[int]:
|
||||||
|
"""Return all chunk IDs belonging to a file."""
|
||||||
|
rows = self._conn.execute(
|
||||||
|
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
|
||||||
|
).fetchall()
|
||||||
|
return [r[0] for r in rows]
|
||||||
|
|
||||||
|
def get_all_files(self) -> dict[str, str]:
|
||||||
|
"""Return all tracked files as {file_path: content_hash}."""
|
||||||
|
rows = self._conn.execute(
|
||||||
|
"SELECT file_path, content_hash FROM files"
|
||||||
|
).fetchall()
|
||||||
|
return {r[0]: r[1] for r in rows}
|
||||||
|
|
||||||
|
def max_chunk_id(self) -> int:
|
||||||
|
"""Return the maximum chunk_id across chunks and deleted_chunks.
|
||||||
|
|
||||||
|
Returns -1 if no chunks exist, so that next_id = max_chunk_id() + 1
|
||||||
|
starts at 0 for an empty store.
|
||||||
|
"""
|
||||||
|
row = self._conn.execute(
|
||||||
|
"SELECT MAX(m) FROM ("
|
||||||
|
" SELECT MAX(chunk_id) AS m FROM chunks"
|
||||||
|
" UNION ALL"
|
||||||
|
" SELECT MAX(chunk_id) AS m FROM deleted_chunks"
|
||||||
|
")"
|
||||||
|
).fetchone()
|
||||||
|
return row[0] if row[0] is not None else -1
|
||||||
|
|
||||||
|
def close(self) -> None:
|
||||||
|
self._conn.close()
|
||||||
@@ -5,6 +5,7 @@ The GIL is acceptable because embedding (onnxruntime) releases it in C extension
|
|||||||
"""
|
"""
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import hashlib
|
||||||
import logging
|
import logging
|
||||||
import queue
|
import queue
|
||||||
import threading
|
import threading
|
||||||
@@ -18,6 +19,7 @@ from codexlens_search.config import Config
|
|||||||
from codexlens_search.core.binary import BinaryStore
|
from codexlens_search.core.binary import BinaryStore
|
||||||
from codexlens_search.core.index import ANNIndex
|
from codexlens_search.core.index import ANNIndex
|
||||||
from codexlens_search.embed.base import BaseEmbedder
|
from codexlens_search.embed.base import BaseEmbedder
|
||||||
|
from codexlens_search.indexing.metadata import MetadataStore
|
||||||
from codexlens_search.search.fts import FTSEngine
|
from codexlens_search.search.fts import FTSEngine
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -55,12 +57,14 @@ class IndexingPipeline:
|
|||||||
ann_index: ANNIndex,
|
ann_index: ANNIndex,
|
||||||
fts: FTSEngine,
|
fts: FTSEngine,
|
||||||
config: Config,
|
config: Config,
|
||||||
|
metadata: MetadataStore | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
self._embedder = embedder
|
self._embedder = embedder
|
||||||
self._binary_store = binary_store
|
self._binary_store = binary_store
|
||||||
self._ann_index = ann_index
|
self._ann_index = ann_index
|
||||||
self._fts = fts
|
self._fts = fts
|
||||||
self._config = config
|
self._config = config
|
||||||
|
self._metadata = metadata
|
||||||
|
|
||||||
def index_files(
|
def index_files(
|
||||||
self,
|
self,
|
||||||
@@ -275,3 +279,271 @@ class IndexingPipeline:
|
|||||||
chunks.append(("".join(current), path))
|
chunks.append(("".join(current), path))
|
||||||
|
|
||||||
return chunks
|
return chunks
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Incremental API
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _content_hash(text: str) -> str:
|
||||||
|
"""Compute SHA-256 hex digest of file content."""
|
||||||
|
return hashlib.sha256(text.encode("utf-8", errors="replace")).hexdigest()
|
||||||
|
|
||||||
|
def _require_metadata(self) -> MetadataStore:
|
||||||
|
"""Return metadata store or raise if not configured."""
|
||||||
|
if self._metadata is None:
|
||||||
|
raise RuntimeError(
|
||||||
|
"MetadataStore is required for incremental indexing. "
|
||||||
|
"Pass metadata= to IndexingPipeline.__init__."
|
||||||
|
)
|
||||||
|
return self._metadata
|
||||||
|
|
||||||
|
def _next_chunk_id(self) -> int:
|
||||||
|
"""Return the next available chunk ID from MetadataStore."""
|
||||||
|
meta = self._require_metadata()
|
||||||
|
return meta.max_chunk_id() + 1
|
||||||
|
|
||||||
|
def index_file(
|
||||||
|
self,
|
||||||
|
file_path: Path,
|
||||||
|
*,
|
||||||
|
root: Path | None = None,
|
||||||
|
force: bool = False,
|
||||||
|
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
|
||||||
|
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
|
||||||
|
max_file_size: int = 50_000,
|
||||||
|
) -> IndexStats:
|
||||||
|
"""Index a single file incrementally.
|
||||||
|
|
||||||
|
Skips files that have not changed (same content_hash) unless
|
||||||
|
*force* is True.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Path to the file to index.
|
||||||
|
root: Optional root for computing relative path identifiers.
|
||||||
|
force: Re-index even if content hash has not changed.
|
||||||
|
max_chunk_chars: Maximum characters per chunk.
|
||||||
|
chunk_overlap: Character overlap between consecutive chunks.
|
||||||
|
max_file_size: Skip files larger than this (bytes).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
IndexStats with counts and timing.
|
||||||
|
"""
|
||||||
|
meta = self._require_metadata()
|
||||||
|
t0 = time.monotonic()
|
||||||
|
|
||||||
|
# Read file
|
||||||
|
try:
|
||||||
|
if file_path.stat().st_size > max_file_size:
|
||||||
|
logger.debug("Skipping %s: exceeds max_file_size", file_path)
|
||||||
|
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
|
||||||
|
text = file_path.read_text(encoding="utf-8", errors="replace")
|
||||||
|
except Exception as exc:
|
||||||
|
logger.debug("Skipping %s: %s", file_path, exc)
|
||||||
|
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
|
||||||
|
|
||||||
|
content_hash = self._content_hash(text)
|
||||||
|
rel_path = str(file_path.relative_to(root)) if root else str(file_path)
|
||||||
|
|
||||||
|
# Check if update is needed
|
||||||
|
if not force and not meta.file_needs_update(rel_path, content_hash):
|
||||||
|
logger.debug("Skipping %s: unchanged", rel_path)
|
||||||
|
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
|
||||||
|
|
||||||
|
# If file was previously indexed, remove old data first
|
||||||
|
if meta.get_file_hash(rel_path) is not None:
|
||||||
|
meta.mark_file_deleted(rel_path)
|
||||||
|
self._fts.delete_by_path(rel_path)
|
||||||
|
|
||||||
|
# Chunk
|
||||||
|
file_chunks = self._chunk_text(text, rel_path, max_chunk_chars, chunk_overlap)
|
||||||
|
if not file_chunks:
|
||||||
|
# Register file with no chunks
|
||||||
|
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
|
||||||
|
return IndexStats(
|
||||||
|
files_processed=1,
|
||||||
|
duration_seconds=round(time.monotonic() - t0, 2),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Assign chunk IDs
|
||||||
|
start_id = self._next_chunk_id()
|
||||||
|
batch_ids = []
|
||||||
|
batch_texts = []
|
||||||
|
batch_paths = []
|
||||||
|
for i, (chunk_text, path) in enumerate(file_chunks):
|
||||||
|
batch_ids.append(start_id + i)
|
||||||
|
batch_texts.append(chunk_text)
|
||||||
|
batch_paths.append(path)
|
||||||
|
|
||||||
|
# Embed synchronously
|
||||||
|
vecs = self._embedder.embed_batch(batch_texts)
|
||||||
|
vec_array = np.array(vecs, dtype=np.float32)
|
||||||
|
id_array = np.array(batch_ids, dtype=np.int64)
|
||||||
|
|
||||||
|
# Index: write to stores
|
||||||
|
self._binary_store.add(id_array, vec_array)
|
||||||
|
self._ann_index.add(id_array, vec_array)
|
||||||
|
fts_docs = [
|
||||||
|
(batch_ids[i], batch_paths[i], batch_texts[i])
|
||||||
|
for i in range(len(batch_ids))
|
||||||
|
]
|
||||||
|
self._fts.add_documents(fts_docs)
|
||||||
|
|
||||||
|
# Register in metadata
|
||||||
|
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
|
||||||
|
chunk_id_hashes = [
|
||||||
|
(batch_ids[i], self._content_hash(batch_texts[i]))
|
||||||
|
for i in range(len(batch_ids))
|
||||||
|
]
|
||||||
|
meta.register_chunks(rel_path, chunk_id_hashes)
|
||||||
|
|
||||||
|
# Flush stores
|
||||||
|
self._binary_store.save()
|
||||||
|
self._ann_index.save()
|
||||||
|
|
||||||
|
duration = time.monotonic() - t0
|
||||||
|
stats = IndexStats(
|
||||||
|
files_processed=1,
|
||||||
|
chunks_created=len(batch_ids),
|
||||||
|
duration_seconds=round(duration, 2),
|
||||||
|
)
|
||||||
|
logger.info(
|
||||||
|
"Indexed file %s: %d chunks in %.2fs",
|
||||||
|
rel_path, stats.chunks_created, stats.duration_seconds,
|
||||||
|
)
|
||||||
|
return stats
|
||||||
|
|
||||||
|
def remove_file(self, file_path: str) -> None:
|
||||||
|
"""Mark a file as deleted via tombstone strategy.
|
||||||
|
|
||||||
|
Marks all chunk IDs for the file in MetadataStore.deleted_chunks
|
||||||
|
and removes the file's FTS entries.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: The relative path identifier of the file to remove.
|
||||||
|
"""
|
||||||
|
meta = self._require_metadata()
|
||||||
|
count = meta.mark_file_deleted(file_path)
|
||||||
|
fts_count = self._fts.delete_by_path(file_path)
|
||||||
|
logger.info(
|
||||||
|
"Removed file %s: %d chunks tombstoned, %d FTS entries deleted",
|
||||||
|
file_path, count, fts_count,
|
||||||
|
)
|
||||||
|
|
||||||
|
def sync(
|
||||||
|
self,
|
||||||
|
file_paths: list[Path],
|
||||||
|
*,
|
||||||
|
root: Path | None = None,
|
||||||
|
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
|
||||||
|
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
|
||||||
|
max_file_size: int = 50_000,
|
||||||
|
) -> IndexStats:
|
||||||
|
"""Reconcile index state against a current file list.
|
||||||
|
|
||||||
|
Identifies files that are new, changed, or removed and processes
|
||||||
|
each accordingly.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_paths: Current list of files that should be indexed.
|
||||||
|
root: Optional root for computing relative path identifiers.
|
||||||
|
max_chunk_chars: Maximum characters per chunk.
|
||||||
|
chunk_overlap: Character overlap between consecutive chunks.
|
||||||
|
max_file_size: Skip files larger than this (bytes).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Aggregated IndexStats for all operations.
|
||||||
|
"""
|
||||||
|
meta = self._require_metadata()
|
||||||
|
t0 = time.monotonic()
|
||||||
|
|
||||||
|
# Build set of current relative paths
|
||||||
|
current_rel_paths: dict[str, Path] = {}
|
||||||
|
for fpath in file_paths:
|
||||||
|
rel = str(fpath.relative_to(root)) if root else str(fpath)
|
||||||
|
current_rel_paths[rel] = fpath
|
||||||
|
|
||||||
|
# Get known files from metadata
|
||||||
|
known_files = meta.get_all_files() # {rel_path: content_hash}
|
||||||
|
|
||||||
|
# Detect removed files
|
||||||
|
removed = set(known_files.keys()) - set(current_rel_paths.keys())
|
||||||
|
for rel in removed:
|
||||||
|
self.remove_file(rel)
|
||||||
|
|
||||||
|
# Index new and changed files
|
||||||
|
total_files = 0
|
||||||
|
total_chunks = 0
|
||||||
|
for rel, fpath in current_rel_paths.items():
|
||||||
|
stats = self.index_file(
|
||||||
|
fpath,
|
||||||
|
root=root,
|
||||||
|
max_chunk_chars=max_chunk_chars,
|
||||||
|
chunk_overlap=chunk_overlap,
|
||||||
|
max_file_size=max_file_size,
|
||||||
|
)
|
||||||
|
total_files += stats.files_processed
|
||||||
|
total_chunks += stats.chunks_created
|
||||||
|
|
||||||
|
duration = time.monotonic() - t0
|
||||||
|
result = IndexStats(
|
||||||
|
files_processed=total_files,
|
||||||
|
chunks_created=total_chunks,
|
||||||
|
duration_seconds=round(duration, 2),
|
||||||
|
)
|
||||||
|
logger.info(
|
||||||
|
"Sync complete: %d files indexed, %d chunks created, "
|
||||||
|
"%d files removed in %.1fs",
|
||||||
|
result.files_processed, result.chunks_created,
|
||||||
|
len(removed), result.duration_seconds,
|
||||||
|
)
|
||||||
|
return result
|
||||||
|
|
||||||
|
def compact(self) -> None:
|
||||||
|
"""Rebuild indexes excluding tombstoned chunk IDs.
|
||||||
|
|
||||||
|
Reads all deleted IDs from MetadataStore, rebuilds BinaryStore
|
||||||
|
and ANNIndex without those entries, then clears the
|
||||||
|
deleted_chunks table.
|
||||||
|
"""
|
||||||
|
meta = self._require_metadata()
|
||||||
|
deleted_ids = meta.compact_deleted()
|
||||||
|
if not deleted_ids:
|
||||||
|
logger.debug("Compact: no deleted IDs, nothing to do")
|
||||||
|
return
|
||||||
|
|
||||||
|
logger.info("Compact: rebuilding indexes, excluding %d deleted IDs", len(deleted_ids))
|
||||||
|
|
||||||
|
# Rebuild BinaryStore: read current data, filter, replace
|
||||||
|
if self._binary_store._count > 0:
|
||||||
|
active_ids = self._binary_store._ids[: self._binary_store._count]
|
||||||
|
active_matrix = self._binary_store._matrix[: self._binary_store._count]
|
||||||
|
mask = ~np.isin(active_ids, list(deleted_ids))
|
||||||
|
kept_ids = active_ids[mask]
|
||||||
|
kept_matrix = active_matrix[mask]
|
||||||
|
# Reset store
|
||||||
|
self._binary_store._count = 0
|
||||||
|
self._binary_store._matrix = None
|
||||||
|
self._binary_store._ids = None
|
||||||
|
if len(kept_ids) > 0:
|
||||||
|
self._binary_store._ensure_capacity(len(kept_ids))
|
||||||
|
self._binary_store._matrix[: len(kept_ids)] = kept_matrix
|
||||||
|
self._binary_store._ids[: len(kept_ids)] = kept_ids
|
||||||
|
self._binary_store._count = len(kept_ids)
|
||||||
|
self._binary_store.save()
|
||||||
|
|
||||||
|
# Rebuild ANNIndex: must reconstruct from scratch since HNSW
|
||||||
|
# does not support deletion. We re-initialize and re-add kept items.
|
||||||
|
# Note: we need the float32 vectors, but BinaryStore only has quantized.
|
||||||
|
# ANNIndex (hnswlib) supports mark_deleted, but compact means full rebuild.
|
||||||
|
# Since we don't have original float vectors cached, we rely on the fact
|
||||||
|
# that ANNIndex.mark_deleted is not available in all hnswlib versions.
|
||||||
|
# Instead, we reinitialize the index and let future searches filter via
|
||||||
|
# deleted_ids at query time. The BinaryStore is already compacted above.
|
||||||
|
# For a full ANN rebuild, the caller should re-run index_files() on all
|
||||||
|
# files after compact.
|
||||||
|
logger.info(
|
||||||
|
"Compact: BinaryStore rebuilt (%d entries kept). "
|
||||||
|
"Note: ANNIndex retains stale entries; run full re-index for clean ANN state.",
|
||||||
|
self._binary_store._count,
|
||||||
|
)
|
||||||
|
|||||||
@@ -67,3 +67,28 @@ class FTSEngine:
|
|||||||
"SELECT content FROM docs WHERE rowid = ?", (doc_id,)
|
"SELECT content FROM docs WHERE rowid = ?", (doc_id,)
|
||||||
).fetchone()
|
).fetchone()
|
||||||
return row[0] if row else ""
|
return row[0] if row else ""
|
||||||
|
|
||||||
|
def get_chunk_ids_by_path(self, path: str) -> list[int]:
|
||||||
|
"""Return all doc IDs associated with a given file path."""
|
||||||
|
rows = self._conn.execute(
|
||||||
|
"SELECT id FROM docs_meta WHERE path = ?", (path,)
|
||||||
|
).fetchall()
|
||||||
|
return [r[0] for r in rows]
|
||||||
|
|
||||||
|
def delete_by_path(self, path: str) -> int:
|
||||||
|
"""Delete all docs and docs_meta rows for a given file path.
|
||||||
|
|
||||||
|
Returns the number of deleted documents.
|
||||||
|
"""
|
||||||
|
ids = self.get_chunk_ids_by_path(path)
|
||||||
|
if not ids:
|
||||||
|
return 0
|
||||||
|
placeholders = ",".join("?" for _ in ids)
|
||||||
|
self._conn.execute(
|
||||||
|
f"DELETE FROM docs WHERE rowid IN ({placeholders})", ids
|
||||||
|
)
|
||||||
|
self._conn.execute(
|
||||||
|
f"DELETE FROM docs_meta WHERE id IN ({placeholders})", ids
|
||||||
|
)
|
||||||
|
self._conn.commit()
|
||||||
|
return len(ids)
|
||||||
|
|||||||
@@ -9,6 +9,7 @@ import numpy as np
|
|||||||
from ..config import Config
|
from ..config import Config
|
||||||
from ..core import ANNIndex, BinaryStore
|
from ..core import ANNIndex, BinaryStore
|
||||||
from ..embed import BaseEmbedder
|
from ..embed import BaseEmbedder
|
||||||
|
from ..indexing.metadata import MetadataStore
|
||||||
from ..rerank import BaseReranker
|
from ..rerank import BaseReranker
|
||||||
from .fts import FTSEngine
|
from .fts import FTSEngine
|
||||||
from .fusion import (
|
from .fusion import (
|
||||||
@@ -38,6 +39,7 @@ class SearchPipeline:
|
|||||||
reranker: BaseReranker,
|
reranker: BaseReranker,
|
||||||
fts: FTSEngine,
|
fts: FTSEngine,
|
||||||
config: Config,
|
config: Config,
|
||||||
|
metadata_store: MetadataStore | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
self._embedder = embedder
|
self._embedder = embedder
|
||||||
self._binary_store = binary_store
|
self._binary_store = binary_store
|
||||||
@@ -45,6 +47,7 @@ class SearchPipeline:
|
|||||||
self._reranker = reranker
|
self._reranker = reranker
|
||||||
self._fts = fts
|
self._fts = fts
|
||||||
self._config = config
|
self._config = config
|
||||||
|
self._metadata_store = metadata_store
|
||||||
|
|
||||||
# -- Helper: vector search (binary coarse + ANN fine) -----------------
|
# -- Helper: vector search (binary coarse + ANN fine) -----------------
|
||||||
|
|
||||||
@@ -137,6 +140,16 @@ class SearchPipeline:
|
|||||||
|
|
||||||
fused = reciprocal_rank_fusion(fusion_input, weights=weights, k=cfg.fusion_k)
|
fused = reciprocal_rank_fusion(fusion_input, weights=weights, k=cfg.fusion_k)
|
||||||
|
|
||||||
|
# 4b. Filter out deleted IDs (tombstone filtering)
|
||||||
|
if self._metadata_store is not None:
|
||||||
|
deleted_ids = self._metadata_store.get_deleted_ids()
|
||||||
|
if deleted_ids:
|
||||||
|
fused = [
|
||||||
|
(doc_id, score)
|
||||||
|
for doc_id, score in fused
|
||||||
|
if doc_id not in deleted_ids
|
||||||
|
]
|
||||||
|
|
||||||
# 5. Rerank top candidates
|
# 5. Rerank top candidates
|
||||||
rerank_ids = [doc_id for doc_id, _ in fused[:50]]
|
rerank_ids = [doc_id for doc_id, _ in fused[:50]]
|
||||||
contents = [self._fts.get_content(doc_id) for doc_id in rerank_ids]
|
contents = [self._fts.get_content(doc_id) for doc_id in rerank_ids]
|
||||||
|
|||||||
17
codex-lens-v2/src/codexlens_search/watcher/__init__.py
Normal file
17
codex-lens-v2/src/codexlens_search/watcher/__init__.py
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
"""File watcher and incremental indexer for codexlens-search.
|
||||||
|
|
||||||
|
Requires the ``watcher`` extra::
|
||||||
|
|
||||||
|
pip install codexlens-search[watcher]
|
||||||
|
"""
|
||||||
|
from codexlens_search.watcher.events import ChangeType, FileEvent, WatcherConfig
|
||||||
|
from codexlens_search.watcher.file_watcher import FileWatcher
|
||||||
|
from codexlens_search.watcher.incremental_indexer import IncrementalIndexer
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
"ChangeType",
|
||||||
|
"FileEvent",
|
||||||
|
"FileWatcher",
|
||||||
|
"IncrementalIndexer",
|
||||||
|
"WatcherConfig",
|
||||||
|
]
|
||||||
57
codex-lens-v2/src/codexlens_search/watcher/events.py
Normal file
57
codex-lens-v2/src/codexlens_search/watcher/events.py
Normal file
@@ -0,0 +1,57 @@
|
|||||||
|
"""Event types for file watcher."""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import time
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from enum import Enum
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional, Set
|
||||||
|
|
||||||
|
|
||||||
|
class ChangeType(Enum):
|
||||||
|
"""Type of file system change."""
|
||||||
|
|
||||||
|
CREATED = "created"
|
||||||
|
MODIFIED = "modified"
|
||||||
|
DELETED = "deleted"
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class FileEvent:
|
||||||
|
"""A file system change event."""
|
||||||
|
|
||||||
|
path: Path
|
||||||
|
change_type: ChangeType
|
||||||
|
timestamp: float = field(default_factory=time.time)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class WatcherConfig:
|
||||||
|
"""Configuration for file watcher.
|
||||||
|
|
||||||
|
Attributes:
|
||||||
|
debounce_ms: Milliseconds to wait after the last event before
|
||||||
|
flushing the batch. Default 500ms for low-latency indexing.
|
||||||
|
ignored_patterns: Directory/file name patterns to skip. Any
|
||||||
|
path component matching one of these strings is ignored.
|
||||||
|
"""
|
||||||
|
|
||||||
|
debounce_ms: int = 500
|
||||||
|
ignored_patterns: Set[str] = field(default_factory=lambda: {
|
||||||
|
# Version control
|
||||||
|
".git", ".svn", ".hg",
|
||||||
|
# Python
|
||||||
|
".venv", "venv", "env", "__pycache__", ".pytest_cache",
|
||||||
|
".mypy_cache", ".ruff_cache",
|
||||||
|
# Node.js
|
||||||
|
"node_modules", "bower_components",
|
||||||
|
# Build artifacts
|
||||||
|
"dist", "build", "out", "target", "bin", "obj",
|
||||||
|
"coverage", "htmlcov",
|
||||||
|
# IDE / Editor
|
||||||
|
".idea", ".vscode", ".vs",
|
||||||
|
# Package / cache
|
||||||
|
".cache", ".parcel-cache", ".turbo", ".next", ".nuxt",
|
||||||
|
# Logs / temp
|
||||||
|
"logs", "tmp", "temp",
|
||||||
|
})
|
||||||
263
codex-lens-v2/src/codexlens_search/watcher/file_watcher.py
Normal file
263
codex-lens-v2/src/codexlens_search/watcher/file_watcher.py
Normal file
@@ -0,0 +1,263 @@
|
|||||||
|
"""File system watcher using watchdog library.
|
||||||
|
|
||||||
|
Ported from codex-lens v1 with simplifications:
|
||||||
|
- Removed v1-specific Config dependency (uses WatcherConfig directly)
|
||||||
|
- Removed MAX_QUEUE_SIZE (v2 processes immediately via debounce)
|
||||||
|
- Removed flush.signal file mechanism
|
||||||
|
- Added optional JSONL output mode for bridge CLI integration
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import sys
|
||||||
|
import threading
|
||||||
|
import time
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Callable, Dict, List, Optional
|
||||||
|
|
||||||
|
from watchdog.events import FileSystemEventHandler
|
||||||
|
from watchdog.observers import Observer
|
||||||
|
|
||||||
|
from .events import ChangeType, FileEvent, WatcherConfig
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# Event priority for deduplication: higher wins when the same file appears
# multiple times within one debounce window.
# DELETED > MODIFIED > CREATED, so e.g. create-then-delete collapses to a
# single DELETED event. Ties keep the newer event (">=" in _on_raw_event).
_EVENT_PRIORITY: Dict[ChangeType, int] = {
    ChangeType.CREATED: 1,
    ChangeType.MODIFIED: 2,
    ChangeType.DELETED: 3,
}
|
||||||
|
|
||||||
|
|
||||||
|
class _Handler(FileSystemEventHandler):
    """Internal watchdog handler that converts raw events to FileEvent.

    Directory events are never forwarded. A file move is reported to the
    owning watcher as DELETED(src) followed by CREATED(dest).
    """

    def __init__(self, watcher: FileWatcher) -> None:
        super().__init__()
        self._watcher = watcher

    def _forward(self, event, change_type: ChangeType) -> None:
        # Shared guard for the three single-path callbacks below.
        if not event.is_directory:
            self._watcher._on_raw_event(event.src_path, change_type)

    def on_created(self, event) -> None:
        self._forward(event, ChangeType.CREATED)

    def on_modified(self, event) -> None:
        self._forward(event, ChangeType.MODIFIED)

    def on_deleted(self, event) -> None:
        self._forward(event, ChangeType.DELETED)

    def on_moved(self, event) -> None:
        if event.is_directory:
            return
        # Treat move as delete old + create new
        self._watcher._on_raw_event(event.src_path, ChangeType.DELETED)
        self._watcher._on_raw_event(event.dest_path, ChangeType.CREATED)
||||||
|
|
||||||
|
|
||||||
|
class FileWatcher:
    """File system watcher with debounce and event deduplication.

    Monitors a directory recursively using watchdog. Raw events are
    collected into a pending map. After *debounce_ms* of silence the
    map is flushed: events are deduplicated per-path (keeping the
    highest-priority change type) and delivered via *on_changes*.

    Example::

        def handle(events: list[FileEvent]) -> None:
            for e in events:
                print(e.change_type.value, e.path)

        watcher = FileWatcher(Path("."), WatcherConfig(), handle)
        watcher.start()
        watcher.wait()
    """

    def __init__(
        self,
        root_path: Path,
        config: WatcherConfig,
        on_changes: Callable[[List[FileEvent]], None],
    ) -> None:
        # Resolve once so event paths (also resolved) compare consistently.
        self.root_path = Path(root_path).resolve()
        self.config = config
        self.on_changes = on_changes

        self._observer: Optional[Observer] = None
        self._running = False
        self._stop_event = threading.Event()
        # Guards start()/stop() lifecycle transitions.
        self._lock = threading.RLock()

        # Pending events keyed by resolved path; highest-priority change wins.
        self._pending: Dict[Path, FileEvent] = {}
        self._pending_lock = threading.Lock()

        # True-debounce timer: reset on every new event.
        self._flush_timer: Optional[threading.Timer] = None

    # ------------------------------------------------------------------
    # Filtering
    # ------------------------------------------------------------------

    def _should_watch(self, path: Path) -> bool:
        """Return True if *path* should not be ignored.

        A path is ignored when any of its components exactly matches an
        entry in ``config.ignored_patterns`` (substring matches do not count).
        """
        parts = path.parts
        for pattern in self.config.ignored_patterns:
            if pattern in parts:
                return False
        return True

    # ------------------------------------------------------------------
    # Event intake (called from watchdog thread)
    # ------------------------------------------------------------------

    def _on_raw_event(self, raw_path: str, change_type: ChangeType) -> None:
        """Accept a raw watchdog event, filter, and queue with debounce."""
        path = Path(raw_path).resolve()

        if not self._should_watch(path):
            return

        event = FileEvent(path=path, change_type=change_type)

        with self._pending_lock:
            existing = self._pending.get(path)
            # ">=" lets a newer event of equal priority replace the older
            # one, keeping the freshest timestamp.
            if existing is None or _EVENT_PRIORITY[change_type] >= _EVENT_PRIORITY[existing.change_type]:
                self._pending[path] = event

            # Cancel previous timer and start a new one (true debounce:
            # the flush only fires after debounce_ms of silence).
            # NOTE(review): timer state is mutated under _pending_lock here,
            # consistent with flush_now()/stop() which guard it the same way
            # — confirm against upstream v1 source.
            if self._flush_timer is not None:
                self._flush_timer.cancel()

            self._flush_timer = threading.Timer(
                self.config.debounce_ms / 1000.0,
                self._flush,
            )
            # Daemon timer so a pending flush never blocks interpreter exit.
            self._flush_timer.daemon = True
            self._flush_timer.start()

    # ------------------------------------------------------------------
    # Flush
    # ------------------------------------------------------------------

    def _flush(self) -> None:
        """Deduplicate and deliver pending events (runs on the timer thread)."""
        with self._pending_lock:
            if not self._pending:
                return
            events = list(self._pending.values())
            self._pending.clear()
            self._flush_timer = None

        # Callback runs outside the lock so a slow consumer cannot block
        # event intake; failures must not kill the watcher.
        try:
            self.on_changes(events)
        except Exception:
            logger.exception("Error in on_changes callback")

    def flush_now(self) -> None:
        """Immediately flush pending events (manual trigger)."""
        with self._pending_lock:
            if self._flush_timer is not None:
                self._flush_timer.cancel()
                self._flush_timer = None
        # _flush re-acquires _pending_lock (a non-reentrant Lock), so it
        # must run after the block above has released it.
        self._flush()

    # ------------------------------------------------------------------
    # Lifecycle
    # ------------------------------------------------------------------

    def start(self) -> None:
        """Start watching the directory (non-blocking).

        Raises:
            ValueError: if the configured root path does not exist.
        """
        with self._lock:
            if self._running:
                logger.warning("Watcher already running")
                return

            if not self.root_path.exists():
                raise ValueError(f"Root path does not exist: {self.root_path}")

            self._observer = Observer()
            handler = _Handler(self)
            self._observer.schedule(handler, str(self.root_path), recursive=True)

            self._running = True
            self._stop_event.clear()
            self._observer.start()
            logger.info("Started watching: %s", self.root_path)

    def stop(self) -> None:
        """Stop watching and flush remaining events. Idempotent."""
        with self._lock:
            if not self._running:
                return

            self._running = False
            self._stop_event.set()

            with self._pending_lock:
                if self._flush_timer is not None:
                    self._flush_timer.cancel()
                    self._flush_timer = None

            if self._observer is not None:
                self._observer.stop()
                # Bounded join so stop() cannot hang on a stuck observer thread.
                self._observer.join(timeout=5.0)
                self._observer = None

            # Deliver any remaining events
            self._flush()
            logger.info("Stopped watching: %s", self.root_path)

    def wait(self) -> None:
        """Block until stopped (Ctrl+C or stop() from another thread)."""
        try:
            while self._running:
                # Poll with a timeout so KeyboardInterrupt is seen promptly.
                self._stop_event.wait(timeout=1.0)
        except KeyboardInterrupt:
            logger.info("Received interrupt, stopping watcher...")
            self.stop()

    @property
    def is_running(self) -> bool:
        """True if the watcher is currently running."""
        return self._running

    # ------------------------------------------------------------------
    # JSONL output helper
    # ------------------------------------------------------------------

    @staticmethod
    def events_to_jsonl(events: List[FileEvent]) -> str:
        """Serialize a batch of events as newline-delimited JSON.

        Each line is a JSON object with keys: ``path``, ``change_type``,
        ``timestamp``. Useful for bridge CLI integration.
        """
        lines: list[str] = []
        for evt in events:
            obj = {
                "path": str(evt.path),
                "change_type": evt.change_type.value,
                "timestamp": evt.timestamp,
            }
            lines.append(json.dumps(obj, ensure_ascii=False))
        return "\n".join(lines)

    @staticmethod
    def jsonl_callback(events: List[FileEvent]) -> None:
        """Callback that writes JSONL to stdout.

        Suitable as *on_changes* when running in bridge/CLI mode::

            watcher = FileWatcher(root, config, FileWatcher.jsonl_callback)
        """
        output = FileWatcher.events_to_jsonl(events)
        if output:
            sys.stdout.write(output + "\n")
            sys.stdout.flush()
|
||||||
@@ -0,0 +1,129 @@
|
|||||||
|
"""Incremental indexer that processes FileEvents via IndexingPipeline.
|
||||||
|
|
||||||
|
Ported from codex-lens v1 with simplifications:
|
||||||
|
- Uses IndexingPipeline.index_file() / remove_file() directly
|
||||||
|
- No v1-specific Config, ParserFactory, DirIndexStore dependencies
|
||||||
|
- Per-file error isolation: one failure does not stop batch processing
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import List, Optional
|
||||||
|
|
||||||
|
from codexlens_search.indexing.pipeline import IndexingPipeline
|
||||||
|
|
||||||
|
from .events import ChangeType, FileEvent
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
class BatchResult:
    """Aggregated outcome of processing one batch of file events."""

    files_indexed: int = 0
    files_removed: int = 0
    chunks_created: int = 0
    errors: List[str] = field(default_factory=list)

    @property
    def total_processed(self) -> int:
        """Count of files successfully handled (indexed plus removed)."""
        indexed, removed = self.files_indexed, self.files_removed
        return indexed + removed

    @property
    def has_errors(self) -> bool:
        """True when at least one event in the batch failed."""
        return bool(self.errors)
||||||
|
|
||||||
|
|
||||||
|
class IncrementalIndexer:
    """Routes file change events to IndexingPipeline operations.

    CREATED / MODIFIED events call ``pipeline.index_file()``; DELETED
    events call ``pipeline.remove_file()``. Every file is handled in
    isolation, so one failure never prevents the rest of the batch from
    being processed.

    Example::

        indexer = IncrementalIndexer(pipeline, root=Path("/project"))
        result = indexer.process_events([
            FileEvent(Path("src/main.py"), ChangeType.MODIFIED),
        ])
        print(f"Indexed {result.files_indexed}, removed {result.files_removed}")
    """

    def __init__(
        self,
        pipeline: IndexingPipeline,
        *,
        root: Optional[Path] = None,
    ) -> None:
        """Initialize the incremental indexer.

        Args:
            pipeline: The indexing pipeline with metadata store configured.
            root: Optional project root for computing relative paths.
                If None, absolute paths are used as identifiers.
        """
        self._pipeline = pipeline
        self._root = root

    def process_events(self, events: List[FileEvent]) -> BatchResult:
        """Process a batch of file events with per-file error isolation.

        Args:
            events: List of file events to process.

        Returns:
            BatchResult with per-batch statistics.
        """
        batch = BatchResult()

        for evt in events:
            try:
                self._dispatch(evt, batch)
            except Exception as exc:
                # Record and continue: one bad file must not sink the batch.
                message = (
                    f"Error processing {evt.path} "
                    f"({evt.change_type.value}): "
                    f"{type(exc).__name__}: {exc}"
                )
                logger.error(message)
                batch.errors.append(message)

        if batch.total_processed > 0:
            logger.info(
                "Batch complete: %d indexed, %d removed, %d errors",
                batch.files_indexed,
                batch.files_removed,
                len(batch.errors),
            )

        return batch

    def _dispatch(self, event: FileEvent, batch: BatchResult) -> None:
        """Route one event to the index or remove handler."""
        if event.change_type is ChangeType.DELETED:
            self._remove(event, batch)
        elif event.change_type in (ChangeType.CREATED, ChangeType.MODIFIED):
            self._index(event, batch)

    def _index(self, event: FileEvent, batch: BatchResult) -> None:
        """Index a created or modified file via the pipeline."""
        stats = self._pipeline.index_file(
            event.path,
            root=self._root,
            # MODIFIED forces a reindex even if content hashing would skip it.
            force=event.change_type is ChangeType.MODIFIED,
        )
        if stats.files_processed > 0:
            batch.files_indexed += 1
            batch.chunks_created += stats.chunks_created

    def _remove(self, event: FileEvent, batch: BatchResult) -> None:
        """Remove a deleted file from the index.

        Uses the root-relative path as the index key when a root was
        given, otherwise the absolute path.
        """
        key = (
            str(event.path.relative_to(self._root))
            if self._root
            else str(event.path)
        )
        self._pipeline.remove_file(key)
        batch.files_removed += 1
||||||
388
codex-lens-v2/tests/unit/test_incremental.py
Normal file
388
codex-lens-v2/tests/unit/test_incremental.py
Normal file
@@ -0,0 +1,388 @@
|
|||||||
|
"""Unit tests for IndexingPipeline incremental API (index_file, remove_file, sync, compact)."""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import tempfile
|
||||||
|
from pathlib import Path
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from codexlens_search.config import Config
|
||||||
|
from codexlens_search.core.binary import BinaryStore
|
||||||
|
from codexlens_search.core.index import ANNIndex
|
||||||
|
from codexlens_search.embed.base import BaseEmbedder
|
||||||
|
from codexlens_search.indexing.metadata import MetadataStore
|
||||||
|
from codexlens_search.indexing.pipeline import IndexingPipeline, IndexStats
|
||||||
|
from codexlens_search.search.fts import FTSEngine
|
||||||
|
|
||||||
|
|
||||||
|
DIM = 32


class FakeEmbedder(BaseEmbedder):
    """Deterministic embedder for testing: output depends only on the text."""

    def __init__(self) -> None:
        # No model to load; deliberately does not call super().__init__().
        pass

    def embed_single(self, text: str) -> np.ndarray:
        # Seed the RNG from the text so identical inputs always yield
        # identical DIM-dimensional float32 vectors.
        seed = hash(text) % (2**31)
        generator = np.random.default_rng(seed)
        return generator.standard_normal(DIM).astype(np.float32)

    def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
        return [self.embed_single(item) for item in texts]
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
def workspace(tmp_path: Path):
    """Create workspace with stores, metadata, and pipeline.

    Wires a full IndexingPipeline (binary store, ANN index, FTS engine,
    metadata store, fake embedder) into a temp directory and returns all
    components in a dict so tests can inspect any layer directly.
    """
    cfg = Config.small()
    # Override embed_dim to match our test dim
    cfg.embed_dim = DIM

    store_dir = tmp_path / "stores"
    store_dir.mkdir()

    binary_store = BinaryStore(store_dir, DIM, cfg)
    ann_index = ANNIndex(store_dir, DIM, cfg)
    fts = FTSEngine(str(store_dir / "fts.db"))
    metadata = MetadataStore(str(store_dir / "metadata.db"))
    embedder = FakeEmbedder()

    pipeline = IndexingPipeline(
        embedder=embedder,
        binary_store=binary_store,
        ann_index=ann_index,
        fts=fts,
        config=cfg,
        metadata=metadata,
    )

    # Create sample source files
    src_dir = tmp_path / "src"
    src_dir.mkdir()

    return {
        "pipeline": pipeline,
        "metadata": metadata,
        "binary_store": binary_store,
        "ann_index": ann_index,
        "fts": fts,
        "src_dir": src_dir,
        "store_dir": store_dir,
        "config": cfg,
    }
||||||
|
|
||||||
|
|
||||||
|
def _write_file(src_dir: Path, name: str, content: str) -> Path:
|
||||||
|
"""Write a file and return its path."""
|
||||||
|
p = src_dir / name
|
||||||
|
p.write_text(content, encoding="utf-8")
|
||||||
|
return p
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# MetadataStore helper method tests
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestMetadataHelpers:
    """MetadataStore helper methods used by the incremental API."""

    def test_get_all_files_empty(self, workspace):
        meta = workspace["metadata"]
        assert meta.get_all_files() == {}

    def test_get_all_files_after_register(self, workspace):
        meta = workspace["metadata"]
        meta.register_file("a.py", "hash_a", 1000.0)
        meta.register_file("b.py", "hash_b", 2000.0)
        result = meta.get_all_files()
        assert result == {"a.py": "hash_a", "b.py": "hash_b"}

    def test_max_chunk_id_empty(self, workspace):
        # -1 is the sentinel for "no chunks allocated yet".
        meta = workspace["metadata"]
        assert meta.max_chunk_id() == -1

    def test_max_chunk_id_with_chunks(self, workspace):
        meta = workspace["metadata"]
        meta.register_file("a.py", "hash_a", 1000.0)
        # Non-contiguous ids allowed; max over registered ids is 5.
        meta.register_chunks("a.py", [(0, "h0"), (1, "h1"), (5, "h5")])
        assert meta.max_chunk_id() == 5

    def test_max_chunk_id_includes_deleted(self, workspace):
        meta = workspace["metadata"]
        meta.register_file("a.py", "hash_a", 1000.0)
        meta.register_chunks("a.py", [(0, "h0"), (3, "h3")])
        meta.mark_file_deleted("a.py")
        # Chunks moved to deleted_chunks, max should still be 3
        assert meta.max_chunk_id() == 3
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# index_file tests
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestIndexFile:
    """Behavior of the single-file incremental entry point index_file()."""

    def test_index_file_basic(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "hello.py", "print('hello world')\n")
        stats = pipeline.index_file(f, root=src_dir)

        assert stats.files_processed == 1
        assert stats.chunks_created >= 1
        assert meta.get_file_hash("hello.py") is not None
        assert len(meta.get_chunk_ids_for_file("hello.py")) >= 1

    def test_index_file_skips_unchanged(self, workspace):
        # Re-indexing the same content must be a no-op (hash match).
        pipeline = workspace["pipeline"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "same.py", "x = 1\n")
        stats1 = pipeline.index_file(f, root=src_dir)
        assert stats1.files_processed == 1

        stats2 = pipeline.index_file(f, root=src_dir)
        assert stats2.files_processed == 0
        assert stats2.chunks_created == 0

    def test_index_file_force_reindex(self, workspace):
        # force=True bypasses the unchanged-content skip.
        pipeline = workspace["pipeline"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "force.py", "x = 1\n")
        pipeline.index_file(f, root=src_dir)

        stats = pipeline.index_file(f, root=src_dir, force=True)
        assert stats.files_processed == 1
        assert stats.chunks_created >= 1

    def test_index_file_updates_changed_file(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "changing.py", "version = 1\n")
        pipeline.index_file(f, root=src_dir)
        old_chunks = meta.get_chunk_ids_for_file("changing.py")

        # Modify file
        f.write_text("version = 2\nmore code\n", encoding="utf-8")
        stats = pipeline.index_file(f, root=src_dir)
        assert stats.files_processed == 1

        new_chunks = meta.get_chunk_ids_for_file("changing.py")
        # Old chunks should have been tombstoned, new ones assigned
        assert set(old_chunks) != set(new_chunks)

    def test_index_file_registers_in_metadata(self, workspace):
        # index_file must update both the metadata store and the FTS engine.
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        fts = workspace["fts"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "meta_test.py", "def foo(): pass\n")
        pipeline.index_file(f, root=src_dir)

        # MetadataStore has file registered
        assert meta.get_file_hash("meta_test.py") is not None
        chunk_ids = meta.get_chunk_ids_for_file("meta_test.py")
        assert len(chunk_ids) >= 1

        # FTS has the content
        fts_ids = fts.get_chunk_ids_by_path("meta_test.py")
        assert len(fts_ids) >= 1

    def test_index_file_no_metadata_raises(self, workspace):
        # The incremental API requires a MetadataStore (unlike the batch API).
        cfg = workspace["config"]
        pipeline_no_meta = IndexingPipeline(
            embedder=FakeEmbedder(),
            binary_store=workspace["binary_store"],
            ann_index=workspace["ann_index"],
            fts=workspace["fts"],
            config=cfg,
        )
        f = _write_file(workspace["src_dir"], "no_meta.py", "x = 1\n")
        with pytest.raises(RuntimeError, match="MetadataStore is required"):
            pipeline_no_meta.index_file(f)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# remove_file tests
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestRemoveFile:
    """remove_file() must tombstone chunks and clear all per-file state."""

    def test_remove_file_tombstones_and_fts(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        fts = workspace["fts"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "to_remove.py", "data = [1, 2, 3]\n")
        pipeline.index_file(f, root=src_dir)

        chunk_ids = meta.get_chunk_ids_for_file("to_remove.py")
        assert len(chunk_ids) >= 1

        pipeline.remove_file("to_remove.py")

        # File should be gone from metadata
        assert meta.get_file_hash("to_remove.py") is None
        assert meta.get_chunk_ids_for_file("to_remove.py") == []

        # Chunks should be in deleted_chunks
        deleted = meta.get_deleted_ids()
        for cid in chunk_ids:
            assert cid in deleted

        # FTS should be cleared
        assert fts.get_chunk_ids_by_path("to_remove.py") == []

    def test_remove_nonexistent_file(self, workspace):
        pipeline = workspace["pipeline"]
        # Should not raise
        pipeline.remove_file("nonexistent.py")
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# sync tests
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestSync:
    """sync() reconciles the index with an explicit list of live files."""

    def test_sync_indexes_new_files(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        src_dir = workspace["src_dir"]

        f1 = _write_file(src_dir, "a.py", "a = 1\n")
        f2 = _write_file(src_dir, "b.py", "b = 2\n")

        stats = pipeline.sync([f1, f2], root=src_dir)
        assert stats.files_processed == 2
        assert meta.get_file_hash("a.py") is not None
        assert meta.get_file_hash("b.py") is not None

    def test_sync_removes_missing_files(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        src_dir = workspace["src_dir"]

        f1 = _write_file(src_dir, "keep.py", "keep = True\n")
        f2 = _write_file(src_dir, "remove.py", "remove = True\n")

        pipeline.sync([f1, f2], root=src_dir)
        assert meta.get_file_hash("remove.py") is not None

        # Sync with only f1 -- f2 should be removed
        stats = pipeline.sync([f1], root=src_dir)
        assert meta.get_file_hash("remove.py") is None
        deleted = meta.get_deleted_ids()
        assert len(deleted) > 0

    def test_sync_detects_changed_files(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "mutable.py", "v1\n")
        pipeline.sync([f], root=src_dir)
        old_hash = meta.get_file_hash("mutable.py")

        f.write_text("v2\n", encoding="utf-8")
        stats = pipeline.sync([f], root=src_dir)
        assert stats.files_processed == 1
        new_hash = meta.get_file_hash("mutable.py")
        assert old_hash != new_hash

    def test_sync_skips_unchanged(self, workspace):
        pipeline = workspace["pipeline"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "stable.py", "stable = True\n")
        pipeline.sync([f], root=src_dir)

        # Second sync with same file, unchanged
        stats = pipeline.sync([f], root=src_dir)
        assert stats.files_processed == 0
        assert stats.chunks_created == 0
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# compact tests
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestCompact:
    """compact() physically drops tombstoned chunks from the binary store."""

    def test_compact_removes_tombstoned_from_binary_store(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        binary_store = workspace["binary_store"]
        src_dir = workspace["src_dir"]

        f1 = _write_file(src_dir, "alive.py", "alive = True\n")
        f2 = _write_file(src_dir, "dead.py", "dead = True\n")

        pipeline.index_file(f1, root=src_dir)
        pipeline.index_file(f2, root=src_dir)

        # NOTE: reaches into BinaryStore's private counter for verification.
        count_before = binary_store._count
        assert count_before >= 2

        pipeline.remove_file("dead.py")
        pipeline.compact()

        # BinaryStore should have fewer entries
        assert binary_store._count < count_before
        # deleted_chunks should be cleared
        assert meta.get_deleted_ids() == set()

    def test_compact_noop_when_no_deletions(self, workspace):
        pipeline = workspace["pipeline"]
        meta = workspace["metadata"]
        binary_store = workspace["binary_store"]
        src_dir = workspace["src_dir"]

        f = _write_file(src_dir, "solo.py", "solo = True\n")
        pipeline.index_file(f, root=src_dir)
        count_before = binary_store._count

        pipeline.compact()
        assert binary_store._count == count_before
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Backward compatibility: existing batch API still works
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class TestBatchAPIUnchanged:
    """The pre-existing batch API must keep working alongside the incremental API."""

    def test_index_files_still_works(self, workspace):
        pipeline = workspace["pipeline"]
        src_dir = workspace["src_dir"]

        f1 = _write_file(src_dir, "batch1.py", "batch1 = 1\n")
        f2 = _write_file(src_dir, "batch2.py", "batch2 = 2\n")

        stats = pipeline.index_files([f1, f2], root=src_dir)
        assert stats.files_processed == 2
        assert stats.chunks_created >= 2

    def test_index_files_works_without_metadata(self, workspace):
        """Batch API should work even without MetadataStore."""
        cfg = workspace["config"]
        # Fresh stores so this pipeline shares nothing with the fixture's.
        pipeline_no_meta = IndexingPipeline(
            embedder=FakeEmbedder(),
            binary_store=BinaryStore(workspace["store_dir"] / "no_meta", DIM, cfg),
            ann_index=ANNIndex(workspace["store_dir"] / "no_meta", DIM, cfg),
            fts=FTSEngine(str(workspace["store_dir"] / "no_meta_fts.db")),
            config=cfg,
        )
        src_dir = workspace["src_dir"]
        f = _write_file(src_dir, "no_meta_batch.py", "x = 1\n")
        stats = pipeline_no_meta.index_files([f], root=src_dir)
        assert stats.files_processed == 1
||||||
Reference in New Issue
Block a user