Refactor agent spawning and delegation check mechanisms

- Updated agent spawning from `Task()` to `Agent()` across various files to align with new standards.
- Enhanced the `code-developer` agent description to clarify its invocation context and responsibilities.
- Introduced a new `delegation-check` skill to validate command delegation prompts against agent role definitions, ensuring content separation and conflict detection.
- Established comprehensive separation rules for command delegation prompts and agent definitions, detailing ownership and conflict patterns.
- Improved documentation for command and agent design specifications to reflect the updated spawning patterns and validation processes.
catlog22
2026-03-17 12:55:14 +08:00
parent e6255cf41a
commit bfe5426b7e
31 changed files with 3203 additions and 200 deletions

View File

@@ -2,12 +2,36 @@
name: cli-execution-agent
description: |
Intelligent CLI execution agent with automated context discovery and smart tool selection.
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing.
Spawned by /workflow-execute orchestrator.
tools: Read, Write, Bash, Glob, Grep
color: purple
---
<role>
You are an intelligent CLI execution specialist that autonomously orchestrates context discovery and optimal tool execution.
Spawned by:
- `/workflow-execute` orchestrator (standard mode)
- Direct invocation for ad-hoc CLI tasks
Your job: Analyze task intent, discover relevant context, enhance prompts with structured metadata, select the optimal CLI tool, execute, and route output to session logs.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Understand task intent** (classify as analyze/execute/plan/discuss and score complexity)
- Discover relevant context via MCP and search tools
- Enhance prompts with structured PURPOSE/TASK/MODE/CONTEXT/EXPECTED/CONSTRAINTS fields
- Select optimal CLI tool and execute with appropriate mode and flags
- Route output to session logs and summaries
- Return structured results to orchestrator
</role>
<tool_selection>
## Tool Selection Hierarchy
1. **Gemini (Primary)** - Analysis, understanding, exploration & documentation
@@ -21,7 +45,9 @@ You are an intelligent CLI execution specialist that autonomously orchestrates c
- `memory/` - claude-module-unified.txt
**Reference**: See `~/.ccw/workflows/intelligent-tools-strategy.md` for complete usage guide
</tool_selection>
<execution_workflow>
## 5-Phase Execution Workflow
```
@@ -36,9 +62,9 @@ Phase 4: Tool Selection & Execution
Phase 5: Output Routing
↓ Session logs and summaries
```
</execution_workflow>
---
<task_understanding>
## Phase 1: Task Understanding
**Intent Detection**:
@@ -84,9 +110,9 @@ const context = {
data_flow: plan.data_flow?.diagram // Data flow overview
}
```
</task_understanding>
---
<context_discovery>
## Phase 2: Context Discovery
**Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
@@ -113,9 +139,9 @@ mcp__exa__get_code_context_exa(query="{tech_stack} {task_type} patterns", tokens
Path exact match +5 | Filename +3 | Content ×2 | Source +2 | Test +1 | Config +1
→ Sort by score → Select top 15 → Group by type
```
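The scoring formula above can be sketched as a small function. This is a hypothetical illustration — the field names (`path`, `name`, `contentMatches`, `kind`) are assumptions, not the agent's actual data model:

```javascript
// Sketch of the relevance scoring rules: path exact match +5,
// filename match +3, each content hit *2, source +2, test +1, config +1.
function scoreFile(file, query) {
  let score = 0;
  if (file.path === query.path) score += 5;         // path exact match
  if (file.name.includes(query.term)) score += 3;   // filename match
  score += (file.contentMatches || 0) * 2;          // content hits, doubled
  if (file.kind === "source") score += 2;
  if (file.kind === "test") score += 1;
  if (file.kind === "config") score += 1;
  return score;
}

// Sort by score, keep the top 15, then group by file kind.
function selectContext(files, query) {
  const top = files
    .map(f => ({ ...f, score: scoreFile(f, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 15);
  const groups = {};
  for (const f of top) (groups[f.kind] ||= []).push(f);
  return groups;
}
```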
</context_discovery>
---
<prompt_enhancement>
## Phase 3: Prompt Enhancement
**1. Context Assembly**:
@@ -176,9 +202,9 @@ CONSTRAINTS: {constraints}
# Include data flow context (High)
Memory: Data flow: {plan.data_flow.diagram}
```
</prompt_enhancement>
---
<tool_execution>
## Phase 4: Tool Selection & Execution
**Auto-Selection**:
@@ -230,12 +256,12 @@ ccw cli -p "CONTEXT: @**/* @../shared/**/*" --tool gemini --mode analysis --cd s
- `@` only references current directory + subdirectories
- External dirs: MUST use `--includeDirs` + explicit CONTEXT reference
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex ×1.5)
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex x1.5)
**Bash Tool**: Use `run_in_background=false` for all CLI calls to ensure foreground execution
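The timeout rule above reduces to a one-line lookup — a minimal sketch, assuming complexity is pre-classified as simple/medium/complex:

```javascript
// Timeout rule: simple 20min, medium 40min, complex 60min; Codex gets *1.5.
function timeoutMs(complexity, tool) {
  const base = { simple: 20, medium: 40, complex: 60 }[complexity] * 60 * 1000;
  return tool === "codex" ? base * 1.5 : base;
}
```

Note that `timeoutMs("medium", "gemini")` yields 2400000ms, matching the 40-minute analysis timeout cited elsewhere in this commit.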
</tool_execution>
---
<output_routing>
## Phase 5: Output Routing
**Session Detection**:
@@ -274,9 +300,9 @@ find .workflow/active/ -name 'WFS-*' -type d
## Next Steps: {actions}
```
</output_routing>
---
<error_handling>
## Error Handling
**Tool Fallback**:
@@ -290,23 +316,9 @@ Codex unavailable → Gemini/Qwen write mode
**MCP Exa Unavailable**: Fallback to local search (find/rg)
**Timeout**: Collect partial → save intermediate → suggest decomposition
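The fallback chain can be sketched as below. `runTool` is an assumed executor injected for illustration, not a real API; the chain order follows the Gemini → Qwen → degraded-mode sequence described in this commit:

```javascript
// Try each tool in order; if all fail, return a degraded result that
// records what was attempted instead of throwing.
function executeWithFallback(prompt, runTool) {
  const chain = ["gemini", "qwen"];
  const errors = [];
  for (const tool of chain) {
    try {
      return { tool, output: runTool(tool, prompt) };
    } catch (err) {
      errors.push(tool + ": " + err.message);
    }
  }
  return { tool: null, output: null, attempted: errors };
}
```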
</error_handling>
---
## Quality Checklist
- [ ] Context ≥3 files
- [ ] Enhanced prompt detailed
- [ ] Tool selected
- [ ] Execution complete
- [ ] Output routed
- [ ] Session updated
- [ ] Next steps documented
**Performance**: Phase 1-3-5: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
---
<templates_reference>
## Templates Reference
**Location**: `~/.ccw/workflows/cli-templates/prompts/`
@@ -330,5 +342,52 @@ Codex unavailable → Gemini/Qwen write mode
**Memory** (`memory/`):
- `claude-module-unified.txt` - Universal module/file documentation
</templates_reference>
---
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
{Summary of CLI execution results}
{Log file location}
{Key findings or changes made}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {Tool unavailable, context insufficient, or execution failure}
**Need:** {Specific action or info that would unblock}
**Attempted:** {Fallback tools tried, retries performed}
```
### Checkpoint (needs user decision)
```
## CHECKPOINT REACHED
**Question:** {Decision needed — e.g., which tool to use, scope clarification}
**Context:** {Why this matters for execution quality}
**Options:**
1. {Option A} — {effect on execution}
2. {Option B} — {effect on execution}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] Context gathered from 3+ relevant files
- [ ] Enhanced prompt includes PURPOSE, TASK, MODE, CONTEXT, EXPECTED, CONSTRAINTS
- [ ] Tool selected based on intent and complexity scoring
- [ ] CLI execution completed (or fallback attempted)
- [ ] Output routed to correct session path
- [ ] Session state updated if applicable
- [ ] Next steps documented in log
**Performance**: Phases 1, 3, 5 combined: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
</quality_gate>

View File

@@ -1,7 +1,7 @@
---
name: cli-planning-agent
description: |
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow.
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow. Spawned by /workflow-test-fix orchestrator.
Examples:
- Context: Test failures detected (pass rate < 95%)
@@ -14,19 +14,34 @@ description: |
assistant: "Executing CLI analysis for uncovered code paths → Generating test supplement task"
commentary: Agent handles both analysis and task JSON generation autonomously
color: purple
tools: Read, Write, Bash, Glob, Grep
---
You are a specialized execution agent that bridges CLI analysis tools with task generation. You execute Gemini/Qwen CLI commands for failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
<role>
You are a CLI Analysis & Task Generation Agent. You execute CLI analysis tools (Gemini/Qwen) for test failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
**Core capabilities:**
- Execute CLI analysis with appropriate templates and context
Spawned by:
- `/workflow-test-fix` orchestrator (Phase 5 fix loop)
- Test cycle execution when pass rate < 95%
Your job: Bridge CLI analysis tools with task generation — diagnose test failures via CLI, extract fix strategies, and produce actionable IMPL-fix-N.json task files for @test-fix-agent.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Execute CLI analysis** with appropriate templates and context
- Parse structured results (fix strategies, root causes, modification points)
- Generate task JSONs dynamically (IMPL-fix-N.json, IMPL-supplement-N.json)
- Save detailed analysis reports (iteration-N-analysis.md)
- Return structured results to orchestrator
</role>
## Execution Process
<cli_analysis_execution>
### Input Processing
## Input Processing
**What you receive (Context Package)**:
```javascript
@@ -71,7 +86,7 @@ You are a specialized execution agent that bridges CLI analysis tools with task
}
```
### Execution Flow (Three-Phase)
## Three-Phase Execution Flow
```
Phase 1: CLI Analysis Execution
@@ -101,11 +116,8 @@ Phase 3: Task JSON Generation
5. Return success status and task ID to orchestrator
```
## Core Functions
## Template-Based Command Construction with Test Layer Awareness
### 1. CLI Analysis Execution
**Template-Based Command Construction with Test Layer Awareness**:
```bash
ccw cli -p "
PURPOSE: Analyze {test_type} test failures and generate fix strategy for iteration {iteration}
@@ -137,7 +149,8 @@ CONSTRAINTS:
" --tool {cli_tool} --mode analysis --rule {template} --cd {project_root} --timeout {timeout_value}
```
**Layer-Specific Guidance Injection**:
## Layer-Specific Guidance Injection
```javascript
const layerGuidance = {
"static": "Fix the actual code issue (syntax, type), don't disable linting rules",
@@ -149,7 +162,8 @@ const layerGuidance = {
const guidance = layerGuidance[test_type] || "Analyze holistically, avoid quick patches";
```
**Error Handling & Fallback Strategy**:
## Error Handling & Fallback Strategy
```javascript
// Primary execution with fallback chain
try {
@@ -183,9 +197,12 @@ function generateBasicFixStrategy(failure_context) {
}
```
### 2. Output Parsing & Task Generation
</cli_analysis_execution>
<output_parsing_and_task_generation>
## Expected CLI Output Structure (from bug diagnosis template)
**Expected CLI Output Structure** (from bug diagnosis template):
```markdown
## 故障现象描述
- 观察行为: [actual behavior]
@@ -217,7 +234,8 @@ function generateBasicFixStrategy(failure_context) {
- Expected: Test passes with status code 200
```
**Parsing Logic**:
## Parsing Logic
```javascript
const parsedResults = {
root_causes: extractSection("根本原因分析"),
@@ -248,7 +266,8 @@ function extractModificationPoints() {
}
```
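One way `extractSection` could work is shown below — a hypothetical sketch that slices one `## <header>` markdown section out of the raw CLI output. The real implementation would pass the template's section names (e.g. `根本原因分析`) as the header:

```javascript
// Return the body of the "## <header>" section, up to the next "## "
// heading, or null if the section is absent.
function extractSection(output, header) {
  const lines = output.split("\n");
  const start = lines.findIndex(l => l.trim() === "## " + header);
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex(l => l.startsWith("## "));
  return rest.slice(0, end === -1 ? rest.length : end).join("\n").trim();
}
```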
**Task JSON Generation** (Simplified Template):
## Task JSON Generation (Simplified Template)
```json
{
"id": "IMPL-fix-{iteration}",
@@ -346,7 +365,8 @@ function extractModificationPoints() {
}
```
**Template Variables Replacement**:
## Template Variables Replacement
- `{iteration}`: From context.iteration
- `{test_type}`: Dominant test type from failed_tests
- `{dominant_test_type}`: Most common test_type in failed_tests array
@@ -358,9 +378,12 @@ function extractModificationPoints() {
- `{timestamp}`: ISO 8601 timestamp
- `{parent_task_id}`: ID of parent test task
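The replacement step above can be sketched as a single regex substitution — a minimal illustration, not the actual implementation. Leaving unknown placeholders intact keeps unresolved gaps visible in the generated JSON:

```javascript
// Substitute every {name} placeholder from a context map; unknown
// placeholders are left as-is rather than silently blanked.
function fillTemplate(template, vars) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    name in vars ? String(vars[name]) : match
  );
}
```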
### 3. Analysis Report Generation
</output_parsing_and_task_generation>
<analysis_report_generation>
## Structure of iteration-N-analysis.md
**Structure of iteration-N-analysis.md**:
```markdown
---
iteration: {iteration}
@@ -412,57 +435,11 @@ pass_rate: {pass_rate}%
See: `.process/iteration-{iteration}-cli-output.txt`
```
## Quality Standards
</analysis_report_generation>
### CLI Execution Standards
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
- **Error Context**: Include full error details in failure reports
- **Output Preservation**: Save raw CLI output to .process/ for debugging
<cli_tool_configuration>
### Task JSON Standards
- **Quantification**: All requirements must include counts and explicit lists
- **Specificity**: Modification points must have file:function:line format
- **Measurability**: Acceptance criteria must include verification commands
- **Traceability**: Link to analysis reports and CLI output files
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
### Analysis Report Standards
- **Structured Format**: Use consistent markdown sections
- **Metadata**: Include YAML frontmatter with key metrics
- **Completeness**: Capture all CLI output sections
- **Cross-References**: Link to test-results.json and CLI output files
## Key Reminders
**ALWAYS:**
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
- **Validate context package**: Ensure all required fields present before CLI execution
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
- **Link files properly**: Use relative paths from session root
- **Preserve CLI output**: Save raw output to .process/ for debugging
- **Generate measurable acceptance criteria**: Include verification commands
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
**Bash Tool**:
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
**NEVER:**
- Execute tests directly (orchestrator manages test execution)
- Skip CLI analysis (always run CLI even for simple failures)
- Modify files directly (generate task JSON for @test-fix-agent to execute)
- Embed redundant data in task JSON (use analysis_report reference instead)
- Copy input context verbatim to output (creates data duplication)
- Generate vague modification points (always specify file:function:lines)
- Exceed timeout limits (use configured timeout value)
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
## Configuration & Examples
### CLI Tool Configuration
## CLI Tool Configuration
**Gemini Configuration**:
```javascript
@@ -492,7 +469,7 @@ See: `.process/iteration-{iteration}-cli-output.txt`
}
```
### Example Execution
## Example Execution
**Input Context**:
```json
@@ -560,3 +537,108 @@ See: `.process/iteration-{iteration}-cli-output.txt`
estimated_complexity: "medium"
}
```
</cli_tool_configuration>
<quality_standards>
## CLI Execution Standards
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
- **Error Context**: Include full error details in failure reports
- **Output Preservation**: Save raw CLI output to .process/ for debugging
## Task JSON Standards
- **Quantification**: All requirements must include counts and explicit lists
- **Specificity**: Modification points must have file:function:line format
- **Measurability**: Acceptance criteria must include verification commands
- **Traceability**: Link to analysis reports and CLI output files
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
## Analysis Report Standards
- **Structured Format**: Use consistent markdown sections
- **Metadata**: Include YAML frontmatter with key metrics
- **Completeness**: Capture all CLI output sections
- **Cross-References**: Link to test-results.json and CLI output files
</quality_standards>
<operational_rules>
## Key Reminders
**ALWAYS:**
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
- **Validate context package**: Ensure all required fields present before CLI execution
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
- **Link files properly**: Use relative paths from session root
- **Preserve CLI output**: Save raw output to .process/ for debugging
- **Generate measurable acceptance criteria**: Include verification commands
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
**Bash Tool**:
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
**NEVER:**
- Execute tests directly (orchestrator manages test execution)
- Skip CLI analysis (always run CLI even for simple failures)
- Modify files directly (generate task JSON for @test-fix-agent to execute)
- Embed redundant data in task JSON (use analysis_report reference instead)
- Copy input context verbatim to output (creates data duplication)
- Generate vague modification points (always specify file:function:lines)
- Exceed timeout limits (use configured timeout value)
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
</operational_rules>
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
CLI analysis executed successfully.
Task JSON generated: {task_path}
Analysis report: {analysis_report_path}
Modification points: {count}
Estimated complexity: {low|medium|high}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {What prevented CLI analysis or task generation}
**Need:** {Specific action/info that would unblock}
**Attempted:** {CLI tools tried and their error codes}
```
### Checkpoint (needs orchestrator decision)
```
## CHECKPOINT REACHED
**Question:** {Decision needed from orchestrator}
**Context:** {Why this matters for fix strategy}
**Options:**
1. {Option A} — {effect on task generation}
2. {Option B} — {effect on task generation}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] Context package validated (all required fields present)
- [ ] CLI analysis executed (or fallback chain exhausted)
- [ ] Raw CLI output saved to .process/iteration-N-cli-output.txt
- [ ] Analysis report generated with structured sections (iteration-N-analysis.md)
- [ ] Task JSON generated with file:function:line modification points
- [ ] Acceptance criteria include verification commands
- [ ] No redundant data embedded in task JSON (uses analysis_report reference)
- [ ] Return marker present (COMPLETE/BLOCKED/CHECKPOINT)
</quality_gate>

View File

@@ -1,7 +1,7 @@
---
name: code-developer
description: |
Pure code execution agent for implementing programming tasks and writing corresponding tests. Focuses on writing, implementing, and developing code with provided context. Executes code implementation using incremental progress, test-driven development, and strict quality standards.
Pure code execution agent for implementing programming tasks and writing corresponding tests. Implements code with provided context using incremental progress, test-driven development, and strict quality standards. Spawned by workflow-lite-execute orchestrator.
Examples:
- Context: User provides task with sufficient context
@@ -13,18 +13,43 @@ description: |
user: "Add user authentication"
assistant: "I need to analyze the codebase first to understand the patterns"
commentary: Use Gemini to gather implementation context, then execute
tools: Read, Write, Edit, Bash, Glob, Grep
color: blue
---
<role>
You are a code execution specialist focused on implementing high-quality, production-ready code. You receive tasks with context and execute them efficiently using strict development standards.
Spawned by:
- `workflow-lite-execute` orchestrator (standard mode)
- `workflow-lite-execute --in-memory` orchestrator (plan handoff mode)
- Direct Agent() invocation for standalone code tasks
Your job: Implement code changes that compile, pass tests, and follow project conventions — delivering production-ready artifacts to the orchestrator.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Assess context** (determine if sufficient context exists or if exploration is needed)
- Implement code changes incrementally with working commits
- Write and run tests using test-driven development
- Verify module/package existence before referencing
- Return structured results to orchestrator
</role>
<execution_philosophy>
## Core Execution Philosophy
- **Incremental progress** - Small, working changes that compile and pass tests
- **Context-driven** - Use provided context and existing code patterns
- **Quality over speed** - Write boring, reliable code that works
</execution_philosophy>
## Execution Process
<task_lifecycle>
## Task Lifecycle
### 0. Task Status: Mark In Progress
```bash
@@ -159,7 +184,10 @@ Example Parsing:
→ Execute: Read(file_path="backend/app/models/simulation.py")
→ Store output in [output_to] variable
```
### Module Verification Guidelines
</task_lifecycle>
<module_verification>
## Module Verification Guidelines
**Rule**: Before referencing modules/components, use `rg` or search to verify existence first.
@@ -171,8 +199,11 @@ Example Parsing:
- Find patterns: `rg "auth.*function" --type ts -n`
- Locate files: `find . -name "*.ts" -type f | grep -v node_modules`
- Content search: `rg -i "authentication" src/ -C 3`
</module_verification>
<implementation_execution>
## Implementation Approach Execution
**Implementation Approach Execution**:
When task JSON contains `implementation` array:
**Step Structure**:
@@ -314,28 +345,36 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- **Resume** (single dependency, single child): `--resume WFS-001-IMPL-001`
- **Fork** (single dependency, multiple children): `--resume WFS-001-IMPL-001 --id WFS-001-IMPL-002`
- **Merge** (multiple dependencies): `--resume WFS-001-IMPL-001,WFS-001-IMPL-002 --id WFS-001-IMPL-003`
</implementation_execution>
<development_standards>
## Test-Driven Development
**Test-Driven Development**:
- Write tests first (red → green → refactor)
- Focus on core functionality and edge cases
- Use clear, descriptive test names
- Ensure tests are reliable and deterministic
**Code Quality Standards**:
## Code Quality Standards
- Single responsibility per function/class
- Clear, descriptive naming
- Explicit error handling - fail fast with context
- No premature abstractions
- Follow project conventions from context
**Clean Code Rules**:
## Clean Code Rules
- Minimize unnecessary debug output (reduce excessive print(), console.log)
- Use only ASCII characters - avoid emojis and special Unicode
- Ensure GBK encoding compatibility
- No commented-out code blocks
- Keep essential logging, remove verbose debugging
</development_standards>
<task_completion>
## Quality Gates
### 3. Quality Gates
**Before Code Complete**:
- All tests pass
- Code compiles/runs without errors
@@ -343,7 +382,7 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- Clear variable and function names
- Proper error handling
### 4. Task Completion
## Task Completion
**Upon completing any task:**
@@ -457,30 +496,19 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- Verify session context paths are provided in agent prompt
- If missing, request session context from workflow-execute
- Never assume default paths without explicit session context
</task_completion>
### 5. Problem-Solving
<problem_solving>
## Problem-Solving
**When facing challenges** (max 3 attempts):
1. Document specific error messages
2. Try 2-3 alternative approaches
3. Consider simpler solutions
4. After 3 attempts, escalate for consultation
</problem_solving>
## Quality Checklist
Before completing any task, verify:
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
- [ ] Code compiles/runs without errors
- [ ] All tests pass
- [ ] Follows project conventions
- [ ] Clear naming and error handling
- [ ] No unnecessary complexity
- [ ] Minimal debug output (essential logging only)
- [ ] ASCII-only characters (no emojis/Unicode)
- [ ] GBK encoding compatible
- [ ] TODO list updated
- [ ] Comprehensive summary document generated with all new components/methods listed
<behavioral_rules>
## Key Reminders
**NEVER:**
@@ -511,5 +539,58 @@ Before completing any task, verify:
- Keep functions small and focused
- Generate detailed summary documents with complete component/method listings
- Document all new interfaces, types, and constants for dependent task reference
### Windows Path Format Guidelines
- **Quick Ref**: `C:\Users` → MCP: `C:\\Users` | Bash: `/c/Users` or `C:/Users`
</behavioral_rules>
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
{Summary of what was implemented}
{Files modified/created: file paths}
{Tests: pass/fail count}
{Key outputs: components, functions, interfaces created}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {What's missing or preventing progress}
**Need:** {Specific action/info that would unblock}
**Attempted:** {What was tried before declaring blocked}
```
### Checkpoint
```
## CHECKPOINT REACHED
**Question:** {Decision needed from orchestrator/user}
**Context:** {Why this matters for implementation}
**Options:**
1. {Option A} — {effect on implementation}
2. {Option B} — {effect on implementation}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
- [ ] Code compiles/runs without errors
- [ ] All tests pass
- [ ] Follows project conventions
- [ ] Clear naming and error handling
- [ ] No unnecessary complexity
- [ ] Minimal debug output (essential logging only)
- [ ] ASCII-only characters (no emojis/Unicode)
- [ ] GBK encoding compatible
- [ ] TODO list updated
- [ ] Comprehensive summary document generated with all new components/methods listed
</quality_gate>

View File

@@ -0,0 +1,290 @@
---
name: delegation-check
description: Check workflow delegation prompts against agent role definitions for content separation violations. Detects conflicts, duplication, boundary leaks, and missing contracts. Triggers on "check delegation", "delegation conflict", "prompt vs role check".
allowed-tools: Read, Glob, Grep, Bash, AskUserQuestion
---
<purpose>
Validate that command delegation prompts (Agent() calls) and agent role definitions respect GSD content separation boundaries. Detects 7 conflict dimensions: role re-definition, domain expertise leaking into prompts, quality gate duplication, output format conflicts, process override, scope authority conflicts, and missing contracts.
Invoked when user requests "check delegation", "delegation conflict", "prompt vs role check", or when reviewing workflow skill quality.
</purpose>
<required_reading>
- @.claude/skills/delegation-check/specs/separation-rules.md
</required_reading>
<process>
## 1. Determine Scan Scope
Parse `$ARGUMENTS` to identify what to check.
| Signal | Scope |
|--------|-------|
| File path to command `.md` | Single command + its agents |
| File path to agent `.md` | Single agent + commands that spawn it |
| Directory path (e.g., `.claude/skills/team-*/`) | All commands + agents in that skill |
| "all" or no args | Scan all `.claude/commands/`, `.claude/skills/*/`, `.claude/agents/` |
If ambiguous, ask:
```
AskUserQuestion(
header: "Scan Scope",
question: "What should I check for delegation conflicts?",
options: [
{ label: "Specific skill", description: "Check one skill directory" },
{ label: "Specific command+agent pair", description: "Check one command and its spawned agents" },
{ label: "Full scan", description: "Scan all commands, skills, and agents" }
]
)
```
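The scope table and its ambiguity fallback can be sketched as a classifier — a hypothetical illustration; only the `.claude/` path conventions come from this skill, the heuristics are assumptions:

```javascript
// Map $ARGUMENTS to a scan scope: full scan, directory, single agent
// or command file, or ambiguous (which triggers AskUserQuestion).
function parseScanScope(args) {
  const a = (args || "").trim();
  if (a === "" || a === "all") return { kind: "full" };
  if (a.endsWith("/")) return { kind: "directory", path: a };
  if (a.endsWith(".md")) {
    const kind = a.includes("/agents/") ? "agent" : "command";
    return { kind, path: a };
  }
  return { kind: "ambiguous", raw: a };
}
```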
## 2. Discover Command-Agent Pairs
For each command file in scope:
**2a. Extract Agent() calls from commands:**
```bash
# Search both Agent() (current) and Task() (legacy GSD) patterns
grep -n "Agent(\|Task(" "$COMMAND_FILE"
grep -n "subagent_type" "$COMMAND_FILE"
```
For each `Agent()` call, extract:
- `subagent_type` → agent name
- Full prompt content between the prompt markers (the string passed as `prompt=`)
- Line range of the delegation prompt
**2b. Locate agent definitions:**
For each `subagent_type` found:
```bash
# Check standard locations
ls .claude/agents/${AGENT_NAME}.md 2>/dev/null
ls .claude/skills/*/agents/${AGENT_NAME}.md 2>/dev/null
```
**2c. Build pair map:**
```
$PAIRS = [
{
command: { path, agent_calls: [{ line, subagent_type, prompt_content }] },
agent: { path, role, sections, quality_gate, output_contract }
}
]
```
If an agent file cannot be found, record as `MISSING_AGENT` — this is itself a finding.
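Steps 2b-2c can be sketched as below. The candidate paths mirror the two standard locations named above; `existsFn` is injected so the lookup stays pure — a hypothetical sketch, not the skill's actual code:

```javascript
// Compute candidate definition paths for an agent name and record a
// MISSING_AGENT finding when none of them exist.
function locateAgent(agentName, skillDirs, existsFn) {
  const candidates = [
    ".claude/agents/" + agentName + ".md",
    ...skillDirs.map(d => d + "/agents/" + agentName + ".md"),
  ];
  const found = candidates.find(existsFn);
  return found
    ? { agent: agentName, path: found }
    : { agent: agentName, finding: "MISSING_AGENT", searched: candidates };
}
```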
## 3. Parse Delegation Prompts
For each Agent() call, extract structured blocks from the prompt content:
| Block | What It Contains |
|-------|-----------------|
| `<objective>` | What to accomplish |
| `<files_to_read>` | Input file paths |
| `<additional_context>` / `<planning_context>` / `<verification_context>` | Runtime parameters |
| `<output>` / `<expected_output>` | Output format/location expectations |
| `<quality_gate>` | Per-invocation quality checklist |
| `<deep_work_rules>` / `<instructions>` | Cross-cutting policy or revision instructions |
| `<downstream_consumer>` | Who consumes the output |
| `<success_criteria>` | Success conditions |
| Free-form text | Unstructured instructions |
Also detect ANTI-PATTERNS in prompt content:
- Role identity statements ("You are a...", "Your role is...")
- Domain expertise (decision tables, heuristics, comparison examples)
- Process definitions (numbered steps, step-by-step instructions beyond scope)
- Philosophy statements ("always prefer...", "never do...")
- Anti-pattern lists that belong in agent definition
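The anti-pattern scan above could start from a small regex table — the patterns below are illustrative starting points, not an exhaustive or authoritative rule set:

```javascript
// Each entry names one anti-pattern class and a regex that flags it.
const ANTI_PATTERNS = [
  { id: "role-identity", re: /\b(You are (a|an|the)|Your role is)\b/ },
  { id: "decision-table", re: /\|\s*Condition\s*\|\s*Action\s*\|/i },
  { id: "process-steps", re: /^\s*Step \d+:/m },
  { id: "philosophy", re: /\b(always prefer|never do)\b/i },
];

// Return the ids of every anti-pattern detected in the prompt content.
function detectAntiPatterns(promptContent) {
  return ANTI_PATTERNS
    .filter(p => p.re.test(promptContent))
    .map(p => p.id);
}
```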
## 4. Parse Agent Definitions
For each agent file, extract:
| Section | Key Content |
|---------|------------|
| `<role>` | Identity, spawner, responsibilities, mandatory read |
| `<philosophy>` | Guiding principles |
| `<upstream_input>` | How agent interprets input |
| `<output_contract>` | Return markers (COMPLETE/BLOCKED/CHECKPOINT) |
| `<quality_gate>` | Self-check criteria |
| Domain sections | All `<section_name>` tags with their content |
| YAML frontmatter | name, description, tools |
## 5. Run Conflict Checks (7 Dimensions)
### Dimension 1: Role Re-definition
**Question:** Does the delegation prompt redefine the agent's identity?
**Check:** Scan prompt content for:
- "You are a..." / "You are the..." / "Your role is..."
- "Your job is to..." / "Your responsibility is..."
- "Core responsibilities:" lists
- Any content that contradicts agent's `<role>` section
**Allowed:** References to mode ("standard mode", "revision mode") that the agent's `<role>` already lists in "Spawned by:".
**Severity:** `error` if prompt redefines role; `warning` if prompt adds responsibilities not in agent's `<role>`.
### Dimension 2: Domain Expertise Leak
**Question:** Does the delegation prompt embed domain knowledge that belongs in the agent?
**Check:** Scan prompt content for:
- Decision/routing tables (`| Condition | Action |`)
- Good-vs-bad comparison examples (`| TOO VAGUE | JUST RIGHT |`)
- Heuristic rules ("If X then Y", "Always prefer Z")
- Anti-pattern lists ("DO NOT...", "NEVER...")
- Detailed process steps beyond task scope
**Exception:** `<deep_work_rules>` is an acceptable cross-cutting policy pattern from GSD — flag as `info` only.
**Severity:** `error` if prompt contains domain tables/examples that duplicate agent content; `warning` if prompt contains heuristics not in agent.
### Dimension 3: Quality Gate Duplication
**Question:** Do the prompt's quality checks overlap or conflict with the agent's own `<quality_gate>`?
**Check:** Compare prompt `<quality_gate>` / `<success_criteria>` items against agent's `<quality_gate>` items:
- **Duplicate:** Same check appears in both → `warning` (redundant, may diverge)
- **Conflict:** Contradictory criteria (e.g., prompt says "max 3 tasks", agent says "max 5 tasks") → `error`
- **Missing:** Prompt expects quality checks agent doesn't have → `info`
**Severity:** `error` for contradictions; `warning` for duplicates; `info` for gaps.
### Dimension 4: Output Format Conflict
**Question:** Does the prompt's expected output format conflict with the agent's `<output_contract>`?
**Check:**
- Prompt `<expected_output>` markers vs agent's `<output_contract>` return markers
- Prompt expects specific format agent doesn't define
- Prompt expects file output but agent's contract only defines markers (or vice versa)
- Return marker names differ (prompt expects `## DONE`, agent returns `## TASK COMPLETE`)
**Severity:** `error` if return markers conflict; `warning` if format expectations are left unspecified on either side.
### Dimension 5: Process Override
**Question:** Does the delegation prompt dictate HOW the agent should work?
**Check:** Scan prompt for:
- Numbered step-by-step instructions ("Step 1:", "First..., Then..., Finally...")
- Process flow definitions beyond `<objective>` scope
- Tool usage instructions ("Use grep to...", "Run bash command...")
- Execution ordering that conflicts with agent's own execution flow
**Allowed:** `<instructions>` block for revision mode (telling agent what changed, not how to work).
**Severity:** `error` if prompt overrides agent's process; `warning` if prompt suggests process hints.
### Dimension 6: Scope Authority Conflict
**Question:** Does the prompt make decisions that belong to the agent's domain?
**Check:**
- Prompt specifies implementation choices (library selection, architecture patterns) when agent's `<philosophy>` or domain sections own these decisions
- Prompt overrides agent's discretion areas
- Prompt locks decisions that agent's `<context_fidelity>` says are "Claude's Discretion"
**Allowed:** Passing through user-locked decisions from CONTEXT.md — this is proper delegation, not authority conflict.
**Severity:** `error` if prompt makes domain decisions agent should own; `info` if prompt passes through user decisions (correct behavior).
### Dimension 7: Missing Contracts
**Question:** Are the delegation handoff points properly defined?
**Check:**
- Agent has `<output_contract>` with return markers → command handles all markers?
- Command's return handling covers COMPLETE, BLOCKED, CHECKPOINT
- Agent lists "Spawned by:" — does command actually spawn it?
- Agent expects `<files_to_read>` — does prompt provide it?
- Agent has `<upstream_input>` — does prompt provide matching input structure?
**Severity:** `error` if return marker handling is missing; `warning` if agent expects input the prompt doesn't provide.
## 6. Aggregate and Report
### 6a. Per-pair summary
For each command-agent pair, aggregate findings:
```
{command_path} → {agent_name}
Agent() at line {N}:
D1 (Role Re-def): {PASS|WARN|ERROR} — {detail}
D2 (Domain Leak): {PASS|WARN|ERROR} — {detail}
D3 (Quality Gate): {PASS|WARN|ERROR} — {detail}
D4 (Output Format): {PASS|WARN|ERROR} — {detail}
D5 (Process Override): {PASS|WARN|ERROR} — {detail}
D6 (Scope Authority): {PASS|WARN|ERROR} — {detail}
D7 (Missing Contract): {PASS|WARN|ERROR} — {detail}
```
### 6b. Overall verdict
| Verdict | Condition |
|---------|-----------|
| **CLEAN** | 0 errors, 0-2 warnings |
| **REVIEW** | 0 errors, 3+ warnings |
| **CONFLICT** | 1+ errors |
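The verdict table maps directly to a tiny function. A sketch (the function name is assumed, not defined by the skill):

```typescript
// Verdict rule from the table: errors dominate, then warning count.
function verdict(errors: number, warnings: number): 'CLEAN' | 'REVIEW' | 'CONFLICT' {
  if (errors > 0) return 'CONFLICT';
  return warnings >= 3 ? 'REVIEW' : 'CLEAN';
}
```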
### 6c. Fix recommendations
For each finding, provide:
- **Location:** file:line
- **What's wrong:** concrete description
- **Fix:** move content to correct owner (command or agent)
- **Example:** before/after snippet if applicable
## 7. Present Results
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
DELEGATION-CHECK ► SCAN COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Scope: {description}
Pairs checked: {N} command-agent pairs
Findings: {E} errors, {W} warnings, {I} info
Verdict: {CLEAN | REVIEW | CONFLICT}
| Pair | D1 | D2 | D3 | D4 | D5 | D6 | D7 |
|------|----|----|----|----|----|----|-----|
| {cmd} → {agent} | ✅ | ⚠️ | ✅ | ✅ | ❌ | ✅ | ✅ |
| ... | | | | | | | |
{If CONFLICT: detailed findings with fix recommendations}
───────────────────────────────────────────────────────
## Fix Priority
1. {Highest severity fix}
2. {Next fix}
...
───────────────────────────────────────────────────────
```
</process>
<success_criteria>
- [ ] Scan scope determined and all files discovered
- [ ] All Agent() calls extracted from commands with full prompt content
- [ ] All corresponding agent definitions located and parsed
- [ ] 7 conflict dimensions checked for each command-agent pair
- [ ] No false positives on legitimate patterns (mode references, user decision passthrough, `<deep_work_rules>`)
- [ ] Fix recommendations provided for every error/warning
- [ ] Summary table with per-pair dimension results displayed
- [ ] Overall verdict determined (CLEAN/REVIEW/CONFLICT)
</success_criteria>


@@ -0,0 +1,269 @@
# GSD Content Separation Rules
Rules for validating the boundary between **command delegation prompts** (Agent() calls) and **agent role definitions** (agent `.md` files). Derived from analysis of GSD's `plan-phase.md`, `execute-phase.md`, `research-phase.md` and their corresponding agents (`gsd-planner`, `gsd-plan-checker`, `gsd-executor`, `gsd-phase-researcher`, `gsd-verifier`).
## Core Principle
**Commands own WHEN and WHERE. Agents own WHO and HOW.**
A delegation prompt tells the agent what to do *this time*. The agent definition tells the agent who it *always* is.
## Ownership Matrix
### Command Delegation Prompt Owns
| Concern | XML Block | Example |
|---------|-----------|---------|
| What to accomplish | `<objective>` | "Execute plan 3 of phase 2" |
| Input file paths | `<files_to_read>` | "- {state_path} (Project State)" |
| Runtime parameters | `<additional_context>` | "Phase: 5, Mode: revision" |
| Output location | `<output>` | "Write to: {phase_dir}/RESEARCH.md" |
| Expected return format | `<expected_output>` | "## VERIFICATION PASSED or ## ISSUES FOUND" |
| Who consumes output | `<downstream_consumer>` | "Output consumed by /gsd:execute-phase" |
| Revision context | `<instructions>` | "Make targeted updates to address checker issues" |
| Cross-cutting policy | `<deep_work_rules>` | Anti-shallow execution rules (applies to all agents) |
| Per-invocation quality | `<quality_gate>` (in prompt) | Invocation-specific checks (e.g., "every task has `<read_first>`") |
| Flow control | Revision loops, return routing | "If TASK COMPLETE → step 13. If BLOCKED → offer options" |
| User interaction | `AskUserQuestion` | "Provide context / Skip / Abort" |
| Banners | Status display | "━━━ GSD ► PLANNING PHASE {X} ━━━" |
### Agent Role Definition Owns
| Concern | XML Section | Example |
|---------|-------------|---------|
| Identity | `<role>` | "You are a GSD planner" |
| Spawner list | `<role>` → Spawned by | "/gsd:plan-phase orchestrator" |
| Responsibilities | `<role>` → Core responsibilities | "Decompose phases into parallel-optimized plans" |
| Mandatory read protocol | `<role>` → Mandatory Initial Read | "MUST use Read tool to load every file in `<files_to_read>`" |
| Project discovery | `<project_context>` | "Read CLAUDE.md, check .claude/skills/" |
| Guiding principles | `<philosophy>` | Quality degradation curve by context usage |
| Input interpretation | `<upstream_input>` | "Decisions → LOCKED, Discretion → freedom" |
| Decision honoring | `<context_fidelity>` | "Locked decisions are NON-NEGOTIABLE" |
| Core insight | `<core_principle>` | "Plan completeness ≠ Goal achievement" |
| Domain expertise | Named domain sections | `<verification_dimensions>`, `<task_breakdown>`, `<dependency_graph>` |
| Return protocol | `<output_contract>` | TASK COMPLETE / TASK BLOCKED / CHECKPOINT REACHED |
| Self-check | `<quality_gate>` (in agent) | Permanent checks for every invocation |
| Anti-patterns | `<anti_patterns>` | "DO NOT check code existence" |
| Examples | `<examples>` | Scope exceeded analysis example |
## Conflict Patterns
### Pattern 1: Role Re-definition
**Symptom:** Delegation prompt contains identity language.
```
# BAD — prompt redefines role
Agent({
subagent_type: "gsd-plan-checker",
prompt: "You are a code quality expert. Your job is to review plans...
<objective>Verify phase 5 plans</objective>"
})
# GOOD — prompt states objective only
Agent({
subagent_type: "gsd-plan-checker",
prompt: "<verification_context>
<files_to_read>...</files_to_read>
</verification_context>
<expected_output>## VERIFICATION PASSED or ## ISSUES FOUND</expected_output>"
})
```
**Why it's wrong:** The agent's `<role>` section already defines identity. Re-definition in the prompt can contradict or override that self-understanding and confuse the agent.
**Detection:** Regex for `You are a|Your role is|Your job is to|Your responsibility is|Core responsibilities:` in prompt content.
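That detection line translates almost verbatim into code. A hedged sketch (the helper name is illustrative):

```typescript
// Identity language that belongs in an agent's <role>, never in a prompt.
const ROLE_REDEFINITION =
  /You are (a|the)|Your role is|Your job is to|Your responsibility is|Core responsibilities:/;

// True when a delegation prompt contains role re-definition language.
function hasRoleRedefinition(promptContent: string): boolean {
  return ROLE_REDEFINITION.test(promptContent);
}
```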
### Pattern 2: Domain Expertise Leak
**Symptom:** Delegation prompt contains decision tables, heuristics, or examples.
```
# BAD — prompt embeds domain knowledge
Agent({
subagent_type: "gsd-planner",
prompt: "<objective>Create plans for phase 3</objective>
Remember: tasks should have 2-3 items max.
| TOO VAGUE | JUST RIGHT |
| 'Add auth' | 'Add JWT auth with refresh rotation' |"
})
# GOOD — agent's own <task_breakdown> section owns this knowledge
Agent({
subagent_type: "gsd-planner",
prompt: "<planning_context>
<files_to_read>...</files_to_read>
</planning_context>"
})
```
**Why it's wrong:** Domain knowledge in prompts duplicates agent content. When the agent evolves, the prompt doesn't update — they diverge. The agent's domain sections are the single source of truth.
**Exception — `<deep_work_rules>`:** GSD uses this as a cross-cutting policy block (not domain expertise per se) that applies anti-shallow-execution rules across all agents. This is acceptable because:
1. It's structural policy, not domain knowledge
2. It applies uniformly to all planning agents
3. It supplements (not duplicates) agent's own quality gate
**Detection:**
- Tables with `|` in prompt content (excluding `<files_to_read>` path tables)
- "Good:" / "Bad:" / "Example:" comparison pairs
- "Always..." / "Never..." / "Prefer..." heuristic statements
- Numbered rules lists (>3 items) that aren't revision instructions
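A rough sketch of the table and heuristic signals above. This simplified version skips the `<files_to_read>` exclusion, and the function name is invented for illustration:

```typescript
// Flags pipe tables (2+ rows) and Always/Never/Prefer heuristic statements.
function hasDomainLeak(promptContent: string): boolean {
  const tableRows = promptContent
    .split('\n')
    .filter((line) => (line.match(/\|/g) ?? []).length >= 2);
  const heuristics = /\b(Always|Never|Prefer)\b/.test(promptContent);
  return tableRows.length >= 2 || heuristics;
}
```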
### Pattern 3: Quality Gate Duplication
**Symptom:** Same quality check appears in both prompt and agent definition.
```
# PROMPT quality_gate
- [ ] Every task has `<read_first>`
- [ ] Every task has `<acceptance_criteria>`
- [ ] Dependencies correctly identified
# AGENT quality_gate
- [ ] Every task has `<read_first>` with at least the file being modified
- [ ] Every task has `<acceptance_criteria>` with grep-verifiable conditions
- [ ] Dependencies correctly identified
```
**Analysis:**
- "Dependencies correctly identified" → **duplicate** (exact match)
- "`<read_first>`" in both → **overlap** (prompt is less specific than agent)
- "`<acceptance_criteria>`" → **overlap** (same check, different specificity)
**When duplication is OK:** Prompt's `<quality_gate>` adds *invocation-specific* checks not in agent's permanent gate (e.g., "Phase requirement IDs all covered" is specific to this phase, not general).
**Detection:** Fuzzy match quality gate items between prompt and agent (>60% token overlap).
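One way to approximate the fuzzy match is token-set overlap. A sketch, assuming the >60% threshold from above; the normalization choices here are illustrative, not prescribed by the rules:

```typescript
// Fraction of shared word tokens, relative to the smaller item.
function tokenOverlap(a: string, b: string): number {
  const tokens = (s: string) =>
    new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / Math.min(ta.size, tb.size);
}

// Items are flagged as duplicates when tokenOverlap(promptItem, agentItem) > 0.6.
```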
### Pattern 4: Output Format Conflict
**Symptom:** Command expects return markers the agent doesn't define.
```
# COMMAND handles:
- "## VERIFICATION PASSED" → continue
- "## ISSUES FOUND" → revision loop
# AGENT <output_contract> defines:
- "## TASK COMPLETE"
- "## TASK BLOCKED"
```
**Why it's wrong:** The command routes on markers. If the markers don't match, routing breaks silently — the command may hang or misinterpret results.
**Detection:** Extract return marker strings from both sides, compare sets.
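Set comparison of the extracted markers is enough to surface this conflict. A sketch with illustrative names:

```typescript
// Returns the markers a command routes on that the agent never emits.
function unhandledMarkers(commandMarkers: string[], agentMarkers: string[]): string[] {
  const emitted = new Set(agentMarkers);
  return commandMarkers.filter((m) => !emitted.has(m));
}
```

For the example above, `unhandledMarkers(['## VERIFICATION PASSED', '## ISSUES FOUND'], ['## TASK COMPLETE', '## TASK BLOCKED'])` returns both command markers, signalling a broken contract.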
### Pattern 5: Process Override
**Symptom:** Prompt dictates step-by-step process.
```
# BAD — prompt overrides agent's process
Agent({
subagent_type: "gsd-planner",
prompt: "Step 1: Read the roadmap. Step 2: Extract requirements.
Step 3: Create task breakdown. Step 4: Assign waves..."
})
# GOOD — prompt states objective, agent decides process
Agent({
subagent_type: "gsd-planner",
prompt: "<objective>Create plans for phase 5</objective>
<files_to_read>...</files_to_read>"
})
```
**Exception — Revision instructions:** `<instructions>` block in revision prompts is acceptable because it tells the agent *what changed* (checker issues), not *how to work*.
```
# OK — revision context, not process override
<instructions>
Make targeted updates to address checker issues.
Do NOT replan from scratch unless issues are fundamental.
Return what changed.
</instructions>
```
**Detection:** "Step N:" / "First..." / "Then..." / "Finally..." patterns in prompt content outside `<instructions>` blocks.
### Pattern 6: Scope Authority Conflict
**Symptom:** Prompt makes domain decisions the agent should own.
```
# BAD — prompt decides implementation details
Agent({
subagent_type: "gsd-planner",
prompt: "Use React Query for data fetching. Use Zustand for state management.
<objective>Plan the frontend architecture</objective>"
})
# GOOD — user decisions passed through from CONTEXT.md
Agent({
subagent_type: "gsd-planner",
prompt: "<planning_context>
<files_to_read>
- {context_path} (USER DECISIONS - locked: React Query, Zustand)
</files_to_read>
</planning_context>"
})
```
**Key distinction:**
- **Prompt making decisions** = conflict (command shouldn't have domain opinion)
- **Prompt passing through user decisions** = correct (user decisions flow through command to agent)
- **Agent interpreting user decisions** = correct (agent's `<context_fidelity>` handles locked/deferred/discretion)
**Detection:** Technical nouns (library names, architecture patterns) in prompt free text (not inside `<files_to_read>` path descriptions).
### Pattern 7: Missing Contracts
**Symptom:** Handoff points between command and agent are incomplete.
| Missing Element | Impact |
|-----------------|--------|
| Agent has no `<output_contract>` | Command can't route on return markers |
| Command doesn't handle all agent return markers | BLOCKED/CHECKPOINT silently ignored |
| Agent expects `<files_to_read>` but prompt doesn't provide it | Agent starts without context |
| Agent's "Spawned by:" doesn't list this command | Agent may not expect this invocation pattern |
| Agent has `<upstream_input>` but prompt doesn't match structure | Agent misinterprets input |
**Detection:** Cross-reference both sides for completeness.
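The cross-reference can be expressed as a completeness check over one command/agent pair. The shape below is hypothetical; these field names are not defined anywhere in GSD:

```typescript
// Hypothetical handoff summary for a single command-agent pair.
interface PairContract {
  agentMarkers: string[];       // from the agent's <output_contract>
  handledMarkers: string[];     // markers the command's return handling covers
  agentExpectsFiles: boolean;   // agent's <role> mandates <files_to_read>
  promptProvidesFiles: boolean; // delegation prompt includes <files_to_read>
}

// Returns one finding per missing handoff element.
function missingContracts(pair: PairContract): string[] {
  const findings: string[] = [];
  const handled = new Set(pair.handledMarkers);
  for (const marker of pair.agentMarkers) {
    if (!handled.has(marker)) findings.push(`error: command does not handle ${marker}`);
  }
  if (pair.agentExpectsFiles && !pair.promptProvidesFiles) {
    findings.push('warning: prompt does not provide <files_to_read>');
  }
  return findings;
}
```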
## The `<deep_work_rules>` Exception
GSD's plan-phase uses `<deep_work_rules>` in delegation prompts. This is a deliberate design choice, not a violation:
1. **It's cross-cutting policy**: applies to ALL planning agents equally
2. **It's structural**: defines required fields (`<read_first>`, `<acceptance_criteria>`, `<action>` concreteness) — not domain expertise
3. **It supplements agent quality**: agent's own `<quality_gate>` is self-check; deep_work_rules is command-imposed minimum standard
4. **It's invocation-specific context**: different commands might impose different work rules
**Rule:** `<deep_work_rules>` in a delegation prompt is `info` level, not error. Flag only if its content duplicates agent's domain sections verbatim.
## Severity Classification
| Severity | When | Action Required |
|----------|------|-----------------|
| `error` | Actual conflict: contradictory content between prompt and agent | Must fix — move content to correct owner |
| `warning` | Duplication or boundary blur without contradiction | Should fix — consolidate to single source of truth |
| `info` | Acceptable pattern that looks like violation but isn't | No action — document why it's OK |
## Quick Reference: Is This Content in the Right Place?
| Content | In Prompt? | In Agent? |
|---------|-----------|-----------|
| "You are a..." | ❌ Never | ✅ Always |
| File paths for this invocation | ✅ Yes | ❌ No |
| Phase number, mode | ✅ Yes | ❌ No |
| Decision tables | ❌ Never | ✅ Always |
| Good/bad examples | ❌ Never | ✅ Always |
| "Write to: {path}" | ✅ Yes | ❌ No |
| Return markers handling | ✅ Yes (routing) | ✅ Yes (definition) |
| Quality gate | ✅ Per-invocation | ✅ Permanent self-check |
| "MUST read files first" | ❌ Agent's `<role>` owns this | ✅ Always |
| Anti-shallow rules | ⚠️ OK as cross-cutting policy | ✅ Preferred |
| Revision instructions | ✅ Yes (what changed) | ❌ No |
| Heuristics / philosophy | ❌ Never | ✅ Always |
| Banner display | ✅ Yes | ❌ Never |
| AskUserQuestion | ✅ Yes | ❌ Never |


@@ -165,14 +165,14 @@ Generate a complete command file with:
3. **`<process>`** — numbered steps (GSD workflow style):
- Step 1: Initialize / parse arguments
- Steps 2-N: Domain-specific orchestration logic
- Each step: banner display, validation, agent spawning via `Task()`, error handling
- Each step: banner display, validation, agent spawning via `Agent()`, error handling
- Final step: status display + `<offer_next>` with next actions
4. **`<success_criteria>`** — checkbox list of verifiable conditions
**Command writing rules:**
- Steps are **numbered** (`## 1.`, `## 2.`) — follow `plan-phase.md` and `new-project.md` style
- Use banners for phase transitions: `━━━ SKILL ► ACTION ━━━`
- Agent spawning uses `Task(prompt, subagent_type, description)` pattern
- Agent spawning uses `Agent({ subagent_type, prompt, description, run_in_background })` pattern
- Prompt to agents uses `<objective>`, `<files_to_read>`, `<output>` blocks
- Include `<offer_next>` block with formatted completion status
- Handle agent return markers: `## TASK COMPLETE`, `## TASK BLOCKED`, `## CHECKPOINT REACHED`
@@ -286,7 +286,7 @@ Set `$TARGET_PATH = $SOURCE_PATH` (in-place conversion) unless user specifies ou
| `<process>` with numbered steps | At least 3 `## N.` headers |
| Step 1 is initialization | Parses args or loads context |
| Last step is status/report | Displays results or routes to `<offer_next>` |
| Agent spawning (if complex) | `Task(` call with `subagent_type` |
| Agent spawning (if complex) | `Agent({` call with `subagent_type` |
| Agent prompt structure | `<files_to_read>` + `<objective>` or `<output>` blocks |
| Return handling | Routes on `## TASK COMPLETE` / `## TASK BLOCKED` markers |
| `<offer_next>` | Banner + summary + next command suggestion |


@@ -4,7 +4,7 @@ Guidelines for Claude Code **agent definition files** (role + domain expertise).
## Content Separation Principle
Agents are spawned by commands via `Task()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
Agents are spawned by commands via `Agent()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
| Concern | Belongs in Agent | Belongs in Command |
|---------|-----------------|-------------------|


@@ -153,15 +153,15 @@ Display banners before major phase transitions (agent spawning, user decisions,
## Agent Spawning Pattern
Commands spawn agents via `Task()` with structured prompts:
Commands spawn agents via `Agent()` with structured prompts:
```markdown
Task(
prompt=filled_prompt,
subagent_type="agent-name",
model="{model}",
description="Verb Phase {X}"
)
```javascript
Agent({
subagent_type: "agent-name",
prompt: filled_prompt,
description: "Verb Phase {X}",
run_in_background: false
})
```
### Prompt Structure for Agents


@@ -49,7 +49,7 @@ Conversion Summary:
| `## Auto Mode` / `## Auto Mode Defaults` | `<auto_mode>` section |
| `## Quick Reference` | Preserve as-is within appropriate section |
| Inline `AskUserQuestion` calls | Preserve verbatim — these belong in commands |
| `Task()` / agent spawning calls | Preserve verbatim within process steps |
| `Agent()` / agent spawning calls | Preserve verbatim within process steps |
| Banner displays (`━━━`) | Preserve verbatim |
| Code blocks (```bash, ```javascript, etc.) | **Preserve exactly** — never modify code content |
| Tables | **Preserve exactly** — never reformat table content |


@@ -49,12 +49,13 @@ allowed-tools: {tools} # omit if unrestricted
{Construct prompt with <files_to_read>, <objective>, <output> blocks.}
```
Task(
prompt=filled_prompt,
subagent_type="{agent-name}",
description="{Verb} {target}"
)
```javascript
Agent({
subagent_type: "{agent-name}",
prompt: filled_prompt,
description: "{Verb} {target}",
run_in_background: false
})
```
## 4. Handle Result


@@ -8,9 +8,11 @@ import {
checkVenvStatus,
executeCodexLens,
getVenvPythonPath,
useCodexLensV2,
} from '../../../tools/codex-lens.js';
import type { RouteContext } from '../types.js';
import { extractJSON, stripAnsiCodes } from './utils.js';
import type { ChildProcess } from 'child_process';
// File watcher state (persisted across requests)
let watcherProcess: any = null;
@@ -43,6 +45,29 @@ export async function stopWatcherForUninstall(): Promise<void> {
watcherProcess = null;
}
/**
* Spawn v2 bridge watcher subprocess.
* Runs 'codexlens-search watch --root X --debounce-ms Y' and reads JSONL stdout.
* @param root - Root directory to watch
* @param debounceMs - Debounce interval in milliseconds
* @returns Spawned child process
*/
function spawnV2Watcher(root: string, debounceMs: number): ChildProcess {
const { spawn } = require('child_process') as typeof import('child_process');
return spawn('codexlens-search', [
'watch',
'--root', root,
'--debounce-ms', String(debounceMs),
'--db-path', require('path').join(root, '.codexlens'),
], {
cwd: root,
shell: false,
stdio: ['ignore', 'pipe', 'pipe'],
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
});
}
/**
* Handle CodexLens watcher routes
* @returns true if route was handled, false otherwise
@@ -91,7 +116,13 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
return { success: false, error: `Path is not a directory: ${targetPath}`, status: 400 };
}
// Get the codexlens CLI path
// Route to v2 or v1 watcher based on feature flag
if (useCodexLensV2()) {
// v2 bridge watcher: codexlens-search watch
console.log('[CodexLens] Using v2 bridge watcher');
watcherProcess = spawnV2Watcher(targetPath, debounceMs);
} else {
// v1 watcher: python -m codexlens watch
const venvStatus = await checkVenvStatus();
if (!venvStatus.ready) {
return { success: false, error: 'CodexLens not installed', status: 400 };
@@ -121,7 +152,6 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
}
// Spawn watch process using Python (no shell: true for security)
// CodexLens is a Python package, must run via python -m codexlens
const pythonPath = getVenvPythonPath();
const args = ['-m', 'codexlens', 'watch', targetPath, '--debounce', String(debounceMs)];
watcherProcess = spawn(pythonPath, args, {
@@ -131,6 +161,7 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' }
});
}
watcherStats = {
running: true,
@@ -153,14 +184,38 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
}
// Handle process output for event counting
const isV2Watcher = useCodexLensV2();
let stdoutLineBuffer = '';
if (watcherProcess.stdout) {
watcherProcess.stdout.on('data', (data: Buffer) => {
const output = data.toString();
// Count processed events from output
if (isV2Watcher) {
// v2 bridge outputs JSONL - parse line by line
stdoutLineBuffer += output;
const lines = stdoutLineBuffer.split('\n');
// Keep incomplete last line in buffer
stdoutLineBuffer = lines.pop() || '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed) continue;
try {
const event = JSON.parse(trimmed);
// Count file change events (created, modified, deleted, moved)
if (event.event && event.event !== 'watching') {
watcherStats.events_processed += 1;
}
} catch {
// Not valid JSON, skip
}
}
} else {
// v1 watcher: count text-based event messages
const matches = output.match(/Processed \d+ events?/g);
if (matches) {
watcherStats.events_processed += matches.length;
}
}
});
}


@@ -1067,6 +1067,103 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
});
}
/**
* Check if codexlens-search (v2) bridge CLI is installed and functional.
* Runs 'codexlens-search status' and checks exit code.
* @returns true if the v2 bridge CLI is available
*/
function isCodexLensV2Installed(): boolean {
try {
const result = spawnSync('codexlens-search', ['status', '--db-path', '.codexlens'], {
encoding: 'utf-8',
timeout: EXEC_TIMEOUTS.PYTHON_VERSION,
stdio: ['pipe', 'pipe', 'pipe'],
windowsHide: true,
});
// Exit code 0 or valid JSON output means it's installed
return result.status === 0 || (result.stdout != null && result.stdout.includes('"status"'));
} catch {
return false;
}
}
/**
* Bootstrap codexlens-search (v2) package using UV.
* Installs 'codexlens-search[semantic]' into the shared CodexLens venv.
* @returns Bootstrap result
*/
async function bootstrapV2WithUv(): Promise<BootstrapResult> {
console.log('[CodexLens] Bootstrapping codexlens-search (v2) with UV...');
const preFlightError = preFlightCheck();
if (preFlightError) {
return { success: false, error: `Pre-flight failed: ${preFlightError}` };
}
repairVenvIfCorrupted();
const uvInstalled = await ensureUvInstalled();
if (!uvInstalled) {
return { success: false, error: 'Failed to install UV package manager' };
}
const uv = createCodexLensUvManager();
if (!uv.isVenvValid()) {
console.log('[CodexLens] Creating virtual environment with UV for v2...');
const createResult = await uv.createVenv();
if (!createResult.success) {
return { success: false, error: `Failed to create venv: ${createResult.error}` };
}
}
// Find local codexlens-search package using unified discovery
const { findCodexLensSearchPath } = await import('../utils/package-discovery.js');
const discovery = findCodexLensSearchPath();
const extras = ['semantic'];
const editable = isDevEnvironment() && !discovery.insideNodeModules;
if (!discovery.path) {
// Fallback: try installing from PyPI
console.log('[CodexLens] Local codexlens-search not found, trying PyPI install...');
const pipResult = await uv.install(['codexlens-search[semantic]']);
if (!pipResult.success) {
return {
success: false,
error: `Failed to install codexlens-search from PyPI: ${pipResult.error}`,
diagnostics: { venvPath: getCodexLensVenvDir(), installer: 'uv' },
};
}
} else {
console.log(`[CodexLens] Installing codexlens-search from local path with UV: ${discovery.path} (editable: ${editable})`);
const installResult = await uv.installFromProject(discovery.path, extras, editable);
if (!installResult.success) {
return {
success: false,
error: `Failed to install codexlens-search: ${installResult.error}`,
diagnostics: { packagePath: discovery.path, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
};
}
}
clearVenvStatusCache();
console.log('[CodexLens] codexlens-search (v2) bootstrap complete');
return {
success: true,
message: 'Installed codexlens-search (v2) with UV',
diagnostics: { packagePath: discovery.path ?? undefined, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
};
}
/**
* Check if v2 bridge should be used based on CCW_USE_CODEXLENS_V2 env var.
*/
function useCodexLensV2(): boolean {
const flag = process.env.CCW_USE_CODEXLENS_V2;
return flag === '1' || flag === 'true';
}
/**
* Bootstrap CodexLens venv with required packages
* @returns Bootstrap result
@@ -1074,6 +1171,23 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
async function bootstrapVenv(): Promise<BootstrapResult> {
const warnings: string[] = [];
// If v2 flag is set, also bootstrap codexlens-search alongside v1
if (useCodexLensV2() && await isUvAvailable()) {
try {
const v2Result = await bootstrapV2WithUv();
if (v2Result.success) {
console.log('[CodexLens] codexlens-search (v2) installed successfully');
} else {
console.warn(`[CodexLens] codexlens-search (v2) bootstrap failed: ${v2Result.error}`);
warnings.push(`v2 bootstrap failed: ${v2Result.error || 'Unknown error'}`);
}
} catch (v2Err) {
const msg = v2Err instanceof Error ? v2Err.message : String(v2Err);
console.warn(`[CodexLens] codexlens-search (v2) bootstrap error: ${msg}`);
warnings.push(`v2 bootstrap error: ${msg}`);
}
}
// Prefer UV if available (faster package resolution and installation)
if (await isUvAvailable()) {
console.log('[CodexLens] Using UV for bootstrap...');
@@ -2502,6 +2616,10 @@ export {
// UV-based installation functions
bootstrapWithUv,
installSemanticWithUv,
// v2 bridge support
useCodexLensV2,
isCodexLensV2Installed,
bootstrapV2WithUv,
};
// Export Python path for direct spawn usage (e.g., watcher)


@@ -29,7 +29,9 @@ import {
ensureLiteLLMEmbedderReady,
executeCodexLens,
getVenvPythonPath,
useCodexLensV2,
} from './codex-lens.js';
import { execFile } from 'child_process';
import type { ProgressInfo } from './codex-lens.js';
import { getProjectRoot } from '../utils/path-validator.js';
import { getCodexLensDataDir } from '../utils/codexlens-path.js';
@@ -2774,6 +2776,90 @@ async function executeRipgrepMode(params: Params): Promise<SearchResult> {
});
}
// ========================================
// codexlens-search v2 bridge integration
// ========================================
/**
* Execute search via codexlens-search (v2) bridge CLI.
* Spawns 'codexlens-search search --query X --top-k Y --db-path Z' and parses JSON output.
*
* @param query - Search query string
* @param topK - Number of results to return
* @param dbPath - Path to the v2 index database directory
* @returns Parsed search results as SemanticMatch array
*/
async function executeCodexLensV2Bridge(
query: string,
topK: number,
dbPath: string,
): Promise<SearchResult> {
return new Promise((resolve) => {
const args = [
'search',
'--query', query,
'--top-k', String(topK),
'--db-path', dbPath,
];
execFile('codexlens-search', args, {
encoding: 'utf-8',
timeout: EXEC_TIMEOUTS.PROCESS_SPAWN,
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
}, (error, stdout, stderr) => {
if (error) {
console.warn(`[CodexLens-v2] Bridge search failed: ${error.message}`);
resolve({
success: false,
error: `codexlens-search v2 bridge failed: ${error.message}`,
});
return;
}
try {
const parsed = JSON.parse(stdout.trim());
// Bridge outputs {"error": string} on failure
if (parsed && typeof parsed === 'object' && 'error' in parsed) {
resolve({
success: false,
error: `codexlens-search v2: ${parsed.error}`,
});
return;
}
// Bridge outputs array of {path, score, snippet}
const results: SemanticMatch[] = (Array.isArray(parsed) ? parsed : []).map((r: { path?: string; score?: number; snippet?: string }) => ({
file: r.path || '',
score: r.score || 0,
content: r.snippet || '',
symbol: null,
}));
resolve({
success: true,
results,
metadata: {
mode: 'semantic' as any,
backend: 'codexlens-v2',
count: results.length,
query,
note: 'Using codexlens-search v2 bridge (2-stage vector + reranking)',
},
});
} catch (parseErr) {
console.warn(`[CodexLens-v2] Failed to parse bridge output: ${(parseErr as Error).message}`);
resolve({
success: false,
error: `Failed to parse codexlens-search v2 output: ${(parseErr as Error).message}`,
output: stdout,
});
}
});
});
}
/**
* Mode: exact - CodexLens exact/FTS search
* Requires index
@@ -4276,7 +4362,21 @@ export async function handler(params: Record<string, unknown>): Promise<ToolResu
case 'search':
default:
// Handle search modes: fuzzy | semantic
// v2 bridge: if CCW_USE_CODEXLENS_V2 is set, try the v2 bridge first for semantic/fuzzy modes
if (useCodexLensV2() && (mode === 'semantic' || mode === 'fuzzy')) {
const scope = resolveSearchScope(parsed.data.path ?? '.');
const dbPath = join(scope.workingDirectory, '.codexlens');
const topK = (parsed.data.maxResults || 5) + (parsed.data.extraFilesCount || 10);
const v2Result = await executeCodexLensV2Bridge(parsed.data.query || '', topK, dbPath);
if (v2Result.success) {
result = v2Result;
break;
}
// v2 failed, fall through to v1
console.warn(`[CodexLens-v2] Falling back to v1: ${v2Result.error}`);
}
// Handle search modes: fuzzy | semantic (v1 path)
switch (mode) {
case 'fuzzy':
result = await executeFuzzyMode(parsed.data);

View File

@@ -55,18 +55,20 @@ export interface PackageDiscoveryResult {
}
/** Known local package names */
export type LocalPackageName = 'codex-lens' | 'ccw-litellm';
export type LocalPackageName = 'codex-lens' | 'ccw-litellm' | 'codexlens-search';
/** Environment variable mapping for each package */
const PACKAGE_ENV_VARS: Record<LocalPackageName, string> = {
'codex-lens': 'CODEXLENS_PACKAGE_PATH',
'ccw-litellm': 'CCW_LITELLM_PATH',
'codexlens-search': 'CODEXLENS_SEARCH_PATH',
};
/** Config key mapping for each package */
const PACKAGE_CONFIG_KEYS: Record<LocalPackageName, string> = {
'codex-lens': 'codexLensPath',
'ccw-litellm': 'ccwLitellmPath',
'codexlens-search': 'codexlensSearchPath',
};
// ========================================
@@ -296,6 +298,13 @@ export function findCcwLitellmPath(): PackageDiscoveryResult {
return findPackagePath('ccw-litellm');
}
/**
* Find codexlens-search (v2) package path (convenience wrapper)
*/
export function findCodexLensSearchPath(): PackageDiscoveryResult {
return findPackagePath('codexlens-search');
}
/**
* Format search results for error messages
*/

codex-lens-v2/LICENSE Normal file
View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 codexlens-search contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

codex-lens-v2/README.md Normal file
View File

@@ -0,0 +1,146 @@
# codexlens-search
Lightweight semantic code search engine with 2-stage vector search, full-text search, and Reciprocal Rank Fusion.
## Overview
codexlens-search provides fast, accurate code search through a multi-stage retrieval pipeline:
1. **Binary coarse search** - Hamming-distance filtering narrows candidates quickly
2. **ANN fine search** - HNSW or FAISS refines the candidate set with float vectors
3. **Full-text search** - SQLite FTS5 handles exact and fuzzy keyword matching
4. **RRF fusion** - Reciprocal Rank Fusion merges vector and text results
5. **Reranking** - Optional cross-encoder or API-based reranker for final ordering
The core library has **zero required dependencies**. Install optional extras to enable semantic search, GPU acceleration, or FAISS backends.
## Installation
```bash
# Core only (FTS search, no vector search)
pip install codexlens-search
# With semantic search (recommended)
pip install codexlens-search[semantic]
# Semantic search + GPU acceleration
pip install codexlens-search[semantic-gpu]
# With FAISS backend (CPU)
pip install codexlens-search[faiss-cpu]
# With API-based reranker
pip install codexlens-search[reranker-api]
# Everything (semantic + GPU + FAISS + reranker)
pip install codexlens-search[semantic-gpu,faiss-gpu,reranker-api]
```
## Quick Start
```python
from codexlens_search import Config, IndexingPipeline, SearchPipeline
from codexlens_search.core import create_ann_index, create_binary_index
from codexlens_search.embed.local import FastEmbedEmbedder
from codexlens_search.rerank.local import LocalReranker
from codexlens_search.search.fts import FTSEngine
# 1. Configure
config = Config(embed_model="BAAI/bge-small-en-v1.5", embed_dim=384)
# 2. Create components
embedder = FastEmbedEmbedder(config)
binary_store = create_binary_index(config, db_path="index/binary.db")
ann_index = create_ann_index(config, index_path="index/ann.bin")
fts = FTSEngine("index/fts.db")
reranker = LocalReranker()
# 3. Index files
indexer = IndexingPipeline(embedder, binary_store, ann_index, fts, config)
stats = indexer.index_directory("./src")
print(f"Indexed {stats.files_processed} files, {stats.chunks_created} chunks")
# 4. Search
pipeline = SearchPipeline(embedder, binary_store, ann_index, reranker, fts, config)
results = pipeline.search("authentication handler", top_k=10)
for r in results:
print(f" {r.path} (score={r.score:.3f})")
```
## Extras
| Extra | Dependencies | Description |
|-------|-------------|-------------|
| `semantic` | hnswlib, numpy, fastembed | Vector search with local embeddings |
| `gpu` | onnxruntime-gpu | GPU-accelerated embedding inference |
| `semantic-gpu` | semantic + gpu combined | Vector search with GPU acceleration |
| `faiss-cpu` | faiss-cpu | FAISS ANN backend (CPU) |
| `faiss-gpu` | faiss-gpu | FAISS ANN backend (GPU) |
| `reranker-api` | httpx | Remote reranker API client |
| `dev` | pytest, pytest-cov | Development and testing |
## Architecture
```
Query
|
v
[Embedder] --> query vector
|
+---> [BinaryStore.coarse_search] --> candidate IDs (Hamming distance)
| |
| v
+---> [ANNIndex.fine_search] ------> ranked IDs (cosine/L2)
| |
| v (intersect)
| vector_results
|
+---> [FTSEngine.exact_search] ----> exact text matches
+---> [FTSEngine.fuzzy_search] ----> fuzzy text matches
|
v
[RRF Fusion] --> merged ranking (adaptive weights by query intent)
|
v
[Reranker] --> final top-k results
```
### Key Design Decisions
- **2-stage vector search**: Binary coarse search (fast Hamming distance on binarized vectors) filters candidates before the more expensive ANN search. This keeps memory usage low and search fast even on large corpora.
- **Parallel retrieval**: Vector search and FTS run concurrently via ThreadPoolExecutor.
- **Adaptive fusion weights**: Query intent detection adjusts RRF weights between vector and text signals.
- **Backend abstraction**: ANN index supports both hnswlib and FAISS backends via a factory function.
- **Zero core dependencies**: The base package requires only Python 3.10+. All heavy dependencies are optional.
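The coarse stage of the 2-stage design can be illustrated without numpy: binarize vectors by sign into packed bit codes, then rank candidates by Hamming distance. This is a standalone sketch assuming sign binarization, not the package's actual quantizer:

```python
def binarize(vec: list[float]) -> int:
    """Pack a float vector into an int bitmask: bit i is set when vec[i] > 0."""
    code = 0
    for i, v in enumerate(vec):
        if v > 0:
            code |= 1 << i
    return code


def coarse_search(query: list[float], codes: dict[int, int], top_k: int) -> list[int]:
    """Return the top_k chunk IDs with the smallest Hamming distance to the query."""
    q = binarize(query)
    # XOR leaves a 1 bit per disagreeing dimension; count them for the distance
    ranked = sorted(codes, key=lambda cid: bin(codes[cid] ^ q).count("1"))
    return ranked[:top_k]
```

Because each comparison is a single XOR plus popcount over packed bits, this stage can scan a large corpus cheaply before the float-vector ANN search refines the survivors.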
## Configuration
The `Config` dataclass controls all pipeline parameters:
```python
from codexlens_search import Config
config = Config(
embed_model="BAAI/bge-small-en-v1.5", # embedding model name
embed_dim=384, # embedding dimension
embed_batch_size=64, # batch size for embedding
ann_backend="auto", # 'auto', 'faiss', 'hnswlib'
binary_top_k=200, # binary coarse search candidates
ann_top_k=50, # ANN fine search candidates
fts_top_k=50, # FTS results per method
device="auto", # 'auto', 'cuda', 'cpu'
)
```
## Development
```bash
git clone https://github.com/nicepkg/codexlens-search.git
cd codexlens-search
pip install -e ".[dev,semantic]"
pytest
```
## License
MIT

View File

@@ -8,6 +8,26 @@ version = "0.2.0"
description = "Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion"
requires-python = ">=3.10"
dependencies = []
license = {text = "MIT"}
readme = "README.md"
authors = [
{name = "codexlens-search contributors"},
]
classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"License :: OSI Approved :: MIT License",
"Topic :: Software Development :: Libraries",
"Topic :: Text Processing :: Indexing",
"Operating System :: OS Independent",
]
[project.urls]
Homepage = "https://github.com/nicepkg/codexlens-search"
Repository = "https://github.com/nicepkg/codexlens-search"
[project.optional-dependencies]
semantic = [
@@ -27,10 +47,22 @@ faiss-gpu = [
reranker-api = [
"httpx>=0.25",
]
watcher = [
"watchdog>=3.0",
]
semantic-gpu = [
"hnswlib>=0.8.0",
"numpy>=1.26",
"fastembed>=0.4.0,<2.0",
"onnxruntime-gpu>=1.16",
]
dev = [
"pytest>=7.0",
"pytest-cov",
]
[project.scripts]
codexlens-search = "codexlens_search.bridge:main"
[tool.hatch.build.targets.wheel]
packages = ["src/codexlens_search"]

View File

@@ -0,0 +1,407 @@
"""CLI bridge for ccw integration.
Argparse-based CLI with JSON output protocol.
Each subcommand outputs a single JSON object to stdout.
Watch command outputs JSONL (one JSON per line).
All errors are JSON {"error": string} to stdout with non-zero exit code.
"""
from __future__ import annotations
import argparse
import json
import logging
import os
import sys
import time
from pathlib import Path
log = logging.getLogger("codexlens_search.bridge")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _json_output(data: dict | list) -> None:
"""Print JSON to stdout with flush."""
print(json.dumps(data, ensure_ascii=False), flush=True)
def _error_exit(message: str, code: int = 1) -> None:
"""Print JSON error to stdout and exit."""
_json_output({"error": message})
sys.exit(code)
def _resolve_db_path(args: argparse.Namespace) -> Path:
"""Return the --db-path as a resolved Path, creating parent dirs."""
db_path = Path(args.db_path).resolve()
db_path.mkdir(parents=True, exist_ok=True)
return db_path
def _create_config(args: argparse.Namespace) -> "Config":
"""Build Config from CLI args."""
from codexlens_search.config import Config
kwargs: dict = {}
if hasattr(args, "embed_model") and args.embed_model:
kwargs["embed_model"] = args.embed_model
db_path = Path(args.db_path).resolve()
kwargs["metadata_db_path"] = str(db_path / "metadata.db")
return Config(**kwargs)
def _create_pipeline(
args: argparse.Namespace,
) -> tuple:
"""Lazily construct pipeline components from CLI args.
Returns (indexing_pipeline, search_pipeline, config).
Only loads embedder/reranker models when needed.
"""
from codexlens_search.core.factory import create_ann_index, create_binary_index
from codexlens_search.embed.local import FastEmbedEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.indexing.pipeline import IndexingPipeline
from codexlens_search.rerank.local import FastEmbedReranker
from codexlens_search.search.fts import FTSEngine
from codexlens_search.search.pipeline import SearchPipeline
config = _create_config(args)
db_path = _resolve_db_path(args)
embedder = FastEmbedEmbedder(config)
binary_store = create_binary_index(db_path, config.embed_dim, config)
ann_index = create_ann_index(db_path, config.embed_dim, config)
fts = FTSEngine(db_path / "fts.db")
metadata = MetadataStore(db_path / "metadata.db")
reranker = FastEmbedReranker(config)
indexing = IndexingPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
fts=fts,
config=config,
metadata=metadata,
)
search = SearchPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
reranker=reranker,
fts=fts,
config=config,
metadata_store=metadata,
)
return indexing, search, config
# ---------------------------------------------------------------------------
# Subcommand handlers
# ---------------------------------------------------------------------------
def cmd_init(args: argparse.Namespace) -> None:
"""Initialize an empty index at --db-path."""
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.search.fts import FTSEngine
db_path = _resolve_db_path(args)
# Create empty stores - just touch the metadata and FTS databases
MetadataStore(db_path / "metadata.db")
FTSEngine(db_path / "fts.db")
_json_output({
"status": "initialized",
"db_path": str(db_path),
})
def cmd_search(args: argparse.Namespace) -> None:
"""Run search query, output JSON array of results."""
_, search, _ = _create_pipeline(args)
results = search.search(args.query, top_k=args.top_k)
_json_output([
{"path": r.path, "score": r.score, "snippet": r.snippet}
for r in results
])
def cmd_index_file(args: argparse.Namespace) -> None:
"""Index a single file."""
indexing, _, _ = _create_pipeline(args)
file_path = Path(args.file).resolve()
if not file_path.is_file():
_error_exit(f"File not found: {file_path}")
root = Path(args.root).resolve() if args.root else None
stats = indexing.index_file(file_path, root=root)
_json_output({
"status": "indexed",
"file": str(file_path),
"files_processed": stats.files_processed,
"chunks_created": stats.chunks_created,
"duration_seconds": stats.duration_seconds,
})
def cmd_remove_file(args: argparse.Namespace) -> None:
"""Remove a file from the index."""
indexing, _, _ = _create_pipeline(args)
indexing.remove_file(args.file)
_json_output({
"status": "removed",
"file": args.file,
})
def cmd_sync(args: argparse.Namespace) -> None:
"""Sync index with files under --root matching --glob pattern."""
indexing, _, _ = _create_pipeline(args)
root = Path(args.root).resolve()
if not root.is_dir():
_error_exit(f"Root directory not found: {root}")
pattern = args.glob or "**/*"
file_paths = [
p for p in root.glob(pattern)
if p.is_file()
]
stats = indexing.sync(file_paths, root=root)
_json_output({
"status": "synced",
"root": str(root),
"files_processed": stats.files_processed,
"chunks_created": stats.chunks_created,
"duration_seconds": stats.duration_seconds,
})
def cmd_watch(args: argparse.Namespace) -> None:
"""Watch --root for changes, output JSONL events."""
root = Path(args.root).resolve()
if not root.is_dir():
_error_exit(f"Root directory not found: {root}")
debounce_ms = args.debounce_ms
try:
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler, FileSystemEvent
except ImportError:
_error_exit(
"watchdog is required for watch mode. "
"Install with: pip install watchdog"
)
class _JsonEventHandler(FileSystemEventHandler):
"""Emit JSONL for file events."""
def _emit(self, event_type: str, path: str) -> None:
_json_output({
"event": event_type,
"path": path,
"timestamp": time.time(),
})
def on_created(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("created", event.src_path)
def on_modified(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("modified", event.src_path)
def on_deleted(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("deleted", event.src_path)
def on_moved(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("moved", event.dest_path)
observer = Observer()
observer.schedule(_JsonEventHandler(), str(root), recursive=True)
observer.start()
_json_output({
"status": "watching",
"root": str(root),
"debounce_ms": debounce_ms,
})
try:
while True:
time.sleep(debounce_ms / 1000.0)
except KeyboardInterrupt:
observer.stop()
observer.join()
def cmd_download_models(args: argparse.Namespace) -> None:
"""Download embed + reranker models."""
from codexlens_search import model_manager
config = _create_config(args)
model_manager.ensure_model(config.embed_model, config)
model_manager.ensure_model(config.reranker_model, config)
_json_output({
"status": "downloaded",
"embed_model": config.embed_model,
"reranker_model": config.reranker_model,
})
def cmd_status(args: argparse.Namespace) -> None:
"""Report index statistics."""
from codexlens_search.indexing.metadata import MetadataStore
db_path = _resolve_db_path(args)
meta_path = db_path / "metadata.db"
if not meta_path.exists():
_json_output({
"status": "not_initialized",
"db_path": str(db_path),
})
return
metadata = MetadataStore(meta_path)
all_files = metadata.get_all_files()
deleted_ids = metadata.get_deleted_ids()
max_chunk = metadata.max_chunk_id()
_json_output({
"status": "ok",
"db_path": str(db_path),
"files_tracked": len(all_files),
"max_chunk_id": max_chunk,
"total_chunks_approx": max_chunk + 1 if max_chunk >= 0 else 0,
"deleted_chunks": len(deleted_ids),
})
# ---------------------------------------------------------------------------
# CLI parser
# ---------------------------------------------------------------------------
def _build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
prog="codexlens-search",
description="Lightweight semantic code search - CLI bridge",
)
parser.add_argument(
"--db-path",
default=os.environ.get("CODEXLENS_DB_PATH", ".codexlens"),
help="Path to index database directory (default: .codexlens or $CODEXLENS_DB_PATH)",
)
parser.add_argument(
"--verbose", "-v",
action="store_true",
help="Enable debug logging to stderr",
)
sub = parser.add_subparsers(dest="command")
# init
sub.add_parser("init", help="Initialize empty index")
# search
p_search = sub.add_parser("search", help="Search the index")
p_search.add_argument("--query", "-q", required=True, help="Search query")
p_search.add_argument("--top-k", "-k", type=int, default=10, help="Number of results")
# index-file
p_index = sub.add_parser("index-file", help="Index a single file")
p_index.add_argument("--file", "-f", required=True, help="File path to index")
p_index.add_argument("--root", "-r", help="Root directory for relative paths")
# remove-file
p_remove = sub.add_parser("remove-file", help="Remove a file from index")
p_remove.add_argument("--file", "-f", required=True, help="Relative file path to remove")
# sync
p_sync = sub.add_parser("sync", help="Sync index with directory")
p_sync.add_argument("--root", "-r", required=True, help="Root directory to sync")
p_sync.add_argument("--glob", "-g", default="**/*", help="Glob pattern (default: **/*)")
# watch
p_watch = sub.add_parser("watch", help="Watch directory for changes (JSONL output)")
p_watch.add_argument("--root", "-r", required=True, help="Root directory to watch")
p_watch.add_argument("--debounce-ms", type=int, default=500, help="Debounce interval in ms")
# download-models
p_dl = sub.add_parser("download-models", help="Download embed + reranker models")
p_dl.add_argument("--embed-model", help="Override embed model name")
# status
sub.add_parser("status", help="Report index statistics")
return parser
def main() -> None:
"""CLI entry point."""
parser = _build_parser()
args = parser.parse_args()
# Configure logging
if args.verbose:
logging.basicConfig(
level=logging.DEBUG,
format="%(levelname)s %(name)s: %(message)s",
stream=sys.stderr,
)
else:
logging.basicConfig(
level=logging.WARNING,
format="%(levelname)s: %(message)s",
stream=sys.stderr,
)
if not args.command:
parser.print_help(sys.stderr)
sys.exit(1)
dispatch = {
"init": cmd_init,
"search": cmd_search,
"index-file": cmd_index_file,
"remove-file": cmd_remove_file,
"sync": cmd_sync,
"watch": cmd_watch,
"download-models": cmd_download_models,
"status": cmd_status,
}
handler = dispatch.get(args.command)
if handler is None:
_error_exit(f"Unknown command: {args.command}")
try:
handler(args)
except KeyboardInterrupt:
sys.exit(130)
except SystemExit:
raise
except Exception as exc:
log.debug("Command failed", exc_info=True)
_error_exit(str(exc))
if __name__ == "__main__":
main()

View File

@@ -49,6 +49,9 @@ class Config:
reranker_api_model: str = ""
reranker_api_max_tokens_per_batch: int = 2048
# Metadata store
metadata_db_path: str = "" # empty = no metadata tracking
# FTS
fts_top_k: int = 50

View File

@@ -1,5 +1,6 @@
from __future__ import annotations
from .metadata import MetadataStore
from .pipeline import IndexingPipeline, IndexStats
__all__ = ["IndexingPipeline", "IndexStats"]
__all__ = ["IndexingPipeline", "IndexStats", "MetadataStore"]

View File

@@ -0,0 +1,165 @@
"""SQLite-backed metadata store for file-to-chunk mapping and tombstone tracking."""
from __future__ import annotations
import sqlite3
from pathlib import Path
class MetadataStore:
"""Tracks file-to-chunk mappings and deleted chunk IDs (tombstones).
Tables:
files - file_path (PK), content_hash, last_modified
chunks - chunk_id (PK), file_path (FK CASCADE), chunk_hash
deleted_chunks - chunk_id (PK) for tombstone tracking
"""
def __init__(self, db_path: str | Path) -> None:
self._conn = sqlite3.connect(str(db_path), check_same_thread=False)
self._conn.execute("PRAGMA foreign_keys = ON")
self._conn.execute("PRAGMA journal_mode = WAL")
self._create_tables()
def _create_tables(self) -> None:
self._conn.executescript("""
CREATE TABLE IF NOT EXISTS files (
file_path TEXT PRIMARY KEY,
content_hash TEXT NOT NULL,
last_modified REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
chunk_id INTEGER PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_hash TEXT NOT NULL DEFAULT '',
FOREIGN KEY (file_path) REFERENCES files(file_path) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS deleted_chunks (
chunk_id INTEGER PRIMARY KEY
);
""")
self._conn.commit()
def register_file(
self, file_path: str, content_hash: str, mtime: float
) -> None:
"""Insert or update a file record."""
self._conn.execute(
"INSERT OR REPLACE INTO files (file_path, content_hash, last_modified) "
"VALUES (?, ?, ?)",
(file_path, content_hash, mtime),
)
self._conn.commit()
def register_chunks(
self, file_path: str, chunk_ids_and_hashes: list[tuple[int, str]]
) -> None:
"""Register chunk IDs belonging to a file.
Args:
file_path: The owning file path (must already exist in files table).
chunk_ids_and_hashes: List of (chunk_id, chunk_hash) tuples.
"""
if not chunk_ids_and_hashes:
return
self._conn.executemany(
"INSERT OR REPLACE INTO chunks (chunk_id, file_path, chunk_hash) "
"VALUES (?, ?, ?)",
[(cid, file_path, chash) for cid, chash in chunk_ids_and_hashes],
)
self._conn.commit()
def mark_file_deleted(self, file_path: str) -> int:
"""Move all chunk IDs for a file to deleted_chunks, then remove the file.
Returns the number of chunks tombstoned.
"""
# Collect chunk IDs before CASCADE deletes them
rows = self._conn.execute(
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
).fetchall()
if not rows:
# Still remove the file record if it exists
self._conn.execute(
"DELETE FROM files WHERE file_path = ?", (file_path,)
)
self._conn.commit()
return 0
chunk_ids = [(r[0],) for r in rows]
self._conn.executemany(
"INSERT OR IGNORE INTO deleted_chunks (chunk_id) VALUES (?)",
chunk_ids,
)
# CASCADE deletes chunks rows automatically
self._conn.execute(
"DELETE FROM files WHERE file_path = ?", (file_path,)
)
self._conn.commit()
return len(chunk_ids)
def get_deleted_ids(self) -> set[int]:
"""Return all tombstoned chunk IDs for search-time filtering."""
rows = self._conn.execute(
"SELECT chunk_id FROM deleted_chunks"
).fetchall()
return {r[0] for r in rows}
def get_file_hash(self, file_path: str) -> str | None:
"""Return the stored content hash for a file, or None if not tracked."""
row = self._conn.execute(
"SELECT content_hash FROM files WHERE file_path = ?", (file_path,)
).fetchone()
return row[0] if row else None
def file_needs_update(self, file_path: str, content_hash: str) -> bool:
"""Check if a file needs re-indexing based on its content hash."""
stored = self.get_file_hash(file_path)
if stored is None:
return True # New file
return stored != content_hash
def compact_deleted(self) -> set[int]:
"""Return deleted IDs and clear the deleted_chunks table.
Call this after rebuilding the vector index to reclaim space.
"""
deleted = self.get_deleted_ids()
if deleted:
self._conn.execute("DELETE FROM deleted_chunks")
self._conn.commit()
return deleted
def get_chunk_ids_for_file(self, file_path: str) -> list[int]:
"""Return all chunk IDs belonging to a file."""
rows = self._conn.execute(
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
).fetchall()
return [r[0] for r in rows]
def get_all_files(self) -> dict[str, str]:
"""Return all tracked files as {file_path: content_hash}."""
rows = self._conn.execute(
"SELECT file_path, content_hash FROM files"
).fetchall()
return {r[0]: r[1] for r in rows}
def max_chunk_id(self) -> int:
"""Return the maximum chunk_id across chunks and deleted_chunks.
Returns -1 if no chunks exist, so that next_id = max_chunk_id() + 1
starts at 0 for an empty store.
"""
row = self._conn.execute(
"SELECT MAX(m) FROM ("
" SELECT MAX(chunk_id) AS m FROM chunks"
" UNION ALL"
" SELECT MAX(chunk_id) AS m FROM deleted_chunks"
")"
).fetchone()
return row[0] if row[0] is not None else -1
def close(self) -> None:
self._conn.close()
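The tombstone lifecycle implemented above can be condensed into raw sqlite3 to see the flow end to end: register a file's chunks, remove the file, and observe its chunk IDs moving into `deleted_chunks`. This is an in-memory sketch; unlike `MetadataStore`, it deletes chunk rows explicitly instead of relying on `ON DELETE CASCADE`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (file_path TEXT PRIMARY KEY, content_hash TEXT);
    CREATE TABLE chunks (chunk_id INTEGER PRIMARY KEY, file_path TEXT);
    CREATE TABLE deleted_chunks (chunk_id INTEGER PRIMARY KEY);
""")

# Index a file with two chunks.
conn.execute("INSERT INTO files VALUES ('a.py', 'hash1')")
conn.executemany("INSERT INTO chunks VALUES (?, 'a.py')", [(0,), (1,)])

# Remove the file: its chunk IDs become tombstones so search-time
# filtering can skip them until the vector index is compacted.
ids = [r[0] for r in conn.execute("SELECT chunk_id FROM chunks WHERE file_path='a.py'")]
conn.executemany("INSERT OR IGNORE INTO deleted_chunks VALUES (?)", [(i,) for i in ids])
conn.execute("DELETE FROM files WHERE file_path='a.py'")
conn.execute("DELETE FROM chunks WHERE file_path='a.py'")

deleted = {r[0] for r in conn.execute("SELECT chunk_id FROM deleted_chunks")}
```

Keeping tombstones in a separate table lets `get_deleted_ids()` stay a cheap single-table scan, while `compact_deleted()` can drain it once the vector stores are rebuilt.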

View File

@@ -5,6 +5,7 @@ The GIL is acceptable because embedding (onnxruntime) releases it in C extension
"""
from __future__ import annotations
import hashlib
import logging
import queue
import threading
@@ -18,6 +19,7 @@ from codexlens_search.config import Config
from codexlens_search.core.binary import BinaryStore
from codexlens_search.core.index import ANNIndex
from codexlens_search.embed.base import BaseEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.search.fts import FTSEngine
logger = logging.getLogger(__name__)
@@ -55,12 +57,14 @@ class IndexingPipeline:
ann_index: ANNIndex,
fts: FTSEngine,
config: Config,
metadata: MetadataStore | None = None,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
self._ann_index = ann_index
self._fts = fts
self._config = config
self._metadata = metadata
def index_files(
self,
@@ -275,3 +279,271 @@ class IndexingPipeline:
chunks.append(("".join(current), path))
return chunks
# ------------------------------------------------------------------
# Incremental API
# ------------------------------------------------------------------
@staticmethod
def _content_hash(text: str) -> str:
"""Compute SHA-256 hex digest of file content."""
return hashlib.sha256(text.encode("utf-8", errors="replace")).hexdigest()
def _require_metadata(self) -> MetadataStore:
"""Return metadata store or raise if not configured."""
if self._metadata is None:
raise RuntimeError(
"MetadataStore is required for incremental indexing. "
"Pass metadata= to IndexingPipeline.__init__."
)
return self._metadata
def _next_chunk_id(self) -> int:
"""Return the next available chunk ID from MetadataStore."""
meta = self._require_metadata()
return meta.max_chunk_id() + 1
def index_file(
self,
file_path: Path,
*,
root: Path | None = None,
force: bool = False,
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
max_file_size: int = 50_000,
) -> IndexStats:
"""Index a single file incrementally.
Skips files that have not changed (same content_hash) unless
*force* is True.
Args:
file_path: Path to the file to index.
root: Optional root for computing relative path identifiers.
force: Re-index even if content hash has not changed.
max_chunk_chars: Maximum characters per chunk.
chunk_overlap: Character overlap between consecutive chunks.
max_file_size: Skip files larger than this (bytes).
Returns:
IndexStats with counts and timing.
"""
meta = self._require_metadata()
t0 = time.monotonic()
# Read file
try:
if file_path.stat().st_size > max_file_size:
logger.debug("Skipping %s: exceeds max_file_size", file_path)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
text = file_path.read_text(encoding="utf-8", errors="replace")
except Exception as exc:
logger.debug("Skipping %s: %s", file_path, exc)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
content_hash = self._content_hash(text)
rel_path = str(file_path.relative_to(root)) if root else str(file_path)
# Check if update is needed
if not force and not meta.file_needs_update(rel_path, content_hash):
logger.debug("Skipping %s: unchanged", rel_path)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
# If file was previously indexed, remove old data first
if meta.get_file_hash(rel_path) is not None:
meta.mark_file_deleted(rel_path)
self._fts.delete_by_path(rel_path)
# Chunk
file_chunks = self._chunk_text(text, rel_path, max_chunk_chars, chunk_overlap)
if not file_chunks:
# Register file with no chunks
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
return IndexStats(
files_processed=1,
duration_seconds=round(time.monotonic() - t0, 2),
)
# Assign chunk IDs
start_id = self._next_chunk_id()
batch_ids = []
batch_texts = []
batch_paths = []
for i, (chunk_text, path) in enumerate(file_chunks):
batch_ids.append(start_id + i)
batch_texts.append(chunk_text)
batch_paths.append(path)
# Embed synchronously
vecs = self._embedder.embed_batch(batch_texts)
vec_array = np.array(vecs, dtype=np.float32)
id_array = np.array(batch_ids, dtype=np.int64)
# Index: write to stores
self._binary_store.add(id_array, vec_array)
self._ann_index.add(id_array, vec_array)
fts_docs = [
(batch_ids[i], batch_paths[i], batch_texts[i])
for i in range(len(batch_ids))
]
self._fts.add_documents(fts_docs)
# Register in metadata
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
chunk_id_hashes = [
(batch_ids[i], self._content_hash(batch_texts[i]))
for i in range(len(batch_ids))
]
meta.register_chunks(rel_path, chunk_id_hashes)
# Flush stores
self._binary_store.save()
self._ann_index.save()
duration = time.monotonic() - t0
stats = IndexStats(
files_processed=1,
chunks_created=len(batch_ids),
duration_seconds=round(duration, 2),
)
logger.info(
"Indexed file %s: %d chunks in %.2fs",
rel_path, stats.chunks_created, stats.duration_seconds,
)
return stats
def remove_file(self, file_path: str) -> None:
"""Mark a file as deleted via tombstone strategy.
Marks all chunk IDs for the file in MetadataStore.deleted_chunks
and removes the file's FTS entries.
Args:
file_path: The relative path identifier of the file to remove.
"""
meta = self._require_metadata()
count = meta.mark_file_deleted(file_path)
fts_count = self._fts.delete_by_path(file_path)
logger.info(
"Removed file %s: %d chunks tombstoned, %d FTS entries deleted",
file_path, count, fts_count,
)
def sync(
self,
file_paths: list[Path],
*,
root: Path | None = None,
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
max_file_size: int = 50_000,
) -> IndexStats:
"""Reconcile index state against a current file list.
Identifies files that are new, changed, or removed and processes
each accordingly.
Args:
file_paths: Current list of files that should be indexed.
root: Optional root for computing relative path identifiers.
max_chunk_chars: Maximum characters per chunk.
chunk_overlap: Character overlap between consecutive chunks.
max_file_size: Skip files larger than this (bytes).
Returns:
Aggregated IndexStats for all operations.
"""
meta = self._require_metadata()
t0 = time.monotonic()
# Build set of current relative paths
current_rel_paths: dict[str, Path] = {}
for fpath in file_paths:
rel = str(fpath.relative_to(root)) if root else str(fpath)
current_rel_paths[rel] = fpath
# Get known files from metadata
known_files = meta.get_all_files() # {rel_path: content_hash}
# Detect removed files
removed = set(known_files.keys()) - set(current_rel_paths.keys())
for rel in removed:
self.remove_file(rel)
# Index new and changed files
total_files = 0
total_chunks = 0
for rel, fpath in current_rel_paths.items():
stats = self.index_file(
fpath,
root=root,
max_chunk_chars=max_chunk_chars,
chunk_overlap=chunk_overlap,
max_file_size=max_file_size,
)
total_files += stats.files_processed
total_chunks += stats.chunks_created
duration = time.monotonic() - t0
result = IndexStats(
files_processed=total_files,
chunks_created=total_chunks,
duration_seconds=round(duration, 2),
)
logger.info(
"Sync complete: %d files indexed, %d chunks created, "
"%d files removed in %.1fs",
result.files_processed, result.chunks_created,
len(removed), result.duration_seconds,
)
return result
def compact(self) -> None:
"""Rebuild indexes excluding tombstoned chunk IDs.
Reads all deleted IDs from MetadataStore, rebuilds BinaryStore
and ANNIndex without those entries, then clears the
deleted_chunks table.
"""
meta = self._require_metadata()
deleted_ids = meta.compact_deleted()
if not deleted_ids:
logger.debug("Compact: no deleted IDs, nothing to do")
return
logger.info("Compact: rebuilding indexes, excluding %d deleted IDs", len(deleted_ids))
# Rebuild BinaryStore: read current data, filter, replace
if self._binary_store._count > 0:
active_ids = self._binary_store._ids[: self._binary_store._count]
active_matrix = self._binary_store._matrix[: self._binary_store._count]
mask = ~np.isin(active_ids, list(deleted_ids))
kept_ids = active_ids[mask]
kept_matrix = active_matrix[mask]
# Reset store
self._binary_store._count = 0
self._binary_store._matrix = None
self._binary_store._ids = None
if len(kept_ids) > 0:
self._binary_store._ensure_capacity(len(kept_ids))
self._binary_store._matrix[: len(kept_ids)] = kept_matrix
self._binary_store._ids[: len(kept_ids)] = kept_ids
self._binary_store._count = len(kept_ids)
self._binary_store.save()
# ANNIndex: an hnswlib graph cannot be rebuilt without the original
# float32 vectors, and BinaryStore retains only quantized codes, so the
# HNSW index is left untouched here. Stale entries therefore remain in
# the ANN index after compact; run a full re-index (index_files over all
# files) when a clean ANN state is required.
logger.info(
"Compact: BinaryStore rebuilt (%d entries kept). "
"Note: ANNIndex retains stale entries; run full re-index for clean ANN state.",
self._binary_store._count,
)
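The three-way reconciliation that `sync()` performs above (removed vs. new vs. changed files) can be sketched with plain dicts of `{relative_path: content_hash}`. This is an illustrative model, not the pipeline code; the real implementation consults MetadataStore and reindexes via `index_file()`.

```python
def reconcile(known: dict[str, str], current: dict[str, str]):
    """Classify files as removed, new, or changed by comparing hashes."""
    removed = set(known) - set(current)
    new = set(current) - set(known)
    changed = {p for p in set(current) & set(known) if current[p] != known[p]}
    return removed, new, changed

# b.py disappeared, c.py is new, a.py is unchanged
removed, new, changed = reconcile(
    {"a.py": "h1", "b.py": "h2"},
    {"a.py": "h1", "c.py": "h3"},
)
```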


@@ -67,3 +67,28 @@ class FTSEngine:
"SELECT content FROM docs WHERE rowid = ?", (doc_id,)
).fetchone()
return row[0] if row else ""
def get_chunk_ids_by_path(self, path: str) -> list[int]:
"""Return all doc IDs associated with a given file path."""
rows = self._conn.execute(
"SELECT id FROM docs_meta WHERE path = ?", (path,)
).fetchall()
return [r[0] for r in rows]
def delete_by_path(self, path: str) -> int:
"""Delete all docs and docs_meta rows for a given file path.
Returns the number of deleted documents.
"""
ids = self.get_chunk_ids_by_path(path)
if not ids:
return 0
placeholders = ",".join("?" for _ in ids)
self._conn.execute(
f"DELETE FROM docs WHERE rowid IN ({placeholders})", ids
)
self._conn.execute(
f"DELETE FROM docs_meta WHERE id IN ({placeholders})", ids
)
self._conn.commit()
return len(ids)
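The `delete_by_path` method above builds an `IN (...)` clause from dynamically generated `?` placeholders so the id list is passed as bound parameters rather than interpolated into SQL. A minimal standalone sketch of the same pattern, using a simplified schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs_meta (id INTEGER PRIMARY KEY, path TEXT)")
conn.executemany(
    "INSERT INTO docs_meta VALUES (?, ?)",
    [(1, "a.py"), (2, "a.py"), (3, "b.py")],
)

# Fetch ids for one path, then delete them with bound parameters
ids = [r[0] for r in conn.execute(
    "SELECT id FROM docs_meta WHERE path = ?", ("a.py",)).fetchall()]
placeholders = ",".join("?" for _ in ids)  # "?,?" for two ids
conn.execute(f"DELETE FROM docs_meta WHERE id IN ({placeholders})", ids)
remaining = conn.execute("SELECT COUNT(*) FROM docs_meta").fetchone()[0]
```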


@@ -9,6 +9,7 @@ import numpy as np
from ..config import Config
from ..core import ANNIndex, BinaryStore
from ..embed import BaseEmbedder
from ..indexing.metadata import MetadataStore
from ..rerank import BaseReranker
from .fts import FTSEngine
from .fusion import (
@@ -38,6 +39,7 @@ class SearchPipeline:
reranker: BaseReranker,
fts: FTSEngine,
config: Config,
metadata_store: MetadataStore | None = None,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
@@ -45,6 +47,7 @@ class SearchPipeline:
self._reranker = reranker
self._fts = fts
self._config = config
self._metadata_store = metadata_store
# -- Helper: vector search (binary coarse + ANN fine) -----------------
@@ -137,6 +140,16 @@ class SearchPipeline:
fused = reciprocal_rank_fusion(fusion_input, weights=weights, k=cfg.fusion_k)
# 4b. Filter out deleted IDs (tombstone filtering)
if self._metadata_store is not None:
deleted_ids = self._metadata_store.get_deleted_ids()
if deleted_ids:
fused = [
(doc_id, score)
for doc_id, score in fused
if doc_id not in deleted_ids
]
# 5. Rerank top candidates
rerank_ids = [doc_id for doc_id, _ in fused[:50]]
contents = [self._fts.get_content(doc_id) for doc_id in rerank_ids]
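Steps 4 and 4b above (fusion, then tombstone filtering) can be sketched as follows. The scoring function here is a generic weighted reciprocal rank fusion; the actual formula and constants live in `fusion.reciprocal_rank_fusion` and may differ.

```python
def rrf(rankings: list[list[int]], weights: list[float], k: int = 60):
    """Weighted RRF: each list contributes w / (k + rank + 1) per doc."""
    scores: dict[int, float] = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank + 1)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Fuse a vector ranking with an FTS ranking, then drop tombstoned ids
fused = rrf([[1, 2, 3], [3, 1, 4]], weights=[1.0, 1.0])
deleted = {3}
fused = [(doc_id, score) for doc_id, score in fused if doc_id not in deleted]
```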


@@ -0,0 +1,17 @@
"""File watcher and incremental indexer for codexlens-search.
Requires the ``watcher`` extra::
pip install codexlens-search[watcher]
"""
from codexlens_search.watcher.events import ChangeType, FileEvent, WatcherConfig
from codexlens_search.watcher.file_watcher import FileWatcher
from codexlens_search.watcher.incremental_indexer import IncrementalIndexer
__all__ = [
"ChangeType",
"FileEvent",
"FileWatcher",
"IncrementalIndexer",
"WatcherConfig",
]


@@ -0,0 +1,57 @@
"""Event types for file watcher."""
from __future__ import annotations
import time
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Optional, Set
class ChangeType(Enum):
"""Type of file system change."""
CREATED = "created"
MODIFIED = "modified"
DELETED = "deleted"
@dataclass
class FileEvent:
"""A file system change event."""
path: Path
change_type: ChangeType
timestamp: float = field(default_factory=time.time)
@dataclass
class WatcherConfig:
"""Configuration for file watcher.
Attributes:
debounce_ms: Milliseconds to wait after the last event before
flushing the batch. Default 500ms for low-latency indexing.
ignored_patterns: Directory/file name patterns to skip. Any
path component matching one of these strings is ignored.
"""
debounce_ms: int = 500
ignored_patterns: Set[str] = field(default_factory=lambda: {
# Version control
".git", ".svn", ".hg",
# Python
".venv", "venv", "env", "__pycache__", ".pytest_cache",
".mypy_cache", ".ruff_cache",
# Node.js
"node_modules", "bower_components",
# Build artifacts
"dist", "build", "out", "target", "bin", "obj",
"coverage", "htmlcov",
# IDE / Editor
".idea", ".vscode", ".vs",
# Package / cache
".cache", ".parcel-cache", ".turbo", ".next", ".nuxt",
# Logs / temp
"logs", "tmp", "temp",
})
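As the docstring notes, `ignored_patterns` is matched against individual path components, not globs: a path is skipped if any component equals an ignored name. A minimal sketch of that check:

```python
from pathlib import Path

IGNORED = {".git", "node_modules", "__pycache__"}  # abbreviated set

def should_watch(path: Path) -> bool:
    """True unless some path component matches an ignored pattern exactly."""
    return not any(part in IGNORED for part in path.parts)

should_watch(Path("src/main.py"))              # watched
should_watch(Path("node_modules/pkg/x.js"))    # ignored
```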


@@ -0,0 +1,263 @@
"""File system watcher using watchdog library.
Ported from codex-lens v1 with simplifications:
- Removed v1-specific Config dependency (uses WatcherConfig directly)
- Removed MAX_QUEUE_SIZE (v2 processes immediately via debounce)
- Removed flush.signal file mechanism
- Added optional JSONL output mode for bridge CLI integration
"""
from __future__ import annotations
import json
import logging
import sys
import threading
import time
from pathlib import Path
from typing import Callable, Dict, List, Optional
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
from .events import ChangeType, FileEvent, WatcherConfig
logger = logging.getLogger(__name__)
# Event priority for deduplication: higher wins when same file appears
# multiple times within one debounce window.
_EVENT_PRIORITY: Dict[ChangeType, int] = {
ChangeType.CREATED: 1,
ChangeType.MODIFIED: 2,
ChangeType.DELETED: 3,
}
class _Handler(FileSystemEventHandler):
"""Internal watchdog handler that converts events to FileEvent."""
def __init__(self, watcher: FileWatcher) -> None:
super().__init__()
self._watcher = watcher
def on_created(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.CREATED)
def on_modified(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.MODIFIED)
def on_deleted(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.DELETED)
def on_moved(self, event) -> None:
if event.is_directory:
return
# Treat move as delete old + create new
self._watcher._on_raw_event(event.src_path, ChangeType.DELETED)
self._watcher._on_raw_event(event.dest_path, ChangeType.CREATED)
class FileWatcher:
"""File system watcher with debounce and event deduplication.
Monitors a directory recursively using watchdog. Raw events are
collected into a queue. After *debounce_ms* of silence the queue
is flushed: events are deduplicated per-path (keeping the highest
priority change type) and delivered via *on_changes*.
Example::
def handle(events: list[FileEvent]) -> None:
for e in events:
print(e.change_type.value, e.path)
watcher = FileWatcher(Path("."), WatcherConfig(), handle)
watcher.start()
watcher.wait()
"""
def __init__(
self,
root_path: Path,
config: WatcherConfig,
on_changes: Callable[[List[FileEvent]], None],
) -> None:
self.root_path = Path(root_path).resolve()
self.config = config
self.on_changes = on_changes
self._observer: Optional[Observer] = None
self._running = False
self._stop_event = threading.Event()
self._lock = threading.RLock()
# Pending events keyed by resolved path
self._pending: Dict[Path, FileEvent] = {}
self._pending_lock = threading.Lock()
# True-debounce timer: resets on every new event
self._flush_timer: Optional[threading.Timer] = None
# ------------------------------------------------------------------
# Filtering
# ------------------------------------------------------------------
def _should_watch(self, path: Path) -> bool:
"""Return True if *path* should not be ignored."""
parts = path.parts
for pattern in self.config.ignored_patterns:
if pattern in parts:
return False
return True
# ------------------------------------------------------------------
# Event intake (called from watchdog thread)
# ------------------------------------------------------------------
def _on_raw_event(self, raw_path: str, change_type: ChangeType) -> None:
"""Accept a raw watchdog event, filter, and queue with debounce."""
path = Path(raw_path).resolve()
if not self._should_watch(path):
return
event = FileEvent(path=path, change_type=change_type)
with self._pending_lock:
existing = self._pending.get(path)
if existing is None or _EVENT_PRIORITY[change_type] >= _EVENT_PRIORITY[existing.change_type]:
self._pending[path] = event
# Cancel previous timer and start a new one (true debounce)
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = threading.Timer(
self.config.debounce_ms / 1000.0,
self._flush,
)
self._flush_timer.daemon = True
self._flush_timer.start()
# ------------------------------------------------------------------
# Flush
# ------------------------------------------------------------------
def _flush(self) -> None:
"""Deduplicate and deliver pending events."""
with self._pending_lock:
if not self._pending:
return
events = list(self._pending.values())
self._pending.clear()
self._flush_timer = None
try:
self.on_changes(events)
except Exception:
logger.exception("Error in on_changes callback")
def flush_now(self) -> None:
"""Immediately flush pending events (manual trigger)."""
with self._pending_lock:
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = None
self._flush()
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
def start(self) -> None:
"""Start watching the directory (non-blocking)."""
with self._lock:
if self._running:
logger.warning("Watcher already running")
return
if not self.root_path.exists():
raise ValueError(f"Root path does not exist: {self.root_path}")
self._observer = Observer()
handler = _Handler(self)
self._observer.schedule(handler, str(self.root_path), recursive=True)
self._running = True
self._stop_event.clear()
self._observer.start()
logger.info("Started watching: %s", self.root_path)
def stop(self) -> None:
"""Stop watching and flush remaining events."""
with self._lock:
if not self._running:
return
self._running = False
self._stop_event.set()
with self._pending_lock:
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = None
if self._observer is not None:
self._observer.stop()
self._observer.join(timeout=5.0)
self._observer = None
# Deliver any remaining events
self._flush()
logger.info("Stopped watching: %s", self.root_path)
def wait(self) -> None:
"""Block until stopped (Ctrl+C or stop() from another thread)."""
try:
while self._running:
self._stop_event.wait(timeout=1.0)
except KeyboardInterrupt:
logger.info("Received interrupt, stopping watcher...")
self.stop()
@property
def is_running(self) -> bool:
"""True if the watcher is currently running."""
return self._running
# ------------------------------------------------------------------
# JSONL output helper
# ------------------------------------------------------------------
@staticmethod
def events_to_jsonl(events: List[FileEvent]) -> str:
"""Serialize a batch of events as newline-delimited JSON.
Each line is a JSON object with keys: ``path``, ``change_type``,
``timestamp``. Useful for bridge CLI integration.
"""
lines: list[str] = []
for evt in events:
obj = {
"path": str(evt.path),
"change_type": evt.change_type.value,
"timestamp": evt.timestamp,
}
lines.append(json.dumps(obj, ensure_ascii=False))
return "\n".join(lines)
@staticmethod
def jsonl_callback(events: List[FileEvent]) -> None:
"""Callback that writes JSONL to stdout.
Suitable as *on_changes* when running in bridge/CLI mode::
watcher = FileWatcher(root, config, FileWatcher.jsonl_callback)
"""
output = FileWatcher.events_to_jsonl(events)
if output:
sys.stdout.write(output + "\n")
sys.stdout.flush()
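The per-path deduplication applied during a debounce window can be modeled without threads: a later event replaces an earlier one for the same path only when its priority is greater than or equal (so DELETED beats MODIFIED beats CREATED). A synchronous sketch of that rule, with string change types standing in for `ChangeType`:

```python
PRIORITY = {"created": 1, "modified": 2, "deleted": 3}

def dedup(events: list[tuple[str, str]]) -> dict[str, str]:
    """Keep one event per path, preferring the higher-priority change."""
    pending: dict[str, str] = {}
    for path, change in events:
        prev = pending.get(path)
        if prev is None or PRIORITY[change] >= PRIORITY[prev]:
            pending[path] = change
    return pending

batch = dedup([("a.py", "created"), ("a.py", "modified"), ("b.py", "deleted")])
```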


@@ -0,0 +1,129 @@
"""Incremental indexer that processes FileEvents via IndexingPipeline.
Ported from codex-lens v1 with simplifications:
- Uses IndexingPipeline.index_file() / remove_file() directly
- No v1-specific Config, ParserFactory, DirIndexStore dependencies
- Per-file error isolation: one failure does not stop batch processing
"""
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional
from codexlens_search.indexing.pipeline import IndexingPipeline
from .events import ChangeType, FileEvent
logger = logging.getLogger(__name__)
@dataclass
class BatchResult:
"""Result of processing a batch of file events."""
files_indexed: int = 0
files_removed: int = 0
chunks_created: int = 0
errors: List[str] = field(default_factory=list)
@property
def total_processed(self) -> int:
return self.files_indexed + self.files_removed
@property
def has_errors(self) -> bool:
return len(self.errors) > 0
class IncrementalIndexer:
"""Routes file change events to IndexingPipeline operations.
CREATED / MODIFIED events call ``pipeline.index_file()``.
DELETED events call ``pipeline.remove_file()``.
Each file is processed in isolation so that a single failure
does not prevent the rest of the batch from being indexed.
Example::
indexer = IncrementalIndexer(pipeline, root=Path("/project"))
result = indexer.process_events([
FileEvent(Path("src/main.py"), ChangeType.MODIFIED),
])
print(f"Indexed {result.files_indexed}, removed {result.files_removed}")
"""
def __init__(
self,
pipeline: IndexingPipeline,
*,
root: Optional[Path] = None,
) -> None:
"""Initialize the incremental indexer.
Args:
pipeline: The indexing pipeline with metadata store configured.
root: Optional project root for computing relative paths.
If None, absolute paths are used as identifiers.
"""
self._pipeline = pipeline
self._root = root
def process_events(self, events: List[FileEvent]) -> BatchResult:
"""Process a batch of file events with per-file error isolation.
Args:
events: List of file events to process.
Returns:
BatchResult with per-batch statistics.
"""
result = BatchResult()
for event in events:
try:
if event.change_type in (ChangeType.CREATED, ChangeType.MODIFIED):
self._handle_index(event, result)
elif event.change_type == ChangeType.DELETED:
self._handle_remove(event, result)
except Exception as exc:
error_msg = (
f"Error processing {event.path} "
f"({event.change_type.value}): "
f"{type(exc).__name__}: {exc}"
)
logger.error(error_msg)
result.errors.append(error_msg)
if result.total_processed > 0:
logger.info(
"Batch complete: %d indexed, %d removed, %d errors",
result.files_indexed,
result.files_removed,
len(result.errors),
)
return result
def _handle_index(self, event: FileEvent, result: BatchResult) -> None:
"""Index a created or modified file."""
stats = self._pipeline.index_file(
event.path,
root=self._root,
force=(event.change_type == ChangeType.MODIFIED),
)
if stats.files_processed > 0:
result.files_indexed += 1
result.chunks_created += stats.chunks_created
def _handle_remove(self, event: FileEvent, result: BatchResult) -> None:
"""Remove a deleted file from the index."""
rel_path = (
str(event.path.relative_to(self._root))
if self._root
else str(event.path)
)
self._pipeline.remove_file(rel_path)
result.files_removed += 1
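The per-file error isolation described above reduces to wrapping each event in its own try/except so one failure cannot abort the batch. A stripped-down sketch with a hypothetical `handler`:

```python
def process(events, handler):
    """Process each event in isolation; collect errors instead of raising."""
    ok, errors = 0, []
    for ev in events:
        try:
            handler(ev)
            ok += 1
        except Exception as exc:
            errors.append(f"{ev}: {type(exc).__name__}: {exc}")
    return ok, errors

def handler(ev):
    if ev == "bad":
        raise ValueError("boom")

ok, errors = process(["good", "bad", "also_good"], handler)
```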


@@ -0,0 +1,388 @@
"""Unit tests for IndexingPipeline incremental API (index_file, remove_file, sync, compact)."""
from __future__ import annotations
from pathlib import Path
import numpy as np
import pytest
from codexlens_search.config import Config
from codexlens_search.core.binary import BinaryStore
from codexlens_search.core.index import ANNIndex
from codexlens_search.embed.base import BaseEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.indexing.pipeline import IndexingPipeline, IndexStats
from codexlens_search.search.fts import FTSEngine
DIM = 32
class FakeEmbedder(BaseEmbedder):
"""Deterministic embedder for testing."""
def __init__(self) -> None:
pass
def embed_single(self, text: str) -> np.ndarray:
rng = np.random.default_rng(hash(text) % (2**31))
return rng.standard_normal(DIM).astype(np.float32)
def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
return [self.embed_single(t) for t in texts]
@pytest.fixture
def workspace(tmp_path: Path):
"""Create workspace with stores, metadata, and pipeline."""
cfg = Config.small()
# Override embed_dim to match our test dim
cfg.embed_dim = DIM
store_dir = tmp_path / "stores"
store_dir.mkdir()
binary_store = BinaryStore(store_dir, DIM, cfg)
ann_index = ANNIndex(store_dir, DIM, cfg)
fts = FTSEngine(str(store_dir / "fts.db"))
metadata = MetadataStore(str(store_dir / "metadata.db"))
embedder = FakeEmbedder()
pipeline = IndexingPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
fts=fts,
config=cfg,
metadata=metadata,
)
# Create sample source files
src_dir = tmp_path / "src"
src_dir.mkdir()
return {
"pipeline": pipeline,
"metadata": metadata,
"binary_store": binary_store,
"ann_index": ann_index,
"fts": fts,
"src_dir": src_dir,
"store_dir": store_dir,
"config": cfg,
}
def _write_file(src_dir: Path, name: str, content: str) -> Path:
"""Write a file and return its path."""
p = src_dir / name
p.write_text(content, encoding="utf-8")
return p
# ---------------------------------------------------------------------------
# MetadataStore helper method tests
# ---------------------------------------------------------------------------
class TestMetadataHelpers:
def test_get_all_files_empty(self, workspace):
meta = workspace["metadata"]
assert meta.get_all_files() == {}
def test_get_all_files_after_register(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_file("b.py", "hash_b", 2000.0)
result = meta.get_all_files()
assert result == {"a.py": "hash_a", "b.py": "hash_b"}
def test_max_chunk_id_empty(self, workspace):
meta = workspace["metadata"]
assert meta.max_chunk_id() == -1
def test_max_chunk_id_with_chunks(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_chunks("a.py", [(0, "h0"), (1, "h1"), (5, "h5")])
assert meta.max_chunk_id() == 5
def test_max_chunk_id_includes_deleted(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_chunks("a.py", [(0, "h0"), (3, "h3")])
meta.mark_file_deleted("a.py")
# Chunks moved to deleted_chunks, max should still be 3
assert meta.max_chunk_id() == 3
# ---------------------------------------------------------------------------
# index_file tests
# ---------------------------------------------------------------------------
class TestIndexFile:
def test_index_file_basic(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "hello.py", "print('hello world')\n")
stats = pipeline.index_file(f, root=src_dir)
assert stats.files_processed == 1
assert stats.chunks_created >= 1
assert meta.get_file_hash("hello.py") is not None
assert len(meta.get_chunk_ids_for_file("hello.py")) >= 1
def test_index_file_skips_unchanged(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "same.py", "x = 1\n")
stats1 = pipeline.index_file(f, root=src_dir)
assert stats1.files_processed == 1
stats2 = pipeline.index_file(f, root=src_dir)
assert stats2.files_processed == 0
assert stats2.chunks_created == 0
def test_index_file_force_reindex(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "force.py", "x = 1\n")
pipeline.index_file(f, root=src_dir)
stats = pipeline.index_file(f, root=src_dir, force=True)
assert stats.files_processed == 1
assert stats.chunks_created >= 1
def test_index_file_updates_changed_file(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "changing.py", "version = 1\n")
pipeline.index_file(f, root=src_dir)
old_chunks = meta.get_chunk_ids_for_file("changing.py")
# Modify file
f.write_text("version = 2\nmore code\n", encoding="utf-8")
stats = pipeline.index_file(f, root=src_dir)
assert stats.files_processed == 1
new_chunks = meta.get_chunk_ids_for_file("changing.py")
# Old chunks should have been tombstoned, new ones assigned
assert set(old_chunks) != set(new_chunks)
def test_index_file_registers_in_metadata(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
fts = workspace["fts"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "meta_test.py", "def foo(): pass\n")
pipeline.index_file(f, root=src_dir)
# MetadataStore has file registered
assert meta.get_file_hash("meta_test.py") is not None
chunk_ids = meta.get_chunk_ids_for_file("meta_test.py")
assert len(chunk_ids) >= 1
# FTS has the content
fts_ids = fts.get_chunk_ids_by_path("meta_test.py")
assert len(fts_ids) >= 1
def test_index_file_no_metadata_raises(self, workspace):
cfg = workspace["config"]
pipeline_no_meta = IndexingPipeline(
embedder=FakeEmbedder(),
binary_store=workspace["binary_store"],
ann_index=workspace["ann_index"],
fts=workspace["fts"],
config=cfg,
)
f = _write_file(workspace["src_dir"], "no_meta.py", "x = 1\n")
with pytest.raises(RuntimeError, match="MetadataStore is required"):
pipeline_no_meta.index_file(f)
# ---------------------------------------------------------------------------
# remove_file tests
# ---------------------------------------------------------------------------
class TestRemoveFile:
def test_remove_file_tombstones_and_fts(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
fts = workspace["fts"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "to_remove.py", "data = [1, 2, 3]\n")
pipeline.index_file(f, root=src_dir)
chunk_ids = meta.get_chunk_ids_for_file("to_remove.py")
assert len(chunk_ids) >= 1
pipeline.remove_file("to_remove.py")
# File should be gone from metadata
assert meta.get_file_hash("to_remove.py") is None
assert meta.get_chunk_ids_for_file("to_remove.py") == []
# Chunks should be in deleted_chunks
deleted = meta.get_deleted_ids()
for cid in chunk_ids:
assert cid in deleted
# FTS should be cleared
assert fts.get_chunk_ids_by_path("to_remove.py") == []
def test_remove_nonexistent_file(self, workspace):
pipeline = workspace["pipeline"]
# Should not raise
pipeline.remove_file("nonexistent.py")
# ---------------------------------------------------------------------------
# sync tests
# ---------------------------------------------------------------------------
class TestSync:
def test_sync_indexes_new_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "a.py", "a = 1\n")
f2 = _write_file(src_dir, "b.py", "b = 2\n")
stats = pipeline.sync([f1, f2], root=src_dir)
assert stats.files_processed == 2
assert meta.get_file_hash("a.py") is not None
assert meta.get_file_hash("b.py") is not None
def test_sync_removes_missing_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "keep.py", "keep = True\n")
f2 = _write_file(src_dir, "remove.py", "remove = True\n")
pipeline.sync([f1, f2], root=src_dir)
assert meta.get_file_hash("remove.py") is not None
# Sync with only f1 -- f2 should be removed
stats = pipeline.sync([f1], root=src_dir)
assert meta.get_file_hash("remove.py") is None
deleted = meta.get_deleted_ids()
assert len(deleted) > 0
def test_sync_detects_changed_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "mutable.py", "v1\n")
pipeline.sync([f], root=src_dir)
old_hash = meta.get_file_hash("mutable.py")
f.write_text("v2\n", encoding="utf-8")
stats = pipeline.sync([f], root=src_dir)
assert stats.files_processed == 1
new_hash = meta.get_file_hash("mutable.py")
assert old_hash != new_hash
def test_sync_skips_unchanged(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "stable.py", "stable = True\n")
pipeline.sync([f], root=src_dir)
# Second sync with same file, unchanged
stats = pipeline.sync([f], root=src_dir)
assert stats.files_processed == 0
assert stats.chunks_created == 0
# ---------------------------------------------------------------------------
# compact tests
# ---------------------------------------------------------------------------
class TestCompact:
def test_compact_removes_tombstoned_from_binary_store(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
binary_store = workspace["binary_store"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "alive.py", "alive = True\n")
f2 = _write_file(src_dir, "dead.py", "dead = True\n")
pipeline.index_file(f1, root=src_dir)
pipeline.index_file(f2, root=src_dir)
count_before = binary_store._count
assert count_before >= 2
pipeline.remove_file("dead.py")
pipeline.compact()
# BinaryStore should have fewer entries
assert binary_store._count < count_before
# deleted_chunks should be cleared
assert meta.get_deleted_ids() == set()
def test_compact_noop_when_no_deletions(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
binary_store = workspace["binary_store"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "solo.py", "solo = True\n")
pipeline.index_file(f, root=src_dir)
count_before = binary_store._count
pipeline.compact()
assert binary_store._count == count_before
# ---------------------------------------------------------------------------
# Backward compatibility: existing batch API still works
# ---------------------------------------------------------------------------
class TestBatchAPIUnchanged:
def test_index_files_still_works(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "batch1.py", "batch1 = 1\n")
f2 = _write_file(src_dir, "batch2.py", "batch2 = 2\n")
stats = pipeline.index_files([f1, f2], root=src_dir)
assert stats.files_processed == 2
assert stats.chunks_created >= 2
def test_index_files_works_without_metadata(self, workspace):
"""Batch API should work even without MetadataStore."""
cfg = workspace["config"]
pipeline_no_meta = IndexingPipeline(
embedder=FakeEmbedder(),
binary_store=BinaryStore(workspace["store_dir"] / "no_meta", DIM, cfg),
ann_index=ANNIndex(workspace["store_dir"] / "no_meta", DIM, cfg),
fts=FTSEngine(str(workspace["store_dir"] / "no_meta_fts.db")),
config=cfg,
)
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "no_meta_batch.py", "x = 1\n")
stats = pipeline_no_meta.index_files([f], root=src_dir)
assert stats.files_processed == 1