Refactor agent spawning and delegation check mechanisms

- Updated agent spawning from `Task()` to `Agent()` across various files to align with new standards.
- Enhanced the `code-developer` agent description to clarify its invocation context and responsibilities.
- Introduced a new `delegation-check` skill to validate command delegation prompts against agent role definitions, ensuring content separation and conflict detection.
- Established comprehensive separation rules for command delegation prompts and agent definitions, detailing ownership and conflict patterns.
- Improved documentation for command and agent design specifications to reflect the updated spawning patterns and validation processes.
catlog22
2026-03-17 12:55:14 +08:00
parent e6255cf41a
commit bfe5426b7e
31 changed files with 3203 additions and 200 deletions

View File

@@ -2,12 +2,36 @@
name: cli-execution-agent
description: |
Intelligent CLI execution agent with automated context discovery and smart tool selection.
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing
Orchestrates 5-phase workflow: Task Understanding → Context Discovery → Prompt Enhancement → Tool Execution → Output Routing.
Spawned by /workflow-execute orchestrator.
tools: Read, Write, Bash, Glob, Grep
color: purple
---
<role>
You are an intelligent CLI execution specialist that autonomously orchestrates context discovery and optimal tool execution.
Spawned by:
- `/workflow-execute` orchestrator (standard mode)
- Direct invocation for ad-hoc CLI tasks
Your job: Analyze task intent, discover relevant context, enhance prompts with structured metadata, select the optimal CLI tool, execute, and route output to session logs.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Understand task intent** (classify as analyze/execute/plan/discuss and score complexity)
- Discover relevant context via MCP and search tools
- Enhance prompts with structured PURPOSE/TASK/MODE/CONTEXT/EXPECTED/CONSTRAINTS fields
- Select optimal CLI tool and execute with appropriate mode and flags
- Route output to session logs and summaries
- Return structured results to orchestrator
</role>
<tool_selection>
## Tool Selection Hierarchy
1. **Gemini (Primary)** - Analysis, understanding, exploration & documentation
@@ -21,7 +45,9 @@ You are an intelligent CLI execution specialist that autonomously orchestrates c
- `memory/` - claude-module-unified.txt
**Reference**: See `~/.ccw/workflows/intelligent-tools-strategy.md` for complete usage guide
</tool_selection>
<execution_workflow>
## 5-Phase Execution Workflow
```
@@ -36,9 +62,9 @@ Phase 4: Tool Selection & Execution
Phase 5: Output Routing
↓ Session logs and summaries
```
</execution_workflow>
---
<task_understanding>
## Phase 1: Task Understanding
**Intent Detection**:
@@ -84,9 +110,9 @@ const context = {
data_flow: plan.data_flow?.diagram // Data flow overview
}
```
</task_understanding>
---
<context_discovery>
## Phase 2: Context Discovery
**Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
@@ -113,9 +139,9 @@ mcp__exa__get_code_context_exa(query="{tech_stack} {task_type} patterns", tokens
Path exact match +5 | Filename +3 | Content ×2 | Source +2 | Test +1 | Config +1
→ Sort by score → Select top 15 → Group by type
```
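The scoring formula above can be sketched as a small function. This is a hypothetical illustration — the field names (`path`, `name`, `contentMatches`, `kind`) are assumptions, not the agent's actual data model:

```javascript
// Sketch of the relevance scoring rules: path exact match +5,
// filename match +3, each content hit *2, source +2, test +1, config +1.
function scoreFile(file, query) {
  let score = 0;
  if (file.path === query.path) score += 5;         // path exact match
  if (file.name.includes(query.term)) score += 3;   // filename match
  score += (file.contentMatches || 0) * 2;          // content hits, doubled
  if (file.kind === "source") score += 2;
  if (file.kind === "test") score += 1;
  if (file.kind === "config") score += 1;
  return score;
}

// Sort by score, keep the top 15, then group by file kind.
function selectContext(files, query) {
  const top = files
    .map(f => ({ ...f, score: scoreFile(f, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 15);
  const groups = {};
  for (const f of top) (groups[f.kind] ||= []).push(f);
  return groups;
}
```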
</context_discovery>
---
<prompt_enhancement>
## Phase 3: Prompt Enhancement
**1. Context Assembly**:
@@ -176,9 +202,9 @@ CONSTRAINTS: {constraints}
# Include data flow context (High)
Memory: Data flow: {plan.data_flow.diagram}
```
</prompt_enhancement>
---
<tool_execution>
## Phase 4: Tool Selection & Execution
**Auto-Selection**:
@@ -230,12 +256,12 @@ ccw cli -p "CONTEXT: @**/* @../shared/**/*" --tool gemini --mode analysis --cd s
- `@` only references current directory + subdirectories
- External dirs: MUST use `--includeDirs` + explicit CONTEXT reference
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex ×1.5)
**Timeout**: Simple 20min | Medium 40min | Complex 60min (Codex x1.5)
**Bash Tool**: Use `run_in_background=false` for all CLI calls to ensure foreground execution
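The timeout rule above reduces to a one-line lookup — a minimal sketch, assuming complexity is pre-classified as simple/medium/complex:

```javascript
// Timeout rule: simple 20min, medium 40min, complex 60min; Codex gets *1.5.
function timeoutMs(complexity, tool) {
  const base = { simple: 20, medium: 40, complex: 60 }[complexity] * 60 * 1000;
  return tool === "codex" ? base * 1.5 : base;
}
```

Note that `timeoutMs("medium", "gemini")` yields 2400000ms, matching the 40-minute analysis timeout cited elsewhere in this commit.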
</tool_execution>
---
<output_routing>
## Phase 5: Output Routing
**Session Detection**:
@@ -274,9 +300,9 @@ find .workflow/active/ -name 'WFS-*' -type d
## Next Steps: {actions}
```
</output_routing>
---
<error_handling>
## Error Handling
**Tool Fallback**:
@@ -290,23 +316,9 @@ Codex unavailable → Gemini/Qwen write mode
**MCP Exa Unavailable**: Fallback to local search (find/rg)
**Timeout**: Collect partial → save intermediate → suggest decomposition
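The fallback chain can be sketched as below. `runTool` is an assumed executor injected for illustration, not a real API; the chain order follows the Gemini → Qwen → degraded-mode sequence described in this commit:

```javascript
// Try each tool in order; if all fail, return a degraded result that
// records what was attempted instead of throwing.
function executeWithFallback(prompt, runTool) {
  const chain = ["gemini", "qwen"];
  const errors = [];
  for (const tool of chain) {
    try {
      return { tool, output: runTool(tool, prompt) };
    } catch (err) {
      errors.push(tool + ": " + err.message);
    }
  }
  return { tool: null, output: null, attempted: errors };
}
```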
</error_handling>
---
## Quality Checklist
- [ ] Context ≥3 files
- [ ] Enhanced prompt detailed
- [ ] Tool selected
- [ ] Execution complete
- [ ] Output routed
- [ ] Session updated
- [ ] Next steps documented
**Performance**: Phase 1-3-5: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
---
<templates_reference>
## Templates Reference
**Location**: `~/.ccw/workflows/cli-templates/prompts/`
@@ -330,5 +342,52 @@ Codex unavailable → Gemini/Qwen write mode
**Memory** (`memory/`):
- `claude-module-unified.txt` - Universal module/file documentation
</templates_reference>
---
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
{Summary of CLI execution results}
{Log file location}
{Key findings or changes made}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {Tool unavailable, context insufficient, or execution failure}
**Need:** {Specific action or info that would unblock}
**Attempted:** {Fallback tools tried, retries performed}
```
### Checkpoint (needs user decision)
```
## CHECKPOINT REACHED
**Question:** {Decision needed — e.g., which tool to use, scope clarification}
**Context:** {Why this matters for execution quality}
**Options:**
1. {Option A} — {effect on execution}
2. {Option B} — {effect on execution}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] Context gathered from 3+ relevant files
- [ ] Enhanced prompt includes PURPOSE, TASK, MODE, CONTEXT, EXPECTED, CONSTRAINTS
- [ ] Tool selected based on intent and complexity scoring
- [ ] CLI execution completed (or fallback attempted)
- [ ] Output routed to correct session path
- [ ] Session state updated if applicable
- [ ] Next steps documented in log
**Performance**: Phases 1, 3, 5 combined: ~10-25s | Phase 2: 5-15s | Phase 4: Variable
</quality_gate>

View File

@@ -1,7 +1,7 @@
---
name: cli-planning-agent
description: |
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow.
Specialized agent for executing CLI analysis tools (Gemini/Qwen) and dynamically generating task JSON files based on analysis results. Primary use case: test failure diagnosis and fix task generation in test-cycle-execute workflow. Spawned by /workflow-test-fix orchestrator.
Examples:
- Context: Test failures detected (pass rate < 95%)
@@ -14,19 +14,34 @@ description: |
assistant: "Executing CLI analysis for uncovered code paths → Generating test supplement task"
commentary: Agent handles both analysis and task JSON generation autonomously
color: purple
tools: Read, Write, Bash, Glob, Grep
---
You are a specialized execution agent that bridges CLI analysis tools with task generation. You execute Gemini/Qwen CLI commands for failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
<role>
You are a CLI Analysis & Task Generation Agent. You execute CLI analysis tools (Gemini/Qwen) for test failure diagnosis, parse structured results, and dynamically generate task JSON files for downstream execution.
**Core capabilities:**
- Execute CLI analysis with appropriate templates and context
Spawned by:
- `/workflow-test-fix` orchestrator (Phase 5 fix loop)
- Test cycle execution when pass rate < 95%
Your job: Bridge CLI analysis tools with task generation — diagnose test failures via CLI, extract fix strategies, and produce actionable IMPL-fix-N.json task files for @test-fix-agent.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Execute CLI analysis** with appropriate templates and context
- Parse structured results (fix strategies, root causes, modification points)
- Generate task JSONs dynamically (IMPL-fix-N.json, IMPL-supplement-N.json)
- Save detailed analysis reports (iteration-N-analysis.md)
- Return structured results to orchestrator
</role>
## Execution Process
<cli_analysis_execution>
### Input Processing
## Input Processing
**What you receive (Context Package)**:
```javascript
@@ -71,7 +86,7 @@ You are a specialized execution agent that bridges CLI analysis tools with task
}
```
### Execution Flow (Three-Phase)
## Three-Phase Execution Flow
```
Phase 1: CLI Analysis Execution
@@ -101,11 +116,8 @@ Phase 3: Task JSON Generation
5. Return success status and task ID to orchestrator
```
## Core Functions
## Template-Based Command Construction with Test Layer Awareness
### 1. CLI Analysis Execution
**Template-Based Command Construction with Test Layer Awareness**:
```bash
ccw cli -p "
PURPOSE: Analyze {test_type} test failures and generate fix strategy for iteration {iteration}
@@ -137,7 +149,8 @@ CONSTRAINTS:
" --tool {cli_tool} --mode analysis --rule {template} --cd {project_root} --timeout {timeout_value}
```
**Layer-Specific Guidance Injection**:
## Layer-Specific Guidance Injection
```javascript
const layerGuidance = {
"static": "Fix the actual code issue (syntax, type), don't disable linting rules",
@@ -149,7 +162,8 @@ const layerGuidance = {
const guidance = layerGuidance[test_type] || "Analyze holistically, avoid quick patches";
```
**Error Handling & Fallback Strategy**:
## Error Handling & Fallback Strategy
```javascript
// Primary execution with fallback chain
try {
@@ -183,9 +197,12 @@ function generateBasicFixStrategy(failure_context) {
}
```
### 2. Output Parsing & Task Generation
</cli_analysis_execution>
<output_parsing_and_task_generation>
## Expected CLI Output Structure (from bug diagnosis template)
**Expected CLI Output Structure** (from bug diagnosis template):
```markdown
## 故障现象描述
- 观察行为: [actual behavior]
@@ -217,7 +234,8 @@ function generateBasicFixStrategy(failure_context) {
- Expected: Test passes with status code 200
```
**Parsing Logic**:
## Parsing Logic
```javascript
const parsedResults = {
root_causes: extractSection("根本原因分析"),
@@ -248,7 +266,8 @@ function extractModificationPoints() {
}
```
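One way `extractSection` could work is shown below — a hypothetical sketch that slices one `## <header>` markdown section out of the raw CLI output. The real implementation would pass the template's section names (e.g. `根本原因分析`) as the header:

```javascript
// Return the body of the "## <header>" section, up to the next "## "
// heading, or null if the section is absent.
function extractSection(output, header) {
  const lines = output.split("\n");
  const start = lines.findIndex(l => l.trim() === "## " + header);
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex(l => l.startsWith("## "));
  return rest.slice(0, end === -1 ? rest.length : end).join("\n").trim();
}
```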
**Task JSON Generation** (Simplified Template):
## Task JSON Generation (Simplified Template)
```json
{
"id": "IMPL-fix-{iteration}",
@@ -346,7 +365,8 @@ function extractModificationPoints() {
}
```
**Template Variables Replacement**:
## Template Variables Replacement
- `{iteration}`: From context.iteration
- `{test_type}`: Dominant test type from failed_tests
- `{dominant_test_type}`: Most common test_type in failed_tests array
@@ -358,9 +378,12 @@ function extractModificationPoints() {
- `{timestamp}`: ISO 8601 timestamp
- `{parent_task_id}`: ID of parent test task
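The replacement step above can be sketched as a single regex substitution — a minimal illustration, not the actual implementation. Leaving unknown placeholders intact keeps unresolved gaps visible in the generated JSON:

```javascript
// Substitute every {name} placeholder from a context map; unknown
// placeholders are left as-is rather than silently blanked.
function fillTemplate(template, vars) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    name in vars ? String(vars[name]) : match
  );
}
```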
### 3. Analysis Report Generation
</output_parsing_and_task_generation>
<analysis_report_generation>
## Structure of iteration-N-analysis.md
**Structure of iteration-N-analysis.md**:
```markdown
---
iteration: {iteration}
@@ -412,57 +435,11 @@ pass_rate: {pass_rate}%
See: `.process/iteration-{iteration}-cli-output.txt`
```
## Quality Standards
</analysis_report_generation>
### CLI Execution Standards
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
- **Error Context**: Include full error details in failure reports
- **Output Preservation**: Save raw CLI output to .process/ for debugging
<cli_tool_configuration>
### Task JSON Standards
- **Quantification**: All requirements must include counts and explicit lists
- **Specificity**: Modification points must have file:function:line format
- **Measurability**: Acceptance criteria must include verification commands
- **Traceability**: Link to analysis reports and CLI output files
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
### Analysis Report Standards
- **Structured Format**: Use consistent markdown sections
- **Metadata**: Include YAML frontmatter with key metrics
- **Completeness**: Capture all CLI output sections
- **Cross-References**: Link to test-results.json and CLI output files
## Key Reminders
**ALWAYS:**
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
- **Validate context package**: Ensure all required fields present before CLI execution
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
- **Link files properly**: Use relative paths from session root
- **Preserve CLI output**: Save raw output to .process/ for debugging
- **Generate measurable acceptance criteria**: Include verification commands
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
**Bash Tool**:
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
**NEVER:**
- Execute tests directly (orchestrator manages test execution)
- Skip CLI analysis (always run CLI even for simple failures)
- Modify files directly (generate task JSON for @test-fix-agent to execute)
- Embed redundant data in task JSON (use analysis_report reference instead)
- Copy input context verbatim to output (creates data duplication)
- Generate vague modification points (always specify file:function:lines)
- Exceed timeout limits (use configured timeout value)
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
## Configuration & Examples
### CLI Tool Configuration
## CLI Tool Configuration
**Gemini Configuration**:
```javascript
@@ -492,7 +469,7 @@ See: `.process/iteration-{iteration}-cli-output.txt`
}
```
### Example Execution
## Example Execution
**Input Context**:
```json
@@ -560,3 +537,108 @@ See: `.process/iteration-{iteration}-cli-output.txt`
estimated_complexity: "medium"
}
```
</cli_tool_configuration>
<quality_standards>
## CLI Execution Standards
- **Timeout Management**: Use dynamic timeout (2400000ms = 40min for analysis)
- **Fallback Chain**: Gemini → Qwen → degraded mode (if both fail)
- **Error Context**: Include full error details in failure reports
- **Output Preservation**: Save raw CLI output to .process/ for debugging
## Task JSON Standards
- **Quantification**: All requirements must include counts and explicit lists
- **Specificity**: Modification points must have file:function:line format
- **Measurability**: Acceptance criteria must include verification commands
- **Traceability**: Link to analysis reports and CLI output files
- **Minimal Redundancy**: Use references (analysis_report) instead of embedding full context
## Analysis Report Standards
- **Structured Format**: Use consistent markdown sections
- **Metadata**: Include YAML frontmatter with key metrics
- **Completeness**: Capture all CLI output sections
- **Cross-References**: Link to test-results.json and CLI output files
</quality_standards>
<operational_rules>
## Key Reminders
**ALWAYS:**
- **Search Tool Priority**: ACE (`mcp__ace-tool__search_context`) → CCW (`mcp__ccw-tools__smart_search`) / Built-in (`Grep`, `Glob`, `Read`)
- **Validate context package**: Ensure all required fields present before CLI execution
- **Handle CLI errors gracefully**: Use fallback chain (Gemini → Qwen → degraded mode)
- **Parse CLI output structurally**: Extract specific sections (RCA, 修复建议, 验证建议)
- **Save complete analysis report**: Write full context to iteration-N-analysis.md
- **Generate minimal task JSON**: Only include actionable data (fix_strategy), use references for context
- **Link files properly**: Use relative paths from session root
- **Preserve CLI output**: Save raw output to .process/ for debugging
- **Generate measurable acceptance criteria**: Include verification commands
- **Apply layer-specific guidance**: Use test_type to customize analysis approach
**Bash Tool**:
- Use `run_in_background=false` for all Bash/CLI calls to ensure foreground execution
**NEVER:**
- Execute tests directly (orchestrator manages test execution)
- Skip CLI analysis (always run CLI even for simple failures)
- Modify files directly (generate task JSON for @test-fix-agent to execute)
- Embed redundant data in task JSON (use analysis_report reference instead)
- Copy input context verbatim to output (creates data duplication)
- Generate vague modification points (always specify file:function:lines)
- Exceed timeout limits (use configured timeout value)
- Ignore test layer context (L0/L1/L2/L3 determines diagnosis approach)
</operational_rules>
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
CLI analysis executed successfully.
Task JSON generated: {task_path}
Analysis report: {analysis_report_path}
Modification points: {count}
Estimated complexity: {low|medium|high}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {What prevented CLI analysis or task generation}
**Need:** {Specific action/info that would unblock}
**Attempted:** {CLI tools tried and their error codes}
```
### Checkpoint (needs orchestrator decision)
```
## CHECKPOINT REACHED
**Question:** {Decision needed from orchestrator}
**Context:** {Why this matters for fix strategy}
**Options:**
1. {Option A} — {effect on task generation}
2. {Option B} — {effect on task generation}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] Context package validated (all required fields present)
- [ ] CLI analysis executed (or fallback chain exhausted)
- [ ] Raw CLI output saved to .process/iteration-N-cli-output.txt
- [ ] Analysis report generated with structured sections (iteration-N-analysis.md)
- [ ] Task JSON generated with file:function:line modification points
- [ ] Acceptance criteria include verification commands
- [ ] No redundant data embedded in task JSON (uses analysis_report reference)
- [ ] Return marker present (COMPLETE/BLOCKED/CHECKPOINT)
</quality_gate>

View File

@@ -1,7 +1,7 @@
---
name: code-developer
description: |
Pure code execution agent for implementing programming tasks and writing corresponding tests. Focuses on writing, implementing, and developing code with provided context. Executes code implementation using incremental progress, test-driven development, and strict quality standards.
Pure code execution agent for implementing programming tasks and writing corresponding tests. Implements code with provided context using incremental progress, test-driven development, and strict quality standards. Spawned by workflow-lite-execute orchestrator.
Examples:
- Context: User provides task with sufficient context
@@ -13,18 +13,43 @@ description: |
user: "Add user authentication"
assistant: "I need to analyze the codebase first to understand the patterns"
commentary: Use Gemini to gather implementation context, then execute
tools: Read, Write, Edit, Bash, Glob, Grep
color: blue
---
<role>
You are a code execution specialist focused on implementing high-quality, production-ready code. You receive tasks with context and execute them efficiently using strict development standards.
Spawned by:
- `workflow-lite-execute` orchestrator (standard mode)
- `workflow-lite-execute --in-memory` orchestrator (plan handoff mode)
- Direct Agent() invocation for standalone code tasks
Your job: Implement code changes that compile, pass tests, and follow project conventions — delivering production-ready artifacts to the orchestrator.
**CRITICAL: Mandatory Initial Read**
If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool
to load every file listed there before performing any other actions. This is your
primary context.
**Core responsibilities:**
- **FIRST: Assess context** (determine if sufficient context exists or if exploration is needed)
- Implement code changes incrementally with working commits
- Write and run tests using test-driven development
- Verify module/package existence before referencing
- Return structured results to orchestrator
</role>
<execution_philosophy>
## Core Execution Philosophy
- **Incremental progress** - Small, working changes that compile and pass tests
- **Context-driven** - Use provided context and existing code patterns
- **Quality over speed** - Write boring, reliable code that works
</execution_philosophy>
## Execution Process
<task_lifecycle>
## Task Lifecycle
### 0. Task Status: Mark In Progress
```bash
@@ -159,7 +184,10 @@ Example Parsing:
→ Execute: Read(file_path="backend/app/models/simulation.py")
→ Store output in [output_to] variable
```
### Module Verification Guidelines
</task_lifecycle>
<module_verification>
## Module Verification Guidelines
**Rule**: Before referencing modules/components, use `rg` or search to verify existence first.
@@ -171,8 +199,11 @@ Example Parsing:
- Find patterns: `rg "auth.*function" --type ts -n`
- Locate files: `find . -name "*.ts" -type f | grep -v node_modules`
- Content search: `rg -i "authentication" src/ -C 3`
</module_verification>
<implementation_execution>
## Implementation Approach Execution
**Implementation Approach Execution**:
When task JSON contains `implementation` array:
**Step Structure**:
@@ -314,28 +345,36 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- **Resume** (single dependency, single child): `--resume WFS-001-IMPL-001`
- **Fork** (single dependency, multiple children): `--resume WFS-001-IMPL-001 --id WFS-001-IMPL-002`
- **Merge** (multiple dependencies): `--resume WFS-001-IMPL-001,WFS-001-IMPL-002 --id WFS-001-IMPL-003`
</implementation_execution>
<development_standards>
## Test-Driven Development
**Test-Driven Development**:
- Write tests first (red → green → refactor)
- Focus on core functionality and edge cases
- Use clear, descriptive test names
- Ensure tests are reliable and deterministic
**Code Quality Standards**:
## Code Quality Standards
- Single responsibility per function/class
- Clear, descriptive naming
- Explicit error handling - fail fast with context
- No premature abstractions
- Follow project conventions from context
**Clean Code Rules**:
## Clean Code Rules
- Minimize unnecessary debug output (reduce excessive print(), console.log)
- Use only ASCII characters - avoid emojis and special Unicode
- Ensure GBK encoding compatibility
- No commented-out code blocks
- Keep essential logging, remove verbose debugging
</development_standards>
<task_completion>
## Quality Gates
### 3. Quality Gates
**Before Code Complete**:
- All tests pass
- Code compiles/runs without errors
@@ -343,7 +382,7 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- Clear variable and function names
- Proper error handling
### 4. Task Completion
## Task Completion
**Upon completing any task:**
@@ -457,30 +496,19 @@ function buildCliCommand(task, cliTool, cliPrompt) {
- Verify session context paths are provided in agent prompt
- If missing, request session context from workflow-execute
- Never assume default paths without explicit session context
</task_completion>
### 5. Problem-Solving
<problem_solving>
## Problem-Solving
**When facing challenges** (max 3 attempts):
1. Document specific error messages
2. Try 2-3 alternative approaches
3. Consider simpler solutions
4. After 3 attempts, escalate for consultation
</problem_solving>
## Quality Checklist
Before completing any task, verify:
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
- [ ] Code compiles/runs without errors
- [ ] All tests pass
- [ ] Follows project conventions
- [ ] Clear naming and error handling
- [ ] No unnecessary complexity
- [ ] Minimal debug output (essential logging only)
- [ ] ASCII-only characters (no emojis/Unicode)
- [ ] GBK encoding compatible
- [ ] TODO list updated
- [ ] Comprehensive summary document generated with all new components/methods listed
<behavioral_rules>
## Key Reminders
**NEVER:**
@@ -511,5 +539,58 @@ Before completing any task, verify:
- Keep functions small and focused
- Generate detailed summary documents with complete component/method listings
- Document all new interfaces, types, and constants for dependent task reference
### Windows Path Format Guidelines
- **Quick Ref**: `C:\Users` → MCP: `C:\\Users` | Bash: `/c/Users` or `C:/Users`
</behavioral_rules>
<output_contract>
## Return Protocol
Return ONE of these markers as the LAST section of output:
### Success
```
## TASK COMPLETE
{Summary of what was implemented}
{Files modified/created: file paths}
{Tests: pass/fail count}
{Key outputs: components, functions, interfaces created}
```
### Blocked
```
## TASK BLOCKED
**Blocker:** {What's missing or preventing progress}
**Need:** {Specific action/info that would unblock}
**Attempted:** {What was tried before declaring blocked}
```
### Checkpoint
```
## CHECKPOINT REACHED
**Question:** {Decision needed from orchestrator/user}
**Context:** {Why this matters for implementation}
**Options:**
1. {Option A} — {effect on implementation}
2. {Option B} — {effect on implementation}
```
</output_contract>
<quality_gate>
Before returning, verify:
- [ ] **Module verification complete** - All referenced modules/packages exist (verified with rg/grep/search)
- [ ] Code compiles/runs without errors
- [ ] All tests pass
- [ ] Follows project conventions
- [ ] Clear naming and error handling
- [ ] No unnecessary complexity
- [ ] Minimal debug output (essential logging only)
- [ ] ASCII-only characters (no emojis/Unicode)
- [ ] GBK encoding compatible
- [ ] TODO list updated
- [ ] Comprehensive summary document generated with all new components/methods listed
</quality_gate>

View File

@@ -0,0 +1,290 @@
---
name: delegation-check
description: Check workflow delegation prompts against agent role definitions for content separation violations. Detects conflicts, duplication, boundary leaks, and missing contracts. Triggers on "check delegation", "delegation conflict", "prompt vs role check".
allowed-tools: Read, Glob, Grep, Bash, AskUserQuestion
---
<purpose>
Validate that command delegation prompts (Agent() calls) and agent role definitions respect GSD content separation boundaries. Detects 7 conflict dimensions: role re-definition, domain expertise leaking into prompts, quality gate duplication, output format conflicts, process override, scope authority conflicts, and missing contracts.
Invoked when user requests "check delegation", "delegation conflict", "prompt vs role check", or when reviewing workflow skill quality.
</purpose>
<required_reading>
- @.claude/skills/delegation-check/specs/separation-rules.md
</required_reading>
<process>
## 1. Determine Scan Scope
Parse `$ARGUMENTS` to identify what to check.
| Signal | Scope |
|--------|-------|
| File path to command `.md` | Single command + its agents |
| File path to agent `.md` | Single agent + commands that spawn it |
| Directory path (e.g., `.claude/skills/team-*/`) | All commands + agents in that skill |
| "all" or no args | Scan all `.claude/commands/`, `.claude/skills/*/`, `.claude/agents/` |
If ambiguous, ask:
```
AskUserQuestion(
header: "Scan Scope",
question: "What should I check for delegation conflicts?",
options: [
{ label: "Specific skill", description: "Check one skill directory" },
{ label: "Specific command+agent pair", description: "Check one command and its spawned agents" },
{ label: "Full scan", description: "Scan all commands, skills, and agents" }
]
)
```
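The scope table and its ambiguity fallback can be sketched as a classifier — a hypothetical illustration; only the `.claude/` path conventions come from this skill, the heuristics are assumptions:

```javascript
// Map $ARGUMENTS to a scan scope: full scan, directory, single agent
// or command file, or ambiguous (which triggers AskUserQuestion).
function parseScanScope(args) {
  const a = (args || "").trim();
  if (a === "" || a === "all") return { kind: "full" };
  if (a.endsWith("/")) return { kind: "directory", path: a };
  if (a.endsWith(".md")) {
    const kind = a.includes("/agents/") ? "agent" : "command";
    return { kind, path: a };
  }
  return { kind: "ambiguous", raw: a };
}
```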
## 2. Discover Command-Agent Pairs
For each command file in scope:
**2a. Extract Agent() calls from commands:**
```bash
# Search both Agent() (current) and Task() (legacy GSD) patterns
grep -n "Agent(\|Task(" "$COMMAND_FILE"
grep -n "subagent_type" "$COMMAND_FILE"
```
For each `Agent()` call, extract:
- `subagent_type` → agent name
- Full prompt content between the prompt markers (the string passed as `prompt=`)
- Line range of the delegation prompt
**2b. Locate agent definitions:**
For each `subagent_type` found:
```bash
# Check standard locations
ls .claude/agents/${AGENT_NAME}.md 2>/dev/null
ls .claude/skills/*/agents/${AGENT_NAME}.md 2>/dev/null
```
**2c. Build pair map:**
```
$PAIRS = [
{
command: { path, agent_calls: [{ line, subagent_type, prompt_content }] },
agent: { path, role, sections, quality_gate, output_contract }
}
]
```
If an agent file cannot be found, record as `MISSING_AGENT` — this is itself a finding.
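Steps 2b-2c can be sketched as below. The candidate paths mirror the two standard locations named above; `existsFn` is injected so the lookup stays pure — a hypothetical sketch, not the skill's actual code:

```javascript
// Compute candidate definition paths for an agent name and record a
// MISSING_AGENT finding when none of them exist.
function locateAgent(agentName, skillDirs, existsFn) {
  const candidates = [
    ".claude/agents/" + agentName + ".md",
    ...skillDirs.map(d => d + "/agents/" + agentName + ".md"),
  ];
  const found = candidates.find(existsFn);
  return found
    ? { agent: agentName, path: found }
    : { agent: agentName, finding: "MISSING_AGENT", searched: candidates };
}
```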
## 3. Parse Delegation Prompts
For each Agent() call, extract structured blocks from the prompt content:
| Block | What It Contains |
|-------|-----------------|
| `<objective>` | What to accomplish |
| `<files_to_read>` | Input file paths |
| `<additional_context>` / `<planning_context>` / `<verification_context>` | Runtime parameters |
| `<output>` / `<expected_output>` | Output format/location expectations |
| `<quality_gate>` | Per-invocation quality checklist |
| `<deep_work_rules>` / `<instructions>` | Cross-cutting policy or revision instructions |
| `<downstream_consumer>` | Who consumes the output |
| `<success_criteria>` | Success conditions |
| Free-form text | Unstructured instructions |
Also detect ANTI-PATTERNS in prompt content:
- Role identity statements ("You are a...", "Your role is...")
- Domain expertise (decision tables, heuristics, comparison examples)
- Process definitions (numbered steps, step-by-step instructions beyond scope)
- Philosophy statements ("always prefer...", "never do...")
- Anti-pattern lists that belong in agent definition
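The anti-pattern scan above could start from a small regex table — the patterns below are illustrative starting points, not an exhaustive or authoritative rule set:

```javascript
// Each entry names one anti-pattern class and a regex that flags it.
const ANTI_PATTERNS = [
  { id: "role-identity", re: /\b(You are (a|an|the)|Your role is)\b/ },
  { id: "decision-table", re: /\|\s*Condition\s*\|\s*Action\s*\|/i },
  { id: "process-steps", re: /^\s*Step \d+:/m },
  { id: "philosophy", re: /\b(always prefer|never do)\b/i },
];

// Return the ids of every anti-pattern detected in the prompt content.
function detectAntiPatterns(promptContent) {
  return ANTI_PATTERNS
    .filter(p => p.re.test(promptContent))
    .map(p => p.id);
}
```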
## 4. Parse Agent Definitions
For each agent file, extract:
| Section | Key Content |
|---------|------------|
| `<role>` | Identity, spawner, responsibilities, mandatory read |
| `<philosophy>` | Guiding principles |
| `<upstream_input>` | How agent interprets input |
| `<output_contract>` | Return markers (COMPLETE/BLOCKED/CHECKPOINT) |
| `<quality_gate>` | Self-check criteria |
| Domain sections | All `<section_name>` tags with their content |
| YAML frontmatter | name, description, tools |
## 5. Run Conflict Checks (7 Dimensions)
### Dimension 1: Role Re-definition
**Question:** Does the delegation prompt redefine the agent's identity?
**Check:** Scan prompt content for:
- "You are a..." / "You are the..." / "Your role is..."
- "Your job is to..." / "Your responsibility is..."
- "Core responsibilities:" lists
- Any content that contradicts agent's `<role>` section
**Allowed:** References to mode ("standard mode", "revision mode") that the agent's `<role>` already lists in "Spawned by:".
**Severity:** `error` if prompt redefines role; `warning` if prompt adds responsibilities not in agent's `<role>`.
### Dimension 2: Domain Expertise Leak
**Question:** Does the delegation prompt embed domain knowledge that belongs in the agent?
**Check:** Scan prompt content for:
- Decision/routing tables (`| Condition | Action |`)
- Good-vs-bad comparison examples (`| TOO VAGUE | JUST RIGHT |`)
- Heuristic rules ("If X then Y", "Always prefer Z")
- Anti-pattern lists ("DO NOT...", "NEVER...")
- Detailed process steps beyond task scope
**Exception:** `<deep_work_rules>` is an acceptable cross-cutting policy pattern from GSD — flag as `info` only.
**Severity:** `error` if prompt contains domain tables/examples that duplicate agent content; `warning` if prompt contains heuristics not in agent.
### Dimension 3: Quality Gate Duplication
**Question:** Do the prompt's quality checks overlap or conflict with the agent's own `<quality_gate>`?
**Check:** Compare prompt `<quality_gate>` / `<success_criteria>` items against agent's `<quality_gate>` items:
- **Duplicate:** Same check appears in both → `warning` (redundant, may diverge)
- **Conflict:** Contradictory criteria (e.g., prompt says "max 3 tasks", agent says "max 5 tasks") → `error`
- **Missing:** Prompt expects quality checks agent doesn't have → `info`
**Severity:** `error` for contradictions; `warning` for duplicates; `info` for gaps.
### Dimension 4: Output Format Conflict
**Question:** Does the prompt's expected output format conflict with the agent's `<output_contract>`?
**Check:**
- Prompt `<expected_output>` markers vs agent's `<output_contract>` return markers
- Prompt expects specific format agent doesn't define
- Prompt expects file output but agent's contract only defines markers (or vice versa)
- Return marker names differ (prompt expects `## DONE`, agent returns `## TASK COMPLETE`)
**Severity:** `error` if return markers conflict; `warning` if format expectations are left unspecified on either side.
### Dimension 5: Process Override
**Question:** Does the delegation prompt dictate HOW the agent should work?
**Check:** Scan prompt for:
- Numbered step-by-step instructions ("Step 1:", "First..., Then..., Finally...")
- Process flow definitions beyond `<objective>` scope
- Tool usage instructions ("Use grep to...", "Run bash command...")
- Execution ordering that conflicts with agent's own execution flow
**Allowed:** `<instructions>` block for revision mode (telling agent what changed, not how to work).
**Severity:** `error` if prompt overrides agent's process; `warning` if prompt suggests process hints.
### Dimension 6: Scope Authority Conflict
**Question:** Does the prompt make decisions that belong to the agent's domain?
**Check:**
- Prompt specifies implementation choices (library selection, architecture patterns) when agent's `<philosophy>` or domain sections own these decisions
- Prompt overrides agent's discretion areas
- Prompt locks decisions that agent's `<context_fidelity>` says are "Claude's Discretion"
**Allowed:** Passing through user-locked decisions from CONTEXT.md — this is proper delegation, not authority conflict.
**Severity:** `error` if prompt makes domain decisions agent should own; `info` if prompt passes through user decisions (correct behavior).
### Dimension 7: Missing Contracts
**Question:** Are the delegation handoff points properly defined?
**Check:**
- Agent has `<output_contract>` with return markers → command handles all markers?
- Command's return handling covers COMPLETE, BLOCKED, CHECKPOINT
- Agent lists "Spawned by:" — does command actually spawn it?
- Agent expects `<files_to_read>` — does prompt provide it?
- Agent has `<upstream_input>` — does prompt provide matching input structure?
**Severity:** `error` if return marker handling is missing; `warning` if agent expects input the prompt doesn't provide.
## 6. Aggregate and Report
### 6a. Per-pair summary
For each command-agent pair, aggregate findings:
```
{command_path} → {agent_name}
Agent() at line {N}:
D1 (Role Re-def): {PASS|WARN|ERROR} — {detail}
D2 (Domain Leak): {PASS|WARN|ERROR} — {detail}
D3 (Quality Gate): {PASS|WARN|ERROR} — {detail}
D4 (Output Format): {PASS|WARN|ERROR} — {detail}
D5 (Process Override): {PASS|WARN|ERROR} — {detail}
D6 (Scope Authority): {PASS|WARN|ERROR} — {detail}
D7 (Missing Contract): {PASS|WARN|ERROR} — {detail}
```
### 6b. Overall verdict
| Verdict | Condition |
|---------|-----------|
| **CLEAN** | 0 errors, 0-2 warnings |
| **REVIEW** | 0 errors, 3+ warnings |
| **CONFLICT** | 1+ errors |
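The verdict table maps directly to a tiny function. A sketch (the function name is assumed, not defined by the skill):

```typescript
// Verdict rule from the table: errors dominate, then warning count.
function verdict(errors: number, warnings: number): 'CLEAN' | 'REVIEW' | 'CONFLICT' {
  if (errors > 0) return 'CONFLICT';
  return warnings >= 3 ? 'REVIEW' : 'CLEAN';
}
```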
### 6c. Fix recommendations
For each finding, provide:
- **Location:** file:line
- **What's wrong:** concrete description
- **Fix:** move content to correct owner (command or agent)
- **Example:** before/after snippet if applicable
## 7. Present Results
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
DELEGATION-CHECK ► SCAN COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Scope: {description}
Pairs checked: {N} command-agent pairs
Findings: {E} errors, {W} warnings, {I} info
Verdict: {CLEAN | REVIEW | CONFLICT}
| Pair | D1 | D2 | D3 | D4 | D5 | D6 | D7 |
|------|----|----|----|----|----|----|-----|
| {cmd} → {agent} | ✅ | ⚠️ | ✅ | ✅ | ❌ | ✅ | ✅ |
| ... | | | | | | | |
{If CONFLICT: detailed findings with fix recommendations}
───────────────────────────────────────────────────────
## Fix Priority
1. {Highest severity fix}
2. {Next fix}
...
───────────────────────────────────────────────────────
```
</process>
<success_criteria>
- [ ] Scan scope determined and all files discovered
- [ ] All Agent() calls extracted from commands with full prompt content
- [ ] All corresponding agent definitions located and parsed
- [ ] 7 conflict dimensions checked for each command-agent pair
- [ ] No false positives on legitimate patterns (mode references, user decision passthrough, `<deep_work_rules>`)
- [ ] Fix recommendations provided for every error/warning
- [ ] Summary table with per-pair dimension results displayed
- [ ] Overall verdict determined (CLEAN/REVIEW/CONFLICT)
</success_criteria>


@@ -0,0 +1,269 @@
# GSD Content Separation Rules
Rules for validating the boundary between **command delegation prompts** (Agent() calls) and **agent role definitions** (agent `.md` files). Derived from analysis of GSD's `plan-phase.md`, `execute-phase.md`, `research-phase.md` and their corresponding agents (`gsd-planner`, `gsd-plan-checker`, `gsd-executor`, `gsd-phase-researcher`, `gsd-verifier`).
## Core Principle
**Commands own WHEN and WHERE. Agents own WHO and HOW.**
A delegation prompt tells the agent what to do *this time*. The agent definition tells the agent who it *always* is.
## Ownership Matrix
### Command Delegation Prompt Owns
| Concern | XML Block | Example |
|---------|-----------|---------|
| What to accomplish | `<objective>` | "Execute plan 3 of phase 2" |
| Input file paths | `<files_to_read>` | "- {state_path} (Project State)" |
| Runtime parameters | `<additional_context>` | "Phase: 5, Mode: revision" |
| Output location | `<output>` | "Write to: {phase_dir}/RESEARCH.md" |
| Expected return format | `<expected_output>` | "## VERIFICATION PASSED or ## ISSUES FOUND" |
| Who consumes output | `<downstream_consumer>` | "Output consumed by /gsd:execute-phase" |
| Revision context | `<instructions>` | "Make targeted updates to address checker issues" |
| Cross-cutting policy | `<deep_work_rules>` | Anti-shallow execution rules (applies to all agents) |
| Per-invocation quality | `<quality_gate>` (in prompt) | Invocation-specific checks (e.g., "every task has `<read_first>`") |
| Flow control | Revision loops, return routing | "If TASK COMPLETE → step 13. If BLOCKED → offer options" |
| User interaction | `AskUserQuestion` | "Provide context / Skip / Abort" |
| Banners | Status display | "━━━ GSD ► PLANNING PHASE {X} ━━━" |
### Agent Role Definition Owns
| Concern | XML Section | Example |
|---------|-------------|---------|
| Identity | `<role>` | "You are a GSD planner" |
| Spawner list | `<role>` → Spawned by | "/gsd:plan-phase orchestrator" |
| Responsibilities | `<role>` → Core responsibilities | "Decompose phases into parallel-optimized plans" |
| Mandatory read protocol | `<role>` → Mandatory Initial Read | "MUST use Read tool to load every file in `<files_to_read>`" |
| Project discovery | `<project_context>` | "Read CLAUDE.md, check .claude/skills/" |
| Guiding principles | `<philosophy>` | Quality degradation curve by context usage |
| Input interpretation | `<upstream_input>` | "Decisions → LOCKED, Discretion → freedom" |
| Decision honoring | `<context_fidelity>` | "Locked decisions are NON-NEGOTIABLE" |
| Core insight | `<core_principle>` | "Plan completeness ≠ Goal achievement" |
| Domain expertise | Named domain sections | `<verification_dimensions>`, `<task_breakdown>`, `<dependency_graph>` |
| Return protocol | `<output_contract>` | TASK COMPLETE / TASK BLOCKED / CHECKPOINT REACHED |
| Self-check | `<quality_gate>` (in agent) | Permanent checks for every invocation |
| Anti-patterns | `<anti_patterns>` | "DO NOT check code existence" |
| Examples | `<examples>` | Scope exceeded analysis example |
## Conflict Patterns
### Pattern 1: Role Re-definition
**Symptom:** Delegation prompt contains identity language.
```
# BAD — prompt redefines role
Agent({
subagent_type: "gsd-plan-checker",
prompt: "You are a code quality expert. Your job is to review plans...
<objective>Verify phase 5 plans</objective>"
})
# GOOD — prompt states objective only
Agent({
subagent_type: "gsd-plan-checker",
prompt: "<verification_context>
<files_to_read>...</files_to_read>
</verification_context>
<expected_output>## VERIFICATION PASSED or ## ISSUES FOUND</expected_output>"
})
```
**Why it's wrong:** The agent's `<role>` section already defines identity. Re-definition in the prompt can contradict or override that self-understanding and confuse the agent.
**Detection:** Regex for `You are a|Your role is|Your job is to|Your responsibility is|Core responsibilities:` in prompt content.
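That detection line translates almost verbatim into code. A hedged sketch (the helper name is illustrative):

```typescript
// Identity language that belongs in an agent's <role>, never in a prompt.
const ROLE_REDEFINITION =
  /You are (a|the)|Your role is|Your job is to|Your responsibility is|Core responsibilities:/;

// True when a delegation prompt contains role re-definition language.
function hasRoleRedefinition(promptContent: string): boolean {
  return ROLE_REDEFINITION.test(promptContent);
}
```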
### Pattern 2: Domain Expertise Leak
**Symptom:** Delegation prompt contains decision tables, heuristics, or examples.
```
# BAD — prompt embeds domain knowledge
Agent({
subagent_type: "gsd-planner",
prompt: "<objective>Create plans for phase 3</objective>
Remember: tasks should have 2-3 items max.
| TOO VAGUE | JUST RIGHT |
| 'Add auth' | 'Add JWT auth with refresh rotation' |"
})
# GOOD — agent's own <task_breakdown> section owns this knowledge
Agent({
subagent_type: "gsd-planner",
prompt: "<planning_context>
<files_to_read>...</files_to_read>
</planning_context>"
})
```
**Why it's wrong:** Domain knowledge in prompts duplicates agent content. When the agent evolves, the prompt doesn't update — they diverge. The agent's domain sections are the single source of truth.
**Exception — `<deep_work_rules>`:** GSD uses this as a cross-cutting policy block (not domain expertise per se) that applies anti-shallow-execution rules across all agents. This is acceptable because:
1. It's structural policy, not domain knowledge
2. It applies uniformly to all planning agents
3. It supplements (not duplicates) agent's own quality gate
**Detection:**
- Tables with `|` in prompt content (excluding `<files_to_read>` path tables)
- "Good:" / "Bad:" / "Example:" comparison pairs
- "Always..." / "Never..." / "Prefer..." heuristic statements
- Numbered rules lists (>3 items) that aren't revision instructions
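A rough sketch of the table and heuristic signals above. This simplified version skips the `<files_to_read>` exclusion, and the function name is invented for illustration:

```typescript
// Flags pipe tables (2+ rows) and Always/Never/Prefer heuristic statements.
function hasDomainLeak(promptContent: string): boolean {
  const tableRows = promptContent
    .split('\n')
    .filter((line) => (line.match(/\|/g) ?? []).length >= 2);
  const heuristics = /\b(Always|Never|Prefer)\b/.test(promptContent);
  return tableRows.length >= 2 || heuristics;
}
```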
### Pattern 3: Quality Gate Duplication
**Symptom:** Same quality check appears in both prompt and agent definition.
```
# PROMPT quality_gate
- [ ] Every task has `<read_first>`
- [ ] Every task has `<acceptance_criteria>`
- [ ] Dependencies correctly identified
# AGENT quality_gate
- [ ] Every task has `<read_first>` with at least the file being modified
- [ ] Every task has `<acceptance_criteria>` with grep-verifiable conditions
- [ ] Dependencies correctly identified
```
**Analysis:**
- "Dependencies correctly identified" → **duplicate** (exact match)
- "`<read_first>`" in both → **overlap** (prompt is less specific than agent)
- "`<acceptance_criteria>`" → **overlap** (same check, different specificity)
**When duplication is OK:** Prompt's `<quality_gate>` adds *invocation-specific* checks not in agent's permanent gate (e.g., "Phase requirement IDs all covered" is specific to this phase, not general).
**Detection:** Fuzzy match quality gate items between prompt and agent (>60% token overlap).
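One way to approximate the fuzzy match is token-set overlap. A sketch, assuming the >60% threshold from above; the normalization choices here are illustrative, not prescribed by the rules:

```typescript
// Fraction of shared word tokens, relative to the smaller item.
function tokenOverlap(a: string, b: string): number {
  const tokens = (s: string) =>
    new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / Math.min(ta.size, tb.size);
}

// Items are flagged as duplicates when tokenOverlap(promptItem, agentItem) > 0.6.
```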
### Pattern 4: Output Format Conflict
**Symptom:** Command expects return markers the agent doesn't define.
```
# COMMAND handles:
- "## VERIFICATION PASSED" → continue
- "## ISSUES FOUND" → revision loop
# AGENT <output_contract> defines:
- "## TASK COMPLETE"
- "## TASK BLOCKED"
```
**Why it's wrong:** The command routes on markers. If the markers don't match, routing breaks silently — the command may hang or misinterpret results.
**Detection:** Extract return marker strings from both sides, compare sets.
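Set comparison of the extracted markers is enough to surface this conflict. A sketch with illustrative names:

```typescript
// Returns the markers a command routes on that the agent never emits.
function unhandledMarkers(commandMarkers: string[], agentMarkers: string[]): string[] {
  const emitted = new Set(agentMarkers);
  return commandMarkers.filter((m) => !emitted.has(m));
}
```

For the example above, `unhandledMarkers(['## VERIFICATION PASSED', '## ISSUES FOUND'], ['## TASK COMPLETE', '## TASK BLOCKED'])` returns both command markers, signalling a broken contract.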
### Pattern 5: Process Override
**Symptom:** Prompt dictates step-by-step process.
```
# BAD — prompt overrides agent's process
Agent({
subagent_type: "gsd-planner",
prompt: "Step 1: Read the roadmap. Step 2: Extract requirements.
Step 3: Create task breakdown. Step 4: Assign waves..."
})
# GOOD — prompt states objective, agent decides process
Agent({
subagent_type: "gsd-planner",
prompt: "<objective>Create plans for phase 5</objective>
<files_to_read>...</files_to_read>"
})
```
**Exception — Revision instructions:** `<instructions>` block in revision prompts is acceptable because it tells the agent *what changed* (checker issues), not *how to work*.
```
# OK — revision context, not process override
<instructions>
Make targeted updates to address checker issues.
Do NOT replan from scratch unless issues are fundamental.
Return what changed.
</instructions>
```
**Detection:** "Step N:" / "First..." / "Then..." / "Finally..." patterns in prompt content outside `<instructions>` blocks.
### Pattern 6: Scope Authority Conflict
**Symptom:** Prompt makes domain decisions the agent should own.
```
# BAD — prompt decides implementation details
Agent({
subagent_type: "gsd-planner",
prompt: "Use React Query for data fetching. Use Zustand for state management.
<objective>Plan the frontend architecture</objective>"
})
# GOOD — user decisions passed through from CONTEXT.md
Agent({
subagent_type: "gsd-planner",
prompt: "<planning_context>
<files_to_read>
- {context_path} (USER DECISIONS - locked: React Query, Zustand)
</files_to_read>
</planning_context>"
})
```
**Key distinction:**
- **Prompt making decisions** = conflict (command shouldn't have domain opinion)
- **Prompt passing through user decisions** = correct (user decisions flow through command to agent)
- **Agent interpreting user decisions** = correct (agent's `<context_fidelity>` handles locked/deferred/discretion)
**Detection:** Technical nouns (library names, architecture patterns) in prompt free text (not inside `<files_to_read>` path descriptions).
### Pattern 7: Missing Contracts
**Symptom:** Handoff points between command and agent are incomplete.
| Missing Element | Impact |
|-----------------|--------|
| Agent has no `<output_contract>` | Command can't route on return markers |
| Command doesn't handle all agent return markers | BLOCKED/CHECKPOINT silently ignored |
| Agent expects `<files_to_read>` but prompt doesn't provide it | Agent starts without context |
| Agent's "Spawned by:" doesn't list this command | Agent may not expect this invocation pattern |
| Agent has `<upstream_input>` but prompt doesn't match structure | Agent misinterprets input |
**Detection:** Cross-reference both sides for completeness.
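The cross-reference can be expressed as a completeness check over one command/agent pair. The shape below is hypothetical; these field names are not defined anywhere in GSD:

```typescript
// Hypothetical handoff summary for a single command-agent pair.
interface PairContract {
  agentMarkers: string[];       // from the agent's <output_contract>
  handledMarkers: string[];     // markers the command's return handling covers
  agentExpectsFiles: boolean;   // agent's <role> mandates <files_to_read>
  promptProvidesFiles: boolean; // delegation prompt includes <files_to_read>
}

// Returns one finding per missing handoff element.
function missingContracts(pair: PairContract): string[] {
  const findings: string[] = [];
  const handled = new Set(pair.handledMarkers);
  for (const marker of pair.agentMarkers) {
    if (!handled.has(marker)) findings.push(`error: command does not handle ${marker}`);
  }
  if (pair.agentExpectsFiles && !pair.promptProvidesFiles) {
    findings.push('warning: prompt does not provide <files_to_read>');
  }
  return findings;
}
```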
## The `<deep_work_rules>` Exception
GSD's plan-phase uses `<deep_work_rules>` in delegation prompts. This is a deliberate design choice, not a violation:
1. **It's cross-cutting policy**: applies to ALL planning agents equally
2. **It's structural**: defines required fields (`<read_first>`, `<acceptance_criteria>`, `<action>` concreteness) — not domain expertise
3. **It supplements agent quality**: agent's own `<quality_gate>` is self-check; deep_work_rules is command-imposed minimum standard
4. **It's invocation-specific context**: different commands might impose different work rules
**Rule:** `<deep_work_rules>` in a delegation prompt is `info` level, not error. Flag only if its content duplicates agent's domain sections verbatim.
## Severity Classification
| Severity | When | Action Required |
|----------|------|-----------------|
| `error` | Actual conflict: contradictory content between prompt and agent | Must fix — move content to correct owner |
| `warning` | Duplication or boundary blur without contradiction | Should fix — consolidate to single source of truth |
| `info` | Acceptable pattern that looks like violation but isn't | No action — document why it's OK |
## Quick Reference: Is This Content in the Right Place?
| Content | In Prompt? | In Agent? |
|---------|-----------|-----------|
| "You are a..." | ❌ Never | ✅ Always |
| File paths for this invocation | ✅ Yes | ❌ No |
| Phase number, mode | ✅ Yes | ❌ No |
| Decision tables | ❌ Never | ✅ Always |
| Good/bad examples | ❌ Never | ✅ Always |
| "Write to: {path}" | ✅ Yes | ❌ No |
| Return markers handling | ✅ Yes (routing) | ✅ Yes (definition) |
| Quality gate | ✅ Per-invocation | ✅ Permanent self-check |
| "MUST read files first" | ❌ Agent's `<role>` owns this | ✅ Always |
| Anti-shallow rules | ⚠️ OK as cross-cutting policy | ✅ Preferred |
| Revision instructions | ✅ Yes (what changed) | ❌ No |
| Heuristics / philosophy | ❌ Never | ✅ Always |
| Banner display | ✅ Yes | ❌ Never |
| AskUserQuestion | ✅ Yes | ❌ Never |


@@ -165,14 +165,14 @@ Generate a complete command file with:
3. **`<process>`** — numbered steps (GSD workflow style):
- Step 1: Initialize / parse arguments
- Steps 2-N: Domain-specific orchestration logic
- Each step: banner display, validation, agent spawning via `Task()`, error handling
- Each step: banner display, validation, agent spawning via `Agent()`, error handling
- Final step: status display + `<offer_next>` with next actions
4. **`<success_criteria>`** — checkbox list of verifiable conditions
**Command writing rules:**
- Steps are **numbered** (`## 1.`, `## 2.`) — follow `plan-phase.md` and `new-project.md` style
- Use banners for phase transitions: `━━━ SKILL ► ACTION ━━━`
- Agent spawning uses `Task(prompt, subagent_type, description)` pattern
- Agent spawning uses `Agent({ subagent_type, prompt, description, run_in_background })` pattern
- Prompt to agents uses `<objective>`, `<files_to_read>`, `<output>` blocks
- Include `<offer_next>` block with formatted completion status
- Handle agent return markers: `## TASK COMPLETE`, `## TASK BLOCKED`, `## CHECKPOINT REACHED`
@@ -286,7 +286,7 @@ Set `$TARGET_PATH = $SOURCE_PATH` (in-place conversion) unless user specifies ou
| `<process>` with numbered steps | At least 3 `## N.` headers |
| Step 1 is initialization | Parses args or loads context |
| Last step is status/report | Displays results or routes to `<offer_next>` |
| Agent spawning (if complex) | `Task(` call with `subagent_type` |
| Agent spawning (if complex) | `Agent({` call with `subagent_type` |
| Agent prompt structure | `<files_to_read>` + `<objective>` or `<output>` blocks |
| Return handling | Routes on `## TASK COMPLETE` / `## TASK BLOCKED` markers |
| `<offer_next>` | Banner + summary + next command suggestion |


@@ -4,7 +4,7 @@ Guidelines for Claude Code **agent definition files** (role + domain expertise).
## Content Separation Principle
Agents are spawned by commands via `Task()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
Agents are spawned by commands via `Agent()`. The agent file defines WHO the agent is and WHAT it knows. It does NOT define WHEN or HOW it gets invoked.
| Concern | Belongs in Agent | Belongs in Command |
|---------|-----------------|-------------------|


@@ -153,15 +153,15 @@ Display banners before major phase transitions (agent spawning, user decisions,
## Agent Spawning Pattern
Commands spawn agents via `Task()` with structured prompts:
Commands spawn agents via `Agent()` with structured prompts:
```markdown
Task(
prompt=filled_prompt,
subagent_type="agent-name",
model="{model}",
description="Verb Phase {X}"
)
```javascript
Agent({
subagent_type: "agent-name",
prompt: filled_prompt,
description: "Verb Phase {X}",
run_in_background: false
})
```
### Prompt Structure for Agents


@@ -49,7 +49,7 @@ Conversion Summary:
| `## Auto Mode` / `## Auto Mode Defaults` | `<auto_mode>` section |
| `## Quick Reference` | Preserve as-is within appropriate section |
| Inline `AskUserQuestion` calls | Preserve verbatim — these belong in commands |
| `Task()` / agent spawning calls | Preserve verbatim within process steps |
| `Agent()` / agent spawning calls | Preserve verbatim within process steps |
| Banner displays (`━━━`) | Preserve verbatim |
| Code blocks (```bash, ```javascript, etc.) | **Preserve exactly** — never modify code content |
| Tables | **Preserve exactly** — never reformat table content |


@@ -49,12 +49,13 @@ allowed-tools: {tools} # omit if unrestricted
{Construct prompt with <files_to_read>, <objective>, <output> blocks.}
```
Task(
prompt=filled_prompt,
subagent_type="{agent-name}",
description="{Verb} {target}"
)
```javascript
Agent({
subagent_type: "{agent-name}",
prompt: filled_prompt,
description: "{Verb} {target}",
run_in_background: false
})
```
## 4. Handle Result


@@ -8,9 +8,11 @@ import {
checkVenvStatus,
executeCodexLens,
getVenvPythonPath,
useCodexLensV2,
} from '../../../tools/codex-lens.js';
import type { RouteContext } from '../types.js';
import { extractJSON, stripAnsiCodes } from './utils.js';
import type { ChildProcess } from 'child_process';
// File watcher state (persisted across requests)
let watcherProcess: any = null;
@@ -43,6 +45,29 @@ export async function stopWatcherForUninstall(): Promise<void> {
watcherProcess = null;
}
/**
* Spawn v2 bridge watcher subprocess.
* Runs 'codexlens-search watch --root X --debounce-ms Y' and reads JSONL stdout.
* @param root - Root directory to watch
* @param debounceMs - Debounce interval in milliseconds
* @returns Spawned child process
*/
function spawnV2Watcher(root: string, debounceMs: number): ChildProcess {
const { spawn } = require('child_process') as typeof import('child_process');
return spawn('codexlens-search', [
'watch',
'--root', root,
'--debounce-ms', String(debounceMs),
'--db-path', require('path').join(root, '.codexlens'),
], {
cwd: root,
shell: false,
stdio: ['ignore', 'pipe', 'pipe'],
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
});
}
/**
* Handle CodexLens watcher routes
* @returns true if route was handled, false otherwise
@@ -91,7 +116,13 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
return { success: false, error: `Path is not a directory: ${targetPath}`, status: 400 };
}
// Get the codexlens CLI path
// Route to v2 or v1 watcher based on feature flag
if (useCodexLensV2()) {
// v2 bridge watcher: codexlens-search watch
console.log('[CodexLens] Using v2 bridge watcher');
watcherProcess = spawnV2Watcher(targetPath, debounceMs);
} else {
// v1 watcher: python -m codexlens watch
const venvStatus = await checkVenvStatus();
if (!venvStatus.ready) {
return { success: false, error: 'CodexLens not installed', status: 400 };
@@ -121,7 +152,6 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
}
// Spawn watch process using Python (no shell: true for security)
// CodexLens is a Python package, must run via python -m codexlens
const pythonPath = getVenvPythonPath();
const args = ['-m', 'codexlens', 'watch', targetPath, '--debounce', String(debounceMs)];
watcherProcess = spawn(pythonPath, args, {
@@ -131,6 +161,7 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' }
});
}
watcherStats = {
running: true,
@@ -153,14 +184,38 @@ export async function handleCodexLensWatcherRoutes(ctx: RouteContext): Promise<b
}
// Handle process output for event counting
const isV2Watcher = useCodexLensV2();
let stdoutLineBuffer = '';
if (watcherProcess.stdout) {
watcherProcess.stdout.on('data', (data: Buffer) => {
const output = data.toString();
// Count processed events from output
if (isV2Watcher) {
// v2 bridge outputs JSONL - parse line by line
stdoutLineBuffer += output;
const lines = stdoutLineBuffer.split('\n');
// Keep incomplete last line in buffer
stdoutLineBuffer = lines.pop() || '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed) continue;
try {
const event = JSON.parse(trimmed);
// Count file change events (created, modified, deleted, moved)
if (event.event && event.event !== 'watching') {
watcherStats.events_processed += 1;
}
} catch {
// Not valid JSON, skip
}
}
} else {
// v1 watcher: count text-based event messages
const matches = output.match(/Processed \d+ events?/g);
if (matches) {
watcherStats.events_processed += matches.length;
}
}
});
}


@@ -1067,6 +1067,103 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
});
}
/**
* Check if codexlens-search (v2) bridge CLI is installed and functional.
* Runs 'codexlens-search status' and checks exit code.
* @returns true if the v2 bridge CLI is available
*/
function isCodexLensV2Installed(): boolean {
try {
const result = spawnSync('codexlens-search', ['status', '--db-path', '.codexlens'], {
encoding: 'utf-8',
timeout: EXEC_TIMEOUTS.PYTHON_VERSION,
stdio: ['pipe', 'pipe', 'pipe'],
windowsHide: true,
});
// Exit code 0 or valid JSON output means it's installed
return result.status === 0 || (result.stdout != null && result.stdout.includes('"status"'));
} catch {
return false;
}
}
/**
* Bootstrap codexlens-search (v2) package using UV.
* Installs 'codexlens-search[semantic]' into the shared CodexLens venv.
* @returns Bootstrap result
*/
async function bootstrapV2WithUv(): Promise<BootstrapResult> {
console.log('[CodexLens] Bootstrapping codexlens-search (v2) with UV...');
const preFlightError = preFlightCheck();
if (preFlightError) {
return { success: false, error: `Pre-flight failed: ${preFlightError}` };
}
repairVenvIfCorrupted();
const uvInstalled = await ensureUvInstalled();
if (!uvInstalled) {
return { success: false, error: 'Failed to install UV package manager' };
}
const uv = createCodexLensUvManager();
if (!uv.isVenvValid()) {
console.log('[CodexLens] Creating virtual environment with UV for v2...');
const createResult = await uv.createVenv();
if (!createResult.success) {
return { success: false, error: `Failed to create venv: ${createResult.error}` };
}
}
// Find local codexlens-search package using unified discovery
const { findCodexLensSearchPath } = await import('../utils/package-discovery.js');
const discovery = findCodexLensSearchPath();
const extras = ['semantic'];
const editable = isDevEnvironment() && !discovery.insideNodeModules;
if (!discovery.path) {
// Fallback: try installing from PyPI
console.log('[CodexLens] Local codexlens-search not found, trying PyPI install...');
const pipResult = await uv.install(['codexlens-search[semantic]']);
if (!pipResult.success) {
return {
success: false,
error: `Failed to install codexlens-search from PyPI: ${pipResult.error}`,
diagnostics: { venvPath: getCodexLensVenvDir(), installer: 'uv' },
};
}
} else {
console.log(`[CodexLens] Installing codexlens-search from local path with UV: ${discovery.path} (editable: ${editable})`);
const installResult = await uv.installFromProject(discovery.path, extras, editable);
if (!installResult.success) {
return {
success: false,
error: `Failed to install codexlens-search: ${installResult.error}`,
diagnostics: { packagePath: discovery.path, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
};
}
}
clearVenvStatusCache();
console.log('[CodexLens] codexlens-search (v2) bootstrap complete');
return {
success: true,
message: 'Installed codexlens-search (v2) with UV',
diagnostics: { packagePath: discovery.path ?? undefined, venvPath: getCodexLensVenvDir(), installer: 'uv', editable },
};
}
/**
* Check if v2 bridge should be used based on CCW_USE_CODEXLENS_V2 env var.
*/
function useCodexLensV2(): boolean {
const flag = process.env.CCW_USE_CODEXLENS_V2;
return flag === '1' || flag === 'true';
}
/**
* Bootstrap CodexLens venv with required packages
* @returns Bootstrap result
@@ -1074,6 +1171,23 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise<BootstrapResul
async function bootstrapVenv(): Promise<BootstrapResult> {
const warnings: string[] = [];
// If v2 flag is set, also bootstrap codexlens-search alongside v1
if (useCodexLensV2() && await isUvAvailable()) {
try {
const v2Result = await bootstrapV2WithUv();
if (v2Result.success) {
console.log('[CodexLens] codexlens-search (v2) installed successfully');
} else {
console.warn(`[CodexLens] codexlens-search (v2) bootstrap failed: ${v2Result.error}`);
warnings.push(`v2 bootstrap failed: ${v2Result.error || 'Unknown error'}`);
}
} catch (v2Err) {
const msg = v2Err instanceof Error ? v2Err.message : String(v2Err);
console.warn(`[CodexLens] codexlens-search (v2) bootstrap error: ${msg}`);
warnings.push(`v2 bootstrap error: ${msg}`);
}
}
// Prefer UV if available (faster package resolution and installation)
if (await isUvAvailable()) {
console.log('[CodexLens] Using UV for bootstrap...');
@@ -2502,6 +2616,10 @@ export {
// UV-based installation functions
bootstrapWithUv,
installSemanticWithUv,
// v2 bridge support
useCodexLensV2,
isCodexLensV2Installed,
bootstrapV2WithUv,
};
// Export Python path for direct spawn usage (e.g., watcher)


@@ -29,7 +29,9 @@ import {
ensureLiteLLMEmbedderReady,
executeCodexLens,
getVenvPythonPath,
useCodexLensV2,
} from './codex-lens.js';
import { execFile } from 'child_process';
import type { ProgressInfo } from './codex-lens.js';
import { getProjectRoot } from '../utils/path-validator.js';
import { getCodexLensDataDir } from '../utils/codexlens-path.js';
@@ -2774,6 +2776,90 @@ async function executeRipgrepMode(params: Params): Promise<SearchResult> {
});
}
// ========================================
// codexlens-search v2 bridge integration
// ========================================
/**
* Execute search via codexlens-search (v2) bridge CLI.
* Spawns 'codexlens-search search --query X --top-k Y --db-path Z' and parses JSON output.
*
* @param query - Search query string
* @param topK - Number of results to return
* @param dbPath - Path to the v2 index database directory
* @returns Parsed search results as SemanticMatch array
*/
async function executeCodexLensV2Bridge(
query: string,
topK: number,
dbPath: string,
): Promise<SearchResult> {
return new Promise((resolve) => {
const args = [
'search',
'--query', query,
'--top-k', String(topK),
'--db-path', dbPath,
];
execFile('codexlens-search', args, {
encoding: 'utf-8',
timeout: EXEC_TIMEOUTS.PROCESS_SPAWN,
windowsHide: true,
env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
}, (error, stdout, stderr) => {
if (error) {
console.warn(`[CodexLens-v2] Bridge search failed: ${error.message}`);
resolve({
success: false,
error: `codexlens-search v2 bridge failed: ${error.message}`,
});
return;
}
try {
const parsed = JSON.parse(stdout.trim());
// Bridge outputs {"error": string} on failure
if (parsed && typeof parsed === 'object' && 'error' in parsed) {
resolve({
success: false,
error: `codexlens-search v2: ${parsed.error}`,
});
return;
}
// Bridge outputs array of {path, score, snippet}
const results: SemanticMatch[] = (Array.isArray(parsed) ? parsed : []).map((r: { path?: string; score?: number; snippet?: string }) => ({
file: r.path || '',
score: r.score || 0,
content: r.snippet || '',
symbol: null,
}));
resolve({
success: true,
results,
metadata: {
mode: 'semantic' as any,
backend: 'codexlens-v2',
count: results.length,
query,
note: 'Using codexlens-search v2 bridge (2-stage vector + reranking)',
},
});
} catch (parseErr) {
console.warn(`[CodexLens-v2] Failed to parse bridge output: ${(parseErr as Error).message}`);
resolve({
success: false,
error: `Failed to parse codexlens-search v2 output: ${(parseErr as Error).message}`,
output: stdout,
});
}
});
});
}
/**
* Mode: exact - CodexLens exact/FTS search
* Requires index
@@ -4276,7 +4362,21 @@ export async function handler(params: Record<string, unknown>): Promise<ToolResu
case 'search':
default:
// Handle search modes: fuzzy | semantic
// v2 bridge: if CCW_USE_CODEXLENS_V2 is set, try the v2 bridge first for semantic/fuzzy modes
if (useCodexLensV2() && (mode === 'semantic' || mode === 'fuzzy')) {
const scope = resolveSearchScope(parsed.data.path ?? '.');
const dbPath = join(scope.workingDirectory, '.codexlens');
const topK = (parsed.data.maxResults || 5) + (parsed.data.extraFilesCount || 10);
const v2Result = await executeCodexLensV2Bridge(parsed.data.query || '', topK, dbPath);
if (v2Result.success) {
result = v2Result;
break;
}
// v2 failed, fall through to v1
console.warn(`[CodexLens-v2] Falling back to v1: ${v2Result.error}`);
}
// Handle search modes: fuzzy | semantic (v1 path)
switch (mode) {
case 'fuzzy':
result = await executeFuzzyMode(parsed.data);

View File

@@ -55,18 +55,20 @@ export interface PackageDiscoveryResult {
}
/** Known local package names */
export type LocalPackageName = 'codex-lens' | 'ccw-litellm';
export type LocalPackageName = 'codex-lens' | 'ccw-litellm' | 'codexlens-search';
/** Environment variable mapping for each package */
const PACKAGE_ENV_VARS: Record<LocalPackageName, string> = {
'codex-lens': 'CODEXLENS_PACKAGE_PATH',
'ccw-litellm': 'CCW_LITELLM_PATH',
'codexlens-search': 'CODEXLENS_SEARCH_PATH',
};
/** Config key mapping for each package */
const PACKAGE_CONFIG_KEYS: Record<LocalPackageName, string> = {
'codex-lens': 'codexLensPath',
'ccw-litellm': 'ccwLitellmPath',
'codexlens-search': 'codexlensSearchPath',
};
// ========================================
@@ -296,6 +298,13 @@ export function findCcwLitellmPath(): PackageDiscoveryResult {
return findPackagePath('ccw-litellm');
}
/**
* Find codexlens-search (v2) package path (convenience wrapper)
*/
export function findCodexLensSearchPath(): PackageDiscoveryResult {
return findPackagePath('codexlens-search');
}
/**
* Format search results for error messages
*/

codex-lens-v2/LICENSE Normal file
View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 codexlens-search contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

codex-lens-v2/README.md Normal file
View File

@@ -0,0 +1,146 @@
# codexlens-search
Lightweight semantic code search engine with 2-stage vector search, full-text search, and Reciprocal Rank Fusion.
## Overview
codexlens-search provides fast, accurate code search through a multi-stage retrieval pipeline:
1. **Binary coarse search** - Hamming-distance filtering narrows candidates quickly
2. **ANN fine search** - HNSW or FAISS refines the candidate set with float vectors
3. **Full-text search** - SQLite FTS5 handles exact and fuzzy keyword matching
4. **RRF fusion** - Reciprocal Rank Fusion merges vector and text results
5. **Reranking** - Optional cross-encoder or API-based reranker for final ordering
The core library has **zero required dependencies**. Install optional extras to enable semantic search, GPU acceleration, or FAISS backends.
## Installation
```bash
# Core only (FTS search, no vector search)
pip install codexlens-search
# With semantic search (recommended)
pip install codexlens-search[semantic]
# Semantic search + GPU acceleration
pip install codexlens-search[semantic-gpu]
# With FAISS backend (CPU)
pip install codexlens-search[faiss-cpu]
# With API-based reranker
pip install codexlens-search[reranker-api]
# Everything (semantic + GPU + FAISS + reranker)
pip install codexlens-search[semantic-gpu,faiss-gpu,reranker-api]
```
## Quick Start
```python
from codexlens_search import Config, IndexingPipeline, SearchPipeline
from codexlens_search.core import create_ann_index, create_binary_index
from codexlens_search.embed.local import FastEmbedEmbedder
from codexlens_search.rerank.local import LocalReranker
from codexlens_search.search.fts import FTSEngine
# 1. Configure
config = Config(embed_model="BAAI/bge-small-en-v1.5", embed_dim=384)
# 2. Create components
embedder = FastEmbedEmbedder(config)
binary_store = create_binary_index(config, db_path="index/binary.db")
ann_index = create_ann_index(config, index_path="index/ann.bin")
fts = FTSEngine("index/fts.db")
reranker = LocalReranker()
# 3. Index files
indexer = IndexingPipeline(embedder, binary_store, ann_index, fts, config)
stats = indexer.index_directory("./src")
print(f"Indexed {stats.files_processed} files, {stats.chunks_created} chunks")
# 4. Search
pipeline = SearchPipeline(embedder, binary_store, ann_index, reranker, fts, config)
results = pipeline.search("authentication handler", top_k=10)
for r in results:
print(f" {r.path} (score={r.score:.3f})")
```
## Extras
| Extra | Dependencies | Description |
|-------|-------------|-------------|
| `semantic` | hnswlib, numpy, fastembed | Vector search with local embeddings |
| `gpu` | onnxruntime-gpu | GPU-accelerated embedding inference |
| `semantic-gpu` | semantic + gpu combined | Vector search with GPU acceleration |
| `faiss-cpu` | faiss-cpu | FAISS ANN backend (CPU) |
| `faiss-gpu` | faiss-gpu | FAISS ANN backend (GPU) |
| `reranker-api` | httpx | Remote reranker API client |
| `dev` | pytest, pytest-cov | Development and testing |
## Architecture
```
Query
|
v
[Embedder] --> query vector
|
+---> [BinaryStore.coarse_search] --> candidate IDs (Hamming distance)
| |
| v
+---> [ANNIndex.fine_search] ------> ranked IDs (cosine/L2)
| |
| v (intersect)
| vector_results
|
+---> [FTSEngine.exact_search] ----> exact text matches
+---> [FTSEngine.fuzzy_search] ----> fuzzy text matches
|
v
[RRF Fusion] --> merged ranking (adaptive weights by query intent)
|
v
[Reranker] --> final top-k results
```
### Key Design Decisions
- **2-stage vector search**: Binary coarse search (fast Hamming distance on binarized vectors) filters candidates before the more expensive ANN search. This keeps memory usage low and search fast even on large corpora.
- **Parallel retrieval**: Vector search and FTS run concurrently via ThreadPoolExecutor.
- **Adaptive fusion weights**: Query intent detection adjusts RRF weights between vector and text signals.
- **Backend abstraction**: ANN index supports both hnswlib and FAISS backends via a factory function.
- **Zero core dependencies**: The base package requires only Python 3.10+. All heavy dependencies are optional.
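The coarse stage of the 2-stage design can be illustrated without numpy: binarize vectors by sign into packed bit codes, then rank candidates by Hamming distance. This is a standalone sketch assuming sign binarization, not the package's actual quantizer:

```python
def binarize(vec: list[float]) -> int:
    """Pack a float vector into an int bitmask: bit i is set when vec[i] > 0."""
    code = 0
    for i, v in enumerate(vec):
        if v > 0:
            code |= 1 << i
    return code


def coarse_search(query: list[float], codes: dict[int, int], top_k: int) -> list[int]:
    """Return the top_k chunk IDs with the smallest Hamming distance to the query."""
    q = binarize(query)
    # XOR leaves a 1 bit per disagreeing dimension; count them for the distance
    ranked = sorted(codes, key=lambda cid: bin(codes[cid] ^ q).count("1"))
    return ranked[:top_k]
```

Because each comparison is a single XOR plus popcount over packed bits, this stage can scan a large corpus cheaply before the float-vector ANN search refines the survivors.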
## Configuration
The `Config` dataclass controls all pipeline parameters:
```python
from codexlens_search import Config
config = Config(
embed_model="BAAI/bge-small-en-v1.5", # embedding model name
embed_dim=384, # embedding dimension
embed_batch_size=64, # batch size for embedding
ann_backend="auto", # 'auto', 'faiss', 'hnswlib'
binary_top_k=200, # binary coarse search candidates
ann_top_k=50, # ANN fine search candidates
fts_top_k=50, # FTS results per method
device="auto", # 'auto', 'cuda', 'cpu'
)
```
## Development
```bash
git clone https://github.com/nicepkg/codexlens-search.git
cd codexlens-search
pip install -e ".[dev,semantic]"
pytest
```
## License
MIT

View File

@@ -8,6 +8,26 @@ version = "0.2.0"
description = "Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion"
requires-python = ">=3.10"
dependencies = []
license = {text = "MIT"}
readme = "README.md"
authors = [
{name = "codexlens-search contributors"},
]
classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"License :: OSI Approved :: MIT License",
"Topic :: Software Development :: Libraries",
"Topic :: Text Processing :: Indexing",
"Operating System :: OS Independent",
]
[project.urls]
Homepage = "https://github.com/nicepkg/codexlens-search"
Repository = "https://github.com/nicepkg/codexlens-search"
[project.optional-dependencies]
semantic = [
@@ -27,10 +47,22 @@ faiss-gpu = [
reranker-api = [
"httpx>=0.25",
]
watcher = [
"watchdog>=3.0",
]
semantic-gpu = [
"hnswlib>=0.8.0",
"numpy>=1.26",
"fastembed>=0.4.0,<2.0",
"onnxruntime-gpu>=1.16",
]
dev = [
"pytest>=7.0",
"pytest-cov",
]
[project.scripts]
codexlens-search = "codexlens_search.bridge:main"
[tool.hatch.build.targets.wheel]
packages = ["src/codexlens_search"]

View File

@@ -0,0 +1,407 @@
"""CLI bridge for ccw integration.
Argparse-based CLI with JSON output protocol.
Each subcommand outputs a single JSON object to stdout.
Watch command outputs JSONL (one JSON per line).
All errors are JSON {"error": string} to stdout with non-zero exit code.
"""
from __future__ import annotations
import argparse
import json
import logging
import os
import sys
import time
from pathlib import Path
log = logging.getLogger("codexlens_search.bridge")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _json_output(data: dict | list) -> None:
"""Print JSON to stdout with flush."""
print(json.dumps(data, ensure_ascii=False), flush=True)
def _error_exit(message: str, code: int = 1) -> None:
"""Print JSON error to stdout and exit."""
_json_output({"error": message})
sys.exit(code)
def _resolve_db_path(args: argparse.Namespace) -> Path:
"""Return the --db-path as a resolved Path, creating parent dirs."""
db_path = Path(args.db_path).resolve()
db_path.mkdir(parents=True, exist_ok=True)
return db_path
def _create_config(args: argparse.Namespace) -> "Config":
"""Build Config from CLI args."""
from codexlens_search.config import Config
kwargs: dict = {}
if hasattr(args, "embed_model") and args.embed_model:
kwargs["embed_model"] = args.embed_model
db_path = Path(args.db_path).resolve()
kwargs["metadata_db_path"] = str(db_path / "metadata.db")
return Config(**kwargs)
def _create_pipeline(
args: argparse.Namespace,
) -> tuple:
"""Lazily construct pipeline components from CLI args.
Returns (indexing_pipeline, search_pipeline, config).
Only loads embedder/reranker models when needed.
"""
from codexlens_search.core.factory import create_ann_index, create_binary_index
from codexlens_search.embed.local import FastEmbedEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.indexing.pipeline import IndexingPipeline
from codexlens_search.rerank.local import FastEmbedReranker
from codexlens_search.search.fts import FTSEngine
from codexlens_search.search.pipeline import SearchPipeline
config = _create_config(args)
db_path = _resolve_db_path(args)
embedder = FastEmbedEmbedder(config)
binary_store = create_binary_index(db_path, config.embed_dim, config)
ann_index = create_ann_index(db_path, config.embed_dim, config)
fts = FTSEngine(db_path / "fts.db")
metadata = MetadataStore(db_path / "metadata.db")
reranker = FastEmbedReranker(config)
indexing = IndexingPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
fts=fts,
config=config,
metadata=metadata,
)
search = SearchPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
reranker=reranker,
fts=fts,
config=config,
metadata_store=metadata,
)
return indexing, search, config
# ---------------------------------------------------------------------------
# Subcommand handlers
# ---------------------------------------------------------------------------
def cmd_init(args: argparse.Namespace) -> None:
"""Initialize an empty index at --db-path."""
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.search.fts import FTSEngine
db_path = _resolve_db_path(args)
# Create empty stores - just touch the metadata and FTS databases
MetadataStore(db_path / "metadata.db")
FTSEngine(db_path / "fts.db")
_json_output({
"status": "initialized",
"db_path": str(db_path),
})
def cmd_search(args: argparse.Namespace) -> None:
"""Run search query, output JSON array of results."""
_, search, _ = _create_pipeline(args)
results = search.search(args.query, top_k=args.top_k)
_json_output([
{"path": r.path, "score": r.score, "snippet": r.snippet}
for r in results
])
def cmd_index_file(args: argparse.Namespace) -> None:
"""Index a single file."""
indexing, _, _ = _create_pipeline(args)
file_path = Path(args.file).resolve()
if not file_path.is_file():
_error_exit(f"File not found: {file_path}")
root = Path(args.root).resolve() if args.root else None
stats = indexing.index_file(file_path, root=root)
_json_output({
"status": "indexed",
"file": str(file_path),
"files_processed": stats.files_processed,
"chunks_created": stats.chunks_created,
"duration_seconds": stats.duration_seconds,
})
def cmd_remove_file(args: argparse.Namespace) -> None:
"""Remove a file from the index."""
indexing, _, _ = _create_pipeline(args)
indexing.remove_file(args.file)
_json_output({
"status": "removed",
"file": args.file,
})
def cmd_sync(args: argparse.Namespace) -> None:
"""Sync index with files under --root matching --glob pattern."""
indexing, _, _ = _create_pipeline(args)
root = Path(args.root).resolve()
if not root.is_dir():
_error_exit(f"Root directory not found: {root}")
pattern = args.glob or "**/*"
file_paths = [
p for p in root.glob(pattern)
if p.is_file()
]
stats = indexing.sync(file_paths, root=root)
_json_output({
"status": "synced",
"root": str(root),
"files_processed": stats.files_processed,
"chunks_created": stats.chunks_created,
"duration_seconds": stats.duration_seconds,
})
def cmd_watch(args: argparse.Namespace) -> None:
"""Watch --root for changes, output JSONL events."""
root = Path(args.root).resolve()
if not root.is_dir():
_error_exit(f"Root directory not found: {root}")
debounce_ms = args.debounce_ms
try:
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler, FileSystemEvent
except ImportError:
_error_exit(
"watchdog is required for watch mode. "
"Install with: pip install watchdog"
)
class _JsonEventHandler(FileSystemEventHandler):
"""Emit JSONL for file events."""
def _emit(self, event_type: str, path: str) -> None:
_json_output({
"event": event_type,
"path": path,
"timestamp": time.time(),
})
def on_created(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("created", event.src_path)
def on_modified(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("modified", event.src_path)
def on_deleted(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("deleted", event.src_path)
def on_moved(self, event: FileSystemEvent) -> None:
if not event.is_directory:
self._emit("moved", event.dest_path)
observer = Observer()
observer.schedule(_JsonEventHandler(), str(root), recursive=True)
observer.start()
_json_output({
"status": "watching",
"root": str(root),
"debounce_ms": debounce_ms,
})
try:
while True:
time.sleep(debounce_ms / 1000.0)
except KeyboardInterrupt:
observer.stop()
observer.join()
def cmd_download_models(args: argparse.Namespace) -> None:
"""Download embed + reranker models."""
from codexlens_search import model_manager
config = _create_config(args)
model_manager.ensure_model(config.embed_model, config)
model_manager.ensure_model(config.reranker_model, config)
_json_output({
"status": "downloaded",
"embed_model": config.embed_model,
"reranker_model": config.reranker_model,
})
def cmd_status(args: argparse.Namespace) -> None:
"""Report index statistics."""
from codexlens_search.indexing.metadata import MetadataStore
db_path = _resolve_db_path(args)
meta_path = db_path / "metadata.db"
if not meta_path.exists():
_json_output({
"status": "not_initialized",
"db_path": str(db_path),
})
return
metadata = MetadataStore(meta_path)
all_files = metadata.get_all_files()
deleted_ids = metadata.get_deleted_ids()
max_chunk = metadata.max_chunk_id()
_json_output({
"status": "ok",
"db_path": str(db_path),
"files_tracked": len(all_files),
"max_chunk_id": max_chunk,
"total_chunks_approx": max_chunk + 1 if max_chunk >= 0 else 0,
"deleted_chunks": len(deleted_ids),
})
# ---------------------------------------------------------------------------
# CLI parser
# ---------------------------------------------------------------------------
def _build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
prog="codexlens-search",
description="Lightweight semantic code search - CLI bridge",
)
parser.add_argument(
"--db-path",
default=os.environ.get("CODEXLENS_DB_PATH", ".codexlens"),
help="Path to index database directory (default: .codexlens or $CODEXLENS_DB_PATH)",
)
parser.add_argument(
"--verbose", "-v",
action="store_true",
help="Enable debug logging to stderr",
)
sub = parser.add_subparsers(dest="command")
# init
sub.add_parser("init", help="Initialize empty index")
# search
p_search = sub.add_parser("search", help="Search the index")
p_search.add_argument("--query", "-q", required=True, help="Search query")
p_search.add_argument("--top-k", "-k", type=int, default=10, help="Number of results")
# index-file
p_index = sub.add_parser("index-file", help="Index a single file")
p_index.add_argument("--file", "-f", required=True, help="File path to index")
p_index.add_argument("--root", "-r", help="Root directory for relative paths")
# remove-file
p_remove = sub.add_parser("remove-file", help="Remove a file from index")
p_remove.add_argument("--file", "-f", required=True, help="Relative file path to remove")
# sync
p_sync = sub.add_parser("sync", help="Sync index with directory")
p_sync.add_argument("--root", "-r", required=True, help="Root directory to sync")
p_sync.add_argument("--glob", "-g", default="**/*", help="Glob pattern (default: **/*)")
# watch
p_watch = sub.add_parser("watch", help="Watch directory for changes (JSONL output)")
p_watch.add_argument("--root", "-r", required=True, help="Root directory to watch")
p_watch.add_argument("--debounce-ms", type=int, default=500, help="Debounce interval in ms")
# download-models
p_dl = sub.add_parser("download-models", help="Download embed + reranker models")
p_dl.add_argument("--embed-model", help="Override embed model name")
# status
sub.add_parser("status", help="Report index statistics")
return parser
def main() -> None:
"""CLI entry point."""
parser = _build_parser()
args = parser.parse_args()
# Configure logging
if args.verbose:
logging.basicConfig(
level=logging.DEBUG,
format="%(levelname)s %(name)s: %(message)s",
stream=sys.stderr,
)
else:
logging.basicConfig(
level=logging.WARNING,
format="%(levelname)s: %(message)s",
stream=sys.stderr,
)
if not args.command:
parser.print_help(sys.stderr)
sys.exit(1)
dispatch = {
"init": cmd_init,
"search": cmd_search,
"index-file": cmd_index_file,
"remove-file": cmd_remove_file,
"sync": cmd_sync,
"watch": cmd_watch,
"download-models": cmd_download_models,
"status": cmd_status,
}
handler = dispatch.get(args.command)
if handler is None:
_error_exit(f"Unknown command: {args.command}")
try:
handler(args)
except KeyboardInterrupt:
sys.exit(130)
except SystemExit:
raise
except Exception as exc:
log.debug("Command failed", exc_info=True)
_error_exit(str(exc))
if __name__ == "__main__":
main()

View File

@@ -49,6 +49,9 @@ class Config:
reranker_api_model: str = ""
reranker_api_max_tokens_per_batch: int = 2048
# Metadata store
metadata_db_path: str = "" # empty = no metadata tracking
# FTS
fts_top_k: int = 50

View File

@@ -1,5 +1,6 @@
from __future__ import annotations
from .metadata import MetadataStore
from .pipeline import IndexingPipeline, IndexStats
__all__ = ["IndexingPipeline", "IndexStats"]
__all__ = ["IndexingPipeline", "IndexStats", "MetadataStore"]

View File

@@ -0,0 +1,165 @@
"""SQLite-backed metadata store for file-to-chunk mapping and tombstone tracking."""
from __future__ import annotations
import sqlite3
from pathlib import Path
class MetadataStore:
"""Tracks file-to-chunk mappings and deleted chunk IDs (tombstones).
Tables:
files - file_path (PK), content_hash, last_modified
chunks - chunk_id (PK), file_path (FK CASCADE), chunk_hash
deleted_chunks - chunk_id (PK) for tombstone tracking
"""
def __init__(self, db_path: str | Path) -> None:
self._conn = sqlite3.connect(str(db_path), check_same_thread=False)
self._conn.execute("PRAGMA foreign_keys = ON")
self._conn.execute("PRAGMA journal_mode = WAL")
self._create_tables()
def _create_tables(self) -> None:
self._conn.executescript("""
CREATE TABLE IF NOT EXISTS files (
file_path TEXT PRIMARY KEY,
content_hash TEXT NOT NULL,
last_modified REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
chunk_id INTEGER PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_hash TEXT NOT NULL DEFAULT '',
FOREIGN KEY (file_path) REFERENCES files(file_path) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS deleted_chunks (
chunk_id INTEGER PRIMARY KEY
);
""")
self._conn.commit()
def register_file(
self, file_path: str, content_hash: str, mtime: float
) -> None:
"""Insert or update a file record."""
self._conn.execute(
"INSERT OR REPLACE INTO files (file_path, content_hash, last_modified) "
"VALUES (?, ?, ?)",
(file_path, content_hash, mtime),
)
self._conn.commit()
def register_chunks(
self, file_path: str, chunk_ids_and_hashes: list[tuple[int, str]]
) -> None:
"""Register chunk IDs belonging to a file.
Args:
file_path: The owning file path (must already exist in files table).
chunk_ids_and_hashes: List of (chunk_id, chunk_hash) tuples.
"""
if not chunk_ids_and_hashes:
return
self._conn.executemany(
"INSERT OR REPLACE INTO chunks (chunk_id, file_path, chunk_hash) "
"VALUES (?, ?, ?)",
[(cid, file_path, chash) for cid, chash in chunk_ids_and_hashes],
)
self._conn.commit()
def mark_file_deleted(self, file_path: str) -> int:
"""Move all chunk IDs for a file to deleted_chunks, then remove the file.
Returns the number of chunks tombstoned.
"""
# Collect chunk IDs before CASCADE deletes them
rows = self._conn.execute(
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
).fetchall()
if not rows:
# Still remove the file record if it exists
self._conn.execute(
"DELETE FROM files WHERE file_path = ?", (file_path,)
)
self._conn.commit()
return 0
chunk_ids = [(r[0],) for r in rows]
self._conn.executemany(
"INSERT OR IGNORE INTO deleted_chunks (chunk_id) VALUES (?)",
chunk_ids,
)
# CASCADE deletes chunks rows automatically
self._conn.execute(
"DELETE FROM files WHERE file_path = ?", (file_path,)
)
self._conn.commit()
return len(chunk_ids)
def get_deleted_ids(self) -> set[int]:
"""Return all tombstoned chunk IDs for search-time filtering."""
rows = self._conn.execute(
"SELECT chunk_id FROM deleted_chunks"
).fetchall()
return {r[0] for r in rows}
def get_file_hash(self, file_path: str) -> str | None:
"""Return the stored content hash for a file, or None if not tracked."""
row = self._conn.execute(
"SELECT content_hash FROM files WHERE file_path = ?", (file_path,)
).fetchone()
return row[0] if row else None
def file_needs_update(self, file_path: str, content_hash: str) -> bool:
"""Check if a file needs re-indexing based on its content hash."""
stored = self.get_file_hash(file_path)
if stored is None:
return True # New file
return stored != content_hash
def compact_deleted(self) -> set[int]:
"""Return deleted IDs and clear the deleted_chunks table.
Call this after rebuilding the vector index to reclaim space.
"""
deleted = self.get_deleted_ids()
if deleted:
self._conn.execute("DELETE FROM deleted_chunks")
self._conn.commit()
return deleted
def get_chunk_ids_for_file(self, file_path: str) -> list[int]:
"""Return all chunk IDs belonging to a file."""
rows = self._conn.execute(
"SELECT chunk_id FROM chunks WHERE file_path = ?", (file_path,)
).fetchall()
return [r[0] for r in rows]
def get_all_files(self) -> dict[str, str]:
"""Return all tracked files as {file_path: content_hash}."""
rows = self._conn.execute(
"SELECT file_path, content_hash FROM files"
).fetchall()
return {r[0]: r[1] for r in rows}
def max_chunk_id(self) -> int:
"""Return the maximum chunk_id across chunks and deleted_chunks.
Returns -1 if no chunks exist, so that next_id = max_chunk_id() + 1
starts at 0 for an empty store.
"""
row = self._conn.execute(
"SELECT MAX(m) FROM ("
" SELECT MAX(chunk_id) AS m FROM chunks"
" UNION ALL"
" SELECT MAX(chunk_id) AS m FROM deleted_chunks"
")"
).fetchone()
return row[0] if row[0] is not None else -1
def close(self) -> None:
self._conn.close()
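The tombstone lifecycle implemented above can be condensed into raw sqlite3 to see the flow end to end: register a file's chunks, remove the file, and observe its chunk IDs moving into `deleted_chunks`. This is an in-memory sketch; unlike `MetadataStore`, it deletes chunk rows explicitly instead of relying on `ON DELETE CASCADE`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (file_path TEXT PRIMARY KEY, content_hash TEXT);
    CREATE TABLE chunks (chunk_id INTEGER PRIMARY KEY, file_path TEXT);
    CREATE TABLE deleted_chunks (chunk_id INTEGER PRIMARY KEY);
""")

# Index a file with two chunks.
conn.execute("INSERT INTO files VALUES ('a.py', 'hash1')")
conn.executemany("INSERT INTO chunks VALUES (?, 'a.py')", [(0,), (1,)])

# Remove the file: its chunk IDs become tombstones so search-time
# filtering can skip them until the vector index is compacted.
ids = [r[0] for r in conn.execute("SELECT chunk_id FROM chunks WHERE file_path='a.py'")]
conn.executemany("INSERT OR IGNORE INTO deleted_chunks VALUES (?)", [(i,) for i in ids])
conn.execute("DELETE FROM files WHERE file_path='a.py'")
conn.execute("DELETE FROM chunks WHERE file_path='a.py'")

deleted = {r[0] for r in conn.execute("SELECT chunk_id FROM deleted_chunks")}
```

Keeping tombstones in a separate table lets `get_deleted_ids()` stay a cheap single-table scan, while `compact_deleted()` can drain it once the vector stores are rebuilt.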

View File

@@ -5,6 +5,7 @@ The GIL is acceptable because embedding (onnxruntime) releases it in C extension
"""
from __future__ import annotations
import hashlib
import logging
import queue
import threading
@@ -18,6 +19,7 @@ from codexlens_search.config import Config
from codexlens_search.core.binary import BinaryStore
from codexlens_search.core.index import ANNIndex
from codexlens_search.embed.base import BaseEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.search.fts import FTSEngine
logger = logging.getLogger(__name__)
@@ -55,12 +57,14 @@ class IndexingPipeline:
ann_index: ANNIndex,
fts: FTSEngine,
config: Config,
metadata: MetadataStore | None = None,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
self._ann_index = ann_index
self._fts = fts
self._config = config
self._metadata = metadata
def index_files(
self,
@@ -275,3 +279,271 @@ class IndexingPipeline:
chunks.append(("".join(current), path))
return chunks
# ------------------------------------------------------------------
# Incremental API
# ------------------------------------------------------------------
@staticmethod
def _content_hash(text: str) -> str:
"""Compute SHA-256 hex digest of file content."""
return hashlib.sha256(text.encode("utf-8", errors="replace")).hexdigest()
def _require_metadata(self) -> MetadataStore:
"""Return metadata store or raise if not configured."""
if self._metadata is None:
raise RuntimeError(
"MetadataStore is required for incremental indexing. "
"Pass metadata= to IndexingPipeline.__init__."
)
return self._metadata
def _next_chunk_id(self) -> int:
"""Return the next available chunk ID from MetadataStore."""
meta = self._require_metadata()
return meta.max_chunk_id() + 1
def index_file(
self,
file_path: Path,
*,
root: Path | None = None,
force: bool = False,
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
max_file_size: int = 50_000,
) -> IndexStats:
"""Index a single file incrementally.
Skips files that have not changed (same content_hash) unless
*force* is True.
Args:
file_path: Path to the file to index.
root: Optional root for computing relative path identifiers.
force: Re-index even if content hash has not changed.
max_chunk_chars: Maximum characters per chunk.
chunk_overlap: Character overlap between consecutive chunks.
max_file_size: Skip files larger than this (bytes).
Returns:
IndexStats with counts and timing.
"""
meta = self._require_metadata()
t0 = time.monotonic()
# Read file
try:
if file_path.stat().st_size > max_file_size:
logger.debug("Skipping %s: exceeds max_file_size", file_path)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
text = file_path.read_text(encoding="utf-8", errors="replace")
except Exception as exc:
logger.debug("Skipping %s: %s", file_path, exc)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
content_hash = self._content_hash(text)
rel_path = str(file_path.relative_to(root)) if root else str(file_path)
# Check if update is needed
if not force and not meta.file_needs_update(rel_path, content_hash):
logger.debug("Skipping %s: unchanged", rel_path)
return IndexStats(duration_seconds=round(time.monotonic() - t0, 2))
# If file was previously indexed, remove old data first
if meta.get_file_hash(rel_path) is not None:
meta.mark_file_deleted(rel_path)
self._fts.delete_by_path(rel_path)
# Chunk
file_chunks = self._chunk_text(text, rel_path, max_chunk_chars, chunk_overlap)
if not file_chunks:
# Register file with no chunks
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
return IndexStats(
files_processed=1,
duration_seconds=round(time.monotonic() - t0, 2),
)
# Assign chunk IDs
start_id = self._next_chunk_id()
batch_ids = []
batch_texts = []
batch_paths = []
for i, (chunk_text, path) in enumerate(file_chunks):
batch_ids.append(start_id + i)
batch_texts.append(chunk_text)
batch_paths.append(path)
# Embed synchronously
vecs = self._embedder.embed_batch(batch_texts)
vec_array = np.array(vecs, dtype=np.float32)
id_array = np.array(batch_ids, dtype=np.int64)
# Index: write to stores
self._binary_store.add(id_array, vec_array)
self._ann_index.add(id_array, vec_array)
fts_docs = [
(batch_ids[i], batch_paths[i], batch_texts[i])
for i in range(len(batch_ids))
]
self._fts.add_documents(fts_docs)
# Register in metadata
meta.register_file(rel_path, content_hash, file_path.stat().st_mtime)
chunk_id_hashes = [
(batch_ids[i], self._content_hash(batch_texts[i]))
for i in range(len(batch_ids))
]
meta.register_chunks(rel_path, chunk_id_hashes)
# Flush stores
self._binary_store.save()
self._ann_index.save()
duration = time.monotonic() - t0
stats = IndexStats(
files_processed=1,
chunks_created=len(batch_ids),
duration_seconds=round(duration, 2),
)
logger.info(
"Indexed file %s: %d chunks in %.2fs",
rel_path, stats.chunks_created, stats.duration_seconds,
)
return stats
def remove_file(self, file_path: str) -> None:
"""Mark a file as deleted via tombstone strategy.
Marks all chunk IDs for the file in MetadataStore.deleted_chunks
and removes the file's FTS entries.
Args:
file_path: The relative path identifier of the file to remove.
"""
meta = self._require_metadata()
count = meta.mark_file_deleted(file_path)
fts_count = self._fts.delete_by_path(file_path)
logger.info(
"Removed file %s: %d chunks tombstoned, %d FTS entries deleted",
file_path, count, fts_count,
)
def sync(
self,
file_paths: list[Path],
*,
root: Path | None = None,
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
max_file_size: int = 50_000,
) -> IndexStats:
"""Reconcile index state against a current file list.
Identifies files that are new, changed, or removed and processes
each accordingly.
Args:
file_paths: Current list of files that should be indexed.
root: Optional root for computing relative path identifiers.
max_chunk_chars: Maximum characters per chunk.
chunk_overlap: Character overlap between consecutive chunks.
max_file_size: Skip files larger than this (bytes).
Returns:
Aggregated IndexStats for all operations.
"""
meta = self._require_metadata()
t0 = time.monotonic()
# Build set of current relative paths
current_rel_paths: dict[str, Path] = {}
for fpath in file_paths:
rel = str(fpath.relative_to(root)) if root else str(fpath)
current_rel_paths[rel] = fpath
# Get known files from metadata
known_files = meta.get_all_files() # {rel_path: content_hash}
# Detect removed files
removed = set(known_files.keys()) - set(current_rel_paths.keys())
for rel in removed:
self.remove_file(rel)
# Index new and changed files
total_files = 0
total_chunks = 0
for rel, fpath in current_rel_paths.items():
stats = self.index_file(
fpath,
root=root,
max_chunk_chars=max_chunk_chars,
chunk_overlap=chunk_overlap,
max_file_size=max_file_size,
)
total_files += stats.files_processed
total_chunks += stats.chunks_created
duration = time.monotonic() - t0
result = IndexStats(
files_processed=total_files,
chunks_created=total_chunks,
duration_seconds=round(duration, 2),
)
logger.info(
"Sync complete: %d files indexed, %d chunks created, "
"%d files removed in %.1fs",
result.files_processed, result.chunks_created,
len(removed), result.duration_seconds,
)
return result
def compact(self) -> None:
"""Rebuild indexes excluding tombstoned chunk IDs.
Reads all deleted IDs from MetadataStore, rebuilds BinaryStore
and ANNIndex without those entries, then clears the
deleted_chunks table.
"""
meta = self._require_metadata()
deleted_ids = meta.compact_deleted()
if not deleted_ids:
logger.debug("Compact: no deleted IDs, nothing to do")
return
logger.info("Compact: rebuilding indexes, excluding %d deleted IDs", len(deleted_ids))
# Rebuild BinaryStore: read current data, filter, replace
if self._binary_store._count > 0:
active_ids = self._binary_store._ids[: self._binary_store._count]
active_matrix = self._binary_store._matrix[: self._binary_store._count]
mask = ~np.isin(active_ids, list(deleted_ids))
kept_ids = active_ids[mask]
kept_matrix = active_matrix[mask]
# Reset store
self._binary_store._count = 0
self._binary_store._matrix = None
self._binary_store._ids = None
if len(kept_ids) > 0:
self._binary_store._ensure_capacity(len(kept_ids))
self._binary_store._matrix[: len(kept_ids)] = kept_matrix
self._binary_store._ids[: len(kept_ids)] = kept_ids
self._binary_store._count = len(kept_ids)
self._binary_store.save()
# ANNIndex: an hnswlib graph cannot be rebuilt without the original
# float32 vectors, and BinaryStore retains only quantized codes, so the
# HNSW index is left untouched here. Stale entries therefore remain in
# the ANN index after compact; run a full re-index (index_files over all
# files) when a clean ANN state is required.
logger.info(
"Compact: BinaryStore rebuilt (%d entries kept). "
"Note: ANNIndex retains stale entries; run full re-index for clean ANN state.",
self._binary_store._count,
)
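The three-way reconciliation that `sync()` performs above (removed vs. new vs. changed files) can be sketched with plain dicts of `{relative_path: content_hash}`. This is an illustrative model, not the pipeline code; the real implementation consults MetadataStore and reindexes via `index_file()`.

```python
def reconcile(known: dict[str, str], current: dict[str, str]):
    """Classify files as removed, new, or changed by comparing hashes."""
    removed = set(known) - set(current)
    new = set(current) - set(known)
    changed = {p for p in set(current) & set(known) if current[p] != known[p]}
    return removed, new, changed

# b.py disappeared, c.py is new, a.py is unchanged
removed, new, changed = reconcile(
    {"a.py": "h1", "b.py": "h2"},
    {"a.py": "h1", "c.py": "h3"},
)
```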


@@ -67,3 +67,28 @@ class FTSEngine:
"SELECT content FROM docs WHERE rowid = ?", (doc_id,)
).fetchone()
return row[0] if row else ""
def get_chunk_ids_by_path(self, path: str) -> list[int]:
"""Return all doc IDs associated with a given file path."""
rows = self._conn.execute(
"SELECT id FROM docs_meta WHERE path = ?", (path,)
).fetchall()
return [r[0] for r in rows]
def delete_by_path(self, path: str) -> int:
"""Delete all docs and docs_meta rows for a given file path.
Returns the number of deleted documents.
"""
ids = self.get_chunk_ids_by_path(path)
if not ids:
return 0
placeholders = ",".join("?" for _ in ids)
self._conn.execute(
f"DELETE FROM docs WHERE rowid IN ({placeholders})", ids
)
self._conn.execute(
f"DELETE FROM docs_meta WHERE id IN ({placeholders})", ids
)
self._conn.commit()
return len(ids)
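The `delete_by_path` method above builds an `IN (...)` clause from dynamically generated `?` placeholders so the id list is passed as bound parameters rather than interpolated into SQL. A minimal standalone sketch of the same pattern, using a simplified schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs_meta (id INTEGER PRIMARY KEY, path TEXT)")
conn.executemany(
    "INSERT INTO docs_meta VALUES (?, ?)",
    [(1, "a.py"), (2, "a.py"), (3, "b.py")],
)

# Fetch ids for one path, then delete them with bound parameters
ids = [r[0] for r in conn.execute(
    "SELECT id FROM docs_meta WHERE path = ?", ("a.py",)).fetchall()]
placeholders = ",".join("?" for _ in ids)  # "?,?" for two ids
conn.execute(f"DELETE FROM docs_meta WHERE id IN ({placeholders})", ids)
remaining = conn.execute("SELECT COUNT(*) FROM docs_meta").fetchone()[0]
```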


@@ -9,6 +9,7 @@ import numpy as np
from ..config import Config
from ..core import ANNIndex, BinaryStore
from ..embed import BaseEmbedder
from ..indexing.metadata import MetadataStore
from ..rerank import BaseReranker
from .fts import FTSEngine
from .fusion import (
@@ -38,6 +39,7 @@ class SearchPipeline:
reranker: BaseReranker,
fts: FTSEngine,
config: Config,
metadata_store: MetadataStore | None = None,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
@@ -45,6 +47,7 @@ class SearchPipeline:
self._reranker = reranker
self._fts = fts
self._config = config
self._metadata_store = metadata_store
# -- Helper: vector search (binary coarse + ANN fine) -----------------
@@ -137,6 +140,16 @@ class SearchPipeline:
fused = reciprocal_rank_fusion(fusion_input, weights=weights, k=cfg.fusion_k)
# 4b. Filter out deleted IDs (tombstone filtering)
if self._metadata_store is not None:
deleted_ids = self._metadata_store.get_deleted_ids()
if deleted_ids:
fused = [
(doc_id, score)
for doc_id, score in fused
if doc_id not in deleted_ids
]
# 5. Rerank top candidates
rerank_ids = [doc_id for doc_id, _ in fused[:50]]
contents = [self._fts.get_content(doc_id) for doc_id in rerank_ids]
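Steps 4 and 4b above (fusion, then tombstone filtering) can be sketched as follows. The scoring function here is a generic weighted reciprocal rank fusion; the actual formula and constants live in `fusion.reciprocal_rank_fusion` and may differ.

```python
def rrf(rankings: list[list[int]], weights: list[float], k: int = 60):
    """Weighted RRF: each list contributes w / (k + rank + 1) per doc."""
    scores: dict[int, float] = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank + 1)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Fuse a vector ranking with an FTS ranking, then drop tombstoned ids
fused = rrf([[1, 2, 3], [3, 1, 4]], weights=[1.0, 1.0])
deleted = {3}
fused = [(doc_id, score) for doc_id, score in fused if doc_id not in deleted]
```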


@@ -0,0 +1,17 @@
"""File watcher and incremental indexer for codexlens-search.
Requires the ``watcher`` extra::
pip install codexlens-search[watcher]
"""
from codexlens_search.watcher.events import ChangeType, FileEvent, WatcherConfig
from codexlens_search.watcher.file_watcher import FileWatcher
from codexlens_search.watcher.incremental_indexer import IncrementalIndexer
__all__ = [
"ChangeType",
"FileEvent",
"FileWatcher",
"IncrementalIndexer",
"WatcherConfig",
]


@@ -0,0 +1,57 @@
"""Event types for file watcher."""
from __future__ import annotations
import time
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Optional, Set
class ChangeType(Enum):
"""Type of file system change."""
CREATED = "created"
MODIFIED = "modified"
DELETED = "deleted"
@dataclass
class FileEvent:
"""A file system change event."""
path: Path
change_type: ChangeType
timestamp: float = field(default_factory=time.time)
@dataclass
class WatcherConfig:
"""Configuration for file watcher.
Attributes:
debounce_ms: Milliseconds to wait after the last event before
flushing the batch. Default 500ms for low-latency indexing.
ignored_patterns: Directory/file name patterns to skip. Any
path component matching one of these strings is ignored.
"""
debounce_ms: int = 500
ignored_patterns: Set[str] = field(default_factory=lambda: {
# Version control
".git", ".svn", ".hg",
# Python
".venv", "venv", "env", "__pycache__", ".pytest_cache",
".mypy_cache", ".ruff_cache",
# Node.js
"node_modules", "bower_components",
# Build artifacts
"dist", "build", "out", "target", "bin", "obj",
"coverage", "htmlcov",
# IDE / Editor
".idea", ".vscode", ".vs",
# Package / cache
".cache", ".parcel-cache", ".turbo", ".next", ".nuxt",
# Logs / temp
"logs", "tmp", "temp",
})
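As the docstring notes, `ignored_patterns` is matched against individual path components, not globs: a path is skipped if any component equals an ignored name. A minimal sketch of that check:

```python
from pathlib import Path

IGNORED = {".git", "node_modules", "__pycache__"}  # abbreviated set

def should_watch(path: Path) -> bool:
    """True unless some path component matches an ignored pattern exactly."""
    return not any(part in IGNORED for part in path.parts)

should_watch(Path("src/main.py"))              # watched
should_watch(Path("node_modules/pkg/x.js"))    # ignored
```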


@@ -0,0 +1,263 @@
"""File system watcher using watchdog library.
Ported from codex-lens v1 with simplifications:
- Removed v1-specific Config dependency (uses WatcherConfig directly)
- Removed MAX_QUEUE_SIZE (v2 processes immediately via debounce)
- Removed flush.signal file mechanism
- Added optional JSONL output mode for bridge CLI integration
"""
from __future__ import annotations
import json
import logging
import sys
import threading
import time
from pathlib import Path
from typing import Callable, Dict, List, Optional
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
from .events import ChangeType, FileEvent, WatcherConfig
logger = logging.getLogger(__name__)
# Event priority for deduplication: higher wins when same file appears
# multiple times within one debounce window.
_EVENT_PRIORITY: Dict[ChangeType, int] = {
ChangeType.CREATED: 1,
ChangeType.MODIFIED: 2,
ChangeType.DELETED: 3,
}
class _Handler(FileSystemEventHandler):
"""Internal watchdog handler that converts events to FileEvent."""
def __init__(self, watcher: FileWatcher) -> None:
super().__init__()
self._watcher = watcher
def on_created(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.CREATED)
def on_modified(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.MODIFIED)
def on_deleted(self, event) -> None:
if not event.is_directory:
self._watcher._on_raw_event(event.src_path, ChangeType.DELETED)
def on_moved(self, event) -> None:
if event.is_directory:
return
# Treat move as delete old + create new
self._watcher._on_raw_event(event.src_path, ChangeType.DELETED)
self._watcher._on_raw_event(event.dest_path, ChangeType.CREATED)
class FileWatcher:
"""File system watcher with debounce and event deduplication.
Monitors a directory recursively using watchdog. Raw events are
collected into a queue. After *debounce_ms* of silence the queue
is flushed: events are deduplicated per-path (keeping the highest
priority change type) and delivered via *on_changes*.
Example::
def handle(events: list[FileEvent]) -> None:
for e in events:
print(e.change_type.value, e.path)
watcher = FileWatcher(Path("."), WatcherConfig(), handle)
watcher.start()
watcher.wait()
"""
def __init__(
self,
root_path: Path,
config: WatcherConfig,
on_changes: Callable[[List[FileEvent]], None],
) -> None:
self.root_path = Path(root_path).resolve()
self.config = config
self.on_changes = on_changes
self._observer: Optional[Observer] = None
self._running = False
self._stop_event = threading.Event()
self._lock = threading.RLock()
# Pending events keyed by resolved path
self._pending: Dict[Path, FileEvent] = {}
self._pending_lock = threading.Lock()
# True-debounce timer: resets on every new event
self._flush_timer: Optional[threading.Timer] = None
# ------------------------------------------------------------------
# Filtering
# ------------------------------------------------------------------
def _should_watch(self, path: Path) -> bool:
"""Return True if *path* should not be ignored."""
parts = path.parts
for pattern in self.config.ignored_patterns:
if pattern in parts:
return False
return True
# ------------------------------------------------------------------
# Event intake (called from watchdog thread)
# ------------------------------------------------------------------
def _on_raw_event(self, raw_path: str, change_type: ChangeType) -> None:
"""Accept a raw watchdog event, filter, and queue with debounce."""
path = Path(raw_path).resolve()
if not self._should_watch(path):
return
event = FileEvent(path=path, change_type=change_type)
with self._pending_lock:
existing = self._pending.get(path)
if existing is None or _EVENT_PRIORITY[change_type] >= _EVENT_PRIORITY[existing.change_type]:
self._pending[path] = event
# Cancel previous timer and start a new one (true debounce)
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = threading.Timer(
self.config.debounce_ms / 1000.0,
self._flush,
)
self._flush_timer.daemon = True
self._flush_timer.start()
# ------------------------------------------------------------------
# Flush
# ------------------------------------------------------------------
def _flush(self) -> None:
"""Deduplicate and deliver pending events."""
with self._pending_lock:
if not self._pending:
return
events = list(self._pending.values())
self._pending.clear()
self._flush_timer = None
try:
self.on_changes(events)
except Exception:
logger.exception("Error in on_changes callback")
def flush_now(self) -> None:
"""Immediately flush pending events (manual trigger)."""
with self._pending_lock:
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = None
self._flush()
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
def start(self) -> None:
"""Start watching the directory (non-blocking)."""
with self._lock:
if self._running:
logger.warning("Watcher already running")
return
if not self.root_path.exists():
raise ValueError(f"Root path does not exist: {self.root_path}")
self._observer = Observer()
handler = _Handler(self)
self._observer.schedule(handler, str(self.root_path), recursive=True)
self._running = True
self._stop_event.clear()
self._observer.start()
logger.info("Started watching: %s", self.root_path)
def stop(self) -> None:
"""Stop watching and flush remaining events."""
with self._lock:
if not self._running:
return
self._running = False
self._stop_event.set()
with self._pending_lock:
if self._flush_timer is not None:
self._flush_timer.cancel()
self._flush_timer = None
if self._observer is not None:
self._observer.stop()
self._observer.join(timeout=5.0)
self._observer = None
# Deliver any remaining events
self._flush()
logger.info("Stopped watching: %s", self.root_path)
def wait(self) -> None:
"""Block until stopped (Ctrl+C or stop() from another thread)."""
try:
while self._running:
self._stop_event.wait(timeout=1.0)
except KeyboardInterrupt:
logger.info("Received interrupt, stopping watcher...")
self.stop()
@property
def is_running(self) -> bool:
"""True if the watcher is currently running."""
return self._running
# ------------------------------------------------------------------
# JSONL output helper
# ------------------------------------------------------------------
@staticmethod
def events_to_jsonl(events: List[FileEvent]) -> str:
"""Serialize a batch of events as newline-delimited JSON.
Each line is a JSON object with keys: ``path``, ``change_type``,
``timestamp``. Useful for bridge CLI integration.
"""
lines: list[str] = []
for evt in events:
obj = {
"path": str(evt.path),
"change_type": evt.change_type.value,
"timestamp": evt.timestamp,
}
lines.append(json.dumps(obj, ensure_ascii=False))
return "\n".join(lines)
@staticmethod
def jsonl_callback(events: List[FileEvent]) -> None:
"""Callback that writes JSONL to stdout.
Suitable as *on_changes* when running in bridge/CLI mode::
watcher = FileWatcher(root, config, FileWatcher.jsonl_callback)
"""
output = FileWatcher.events_to_jsonl(events)
if output:
sys.stdout.write(output + "\n")
sys.stdout.flush()
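The per-path deduplication applied during a debounce window can be modeled without threads: a later event replaces an earlier one for the same path only when its priority is greater than or equal (so DELETED beats MODIFIED beats CREATED). A synchronous sketch of that rule, with string change types standing in for `ChangeType`:

```python
PRIORITY = {"created": 1, "modified": 2, "deleted": 3}

def dedup(events: list[tuple[str, str]]) -> dict[str, str]:
    """Keep one event per path, preferring the higher-priority change."""
    pending: dict[str, str] = {}
    for path, change in events:
        prev = pending.get(path)
        if prev is None or PRIORITY[change] >= PRIORITY[prev]:
            pending[path] = change
    return pending

batch = dedup([("a.py", "created"), ("a.py", "modified"), ("b.py", "deleted")])
```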


@@ -0,0 +1,129 @@
"""Incremental indexer that processes FileEvents via IndexingPipeline.
Ported from codex-lens v1 with simplifications:
- Uses IndexingPipeline.index_file() / remove_file() directly
- No v1-specific Config, ParserFactory, DirIndexStore dependencies
- Per-file error isolation: one failure does not stop batch processing
"""
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional
from codexlens_search.indexing.pipeline import IndexingPipeline
from .events import ChangeType, FileEvent
logger = logging.getLogger(__name__)
@dataclass
class BatchResult:
"""Result of processing a batch of file events."""
files_indexed: int = 0
files_removed: int = 0
chunks_created: int = 0
errors: List[str] = field(default_factory=list)
@property
def total_processed(self) -> int:
return self.files_indexed + self.files_removed
@property
def has_errors(self) -> bool:
return len(self.errors) > 0
class IncrementalIndexer:
"""Routes file change events to IndexingPipeline operations.
CREATED / MODIFIED events call ``pipeline.index_file()``.
DELETED events call ``pipeline.remove_file()``.
Each file is processed in isolation so that a single failure
does not prevent the rest of the batch from being indexed.
Example::
indexer = IncrementalIndexer(pipeline, root=Path("/project"))
result = indexer.process_events([
FileEvent(Path("src/main.py"), ChangeType.MODIFIED),
])
print(f"Indexed {result.files_indexed}, removed {result.files_removed}")
"""
def __init__(
self,
pipeline: IndexingPipeline,
*,
root: Optional[Path] = None,
) -> None:
"""Initialize the incremental indexer.
Args:
pipeline: The indexing pipeline with metadata store configured.
root: Optional project root for computing relative paths.
If None, absolute paths are used as identifiers.
"""
self._pipeline = pipeline
self._root = root
def process_events(self, events: List[FileEvent]) -> BatchResult:
"""Process a batch of file events with per-file error isolation.
Args:
events: List of file events to process.
Returns:
BatchResult with per-batch statistics.
"""
result = BatchResult()
for event in events:
try:
if event.change_type in (ChangeType.CREATED, ChangeType.MODIFIED):
self._handle_index(event, result)
elif event.change_type == ChangeType.DELETED:
self._handle_remove(event, result)
except Exception as exc:
error_msg = (
f"Error processing {event.path} "
f"({event.change_type.value}): "
f"{type(exc).__name__}: {exc}"
)
logger.error(error_msg)
result.errors.append(error_msg)
if result.total_processed > 0:
logger.info(
"Batch complete: %d indexed, %d removed, %d errors",
result.files_indexed,
result.files_removed,
len(result.errors),
)
return result
def _handle_index(self, event: FileEvent, result: BatchResult) -> None:
"""Index a created or modified file."""
stats = self._pipeline.index_file(
event.path,
root=self._root,
force=(event.change_type == ChangeType.MODIFIED),
)
if stats.files_processed > 0:
result.files_indexed += 1
result.chunks_created += stats.chunks_created
def _handle_remove(self, event: FileEvent, result: BatchResult) -> None:
"""Remove a deleted file from the index."""
rel_path = (
str(event.path.relative_to(self._root))
if self._root
else str(event.path)
)
self._pipeline.remove_file(rel_path)
result.files_removed += 1
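The per-file error isolation described above reduces to wrapping each event in its own try/except so one failure cannot abort the batch. A stripped-down sketch with a hypothetical `handler`:

```python
def process(events, handler):
    """Process each event in isolation; collect errors instead of raising."""
    ok, errors = 0, []
    for ev in events:
        try:
            handler(ev)
            ok += 1
        except Exception as exc:
            errors.append(f"{ev}: {type(exc).__name__}: {exc}")
    return ok, errors

def handler(ev):
    if ev == "bad":
        raise ValueError("boom")

ok, errors = process(["good", "bad", "also_good"], handler)
```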


@@ -0,0 +1,388 @@
"""Unit tests for IndexingPipeline incremental API (index_file, remove_file, sync, compact)."""
from __future__ import annotations
from pathlib import Path
import numpy as np
import pytest
from codexlens_search.config import Config
from codexlens_search.core.binary import BinaryStore
from codexlens_search.core.index import ANNIndex
from codexlens_search.embed.base import BaseEmbedder
from codexlens_search.indexing.metadata import MetadataStore
from codexlens_search.indexing.pipeline import IndexingPipeline, IndexStats
from codexlens_search.search.fts import FTSEngine
DIM = 32
class FakeEmbedder(BaseEmbedder):
"""Deterministic embedder for testing."""
def __init__(self) -> None:
pass
def embed_single(self, text: str) -> np.ndarray:
rng = np.random.default_rng(hash(text) % (2**31))
return rng.standard_normal(DIM).astype(np.float32)
def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
return [self.embed_single(t) for t in texts]
@pytest.fixture
def workspace(tmp_path: Path):
"""Create workspace with stores, metadata, and pipeline."""
cfg = Config.small()
# Override embed_dim to match our test dim
cfg.embed_dim = DIM
store_dir = tmp_path / "stores"
store_dir.mkdir()
binary_store = BinaryStore(store_dir, DIM, cfg)
ann_index = ANNIndex(store_dir, DIM, cfg)
fts = FTSEngine(str(store_dir / "fts.db"))
metadata = MetadataStore(str(store_dir / "metadata.db"))
embedder = FakeEmbedder()
pipeline = IndexingPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
fts=fts,
config=cfg,
metadata=metadata,
)
# Create sample source files
src_dir = tmp_path / "src"
src_dir.mkdir()
return {
"pipeline": pipeline,
"metadata": metadata,
"binary_store": binary_store,
"ann_index": ann_index,
"fts": fts,
"src_dir": src_dir,
"store_dir": store_dir,
"config": cfg,
}
def _write_file(src_dir: Path, name: str, content: str) -> Path:
"""Write a file and return its path."""
p = src_dir / name
p.write_text(content, encoding="utf-8")
return p
# ---------------------------------------------------------------------------
# MetadataStore helper method tests
# ---------------------------------------------------------------------------
class TestMetadataHelpers:
def test_get_all_files_empty(self, workspace):
meta = workspace["metadata"]
assert meta.get_all_files() == {}
def test_get_all_files_after_register(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_file("b.py", "hash_b", 2000.0)
result = meta.get_all_files()
assert result == {"a.py": "hash_a", "b.py": "hash_b"}
def test_max_chunk_id_empty(self, workspace):
meta = workspace["metadata"]
assert meta.max_chunk_id() == -1
def test_max_chunk_id_with_chunks(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_chunks("a.py", [(0, "h0"), (1, "h1"), (5, "h5")])
assert meta.max_chunk_id() == 5
def test_max_chunk_id_includes_deleted(self, workspace):
meta = workspace["metadata"]
meta.register_file("a.py", "hash_a", 1000.0)
meta.register_chunks("a.py", [(0, "h0"), (3, "h3")])
meta.mark_file_deleted("a.py")
# Chunks moved to deleted_chunks, max should still be 3
assert meta.max_chunk_id() == 3
# ---------------------------------------------------------------------------
# index_file tests
# ---------------------------------------------------------------------------
class TestIndexFile:
def test_index_file_basic(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "hello.py", "print('hello world')\n")
stats = pipeline.index_file(f, root=src_dir)
assert stats.files_processed == 1
assert stats.chunks_created >= 1
assert meta.get_file_hash("hello.py") is not None
assert len(meta.get_chunk_ids_for_file("hello.py")) >= 1
def test_index_file_skips_unchanged(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "same.py", "x = 1\n")
stats1 = pipeline.index_file(f, root=src_dir)
assert stats1.files_processed == 1
stats2 = pipeline.index_file(f, root=src_dir)
assert stats2.files_processed == 0
assert stats2.chunks_created == 0
def test_index_file_force_reindex(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "force.py", "x = 1\n")
pipeline.index_file(f, root=src_dir)
stats = pipeline.index_file(f, root=src_dir, force=True)
assert stats.files_processed == 1
assert stats.chunks_created >= 1
def test_index_file_updates_changed_file(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "changing.py", "version = 1\n")
pipeline.index_file(f, root=src_dir)
old_chunks = meta.get_chunk_ids_for_file("changing.py")
# Modify file
f.write_text("version = 2\nmore code\n", encoding="utf-8")
stats = pipeline.index_file(f, root=src_dir)
assert stats.files_processed == 1
new_chunks = meta.get_chunk_ids_for_file("changing.py")
# Old chunks should have been tombstoned, new ones assigned
assert set(old_chunks) != set(new_chunks)
def test_index_file_registers_in_metadata(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
fts = workspace["fts"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "meta_test.py", "def foo(): pass\n")
pipeline.index_file(f, root=src_dir)
# MetadataStore has file registered
assert meta.get_file_hash("meta_test.py") is not None
chunk_ids = meta.get_chunk_ids_for_file("meta_test.py")
assert len(chunk_ids) >= 1
# FTS has the content
fts_ids = fts.get_chunk_ids_by_path("meta_test.py")
assert len(fts_ids) >= 1
def test_index_file_no_metadata_raises(self, workspace):
cfg = workspace["config"]
pipeline_no_meta = IndexingPipeline(
embedder=FakeEmbedder(),
binary_store=workspace["binary_store"],
ann_index=workspace["ann_index"],
fts=workspace["fts"],
config=cfg,
)
f = _write_file(workspace["src_dir"], "no_meta.py", "x = 1\n")
with pytest.raises(RuntimeError, match="MetadataStore is required"):
pipeline_no_meta.index_file(f)
# ---------------------------------------------------------------------------
# remove_file tests
# ---------------------------------------------------------------------------
class TestRemoveFile:
def test_remove_file_tombstones_and_fts(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
fts = workspace["fts"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "to_remove.py", "data = [1, 2, 3]\n")
pipeline.index_file(f, root=src_dir)
chunk_ids = meta.get_chunk_ids_for_file("to_remove.py")
assert len(chunk_ids) >= 1
pipeline.remove_file("to_remove.py")
# File should be gone from metadata
assert meta.get_file_hash("to_remove.py") is None
assert meta.get_chunk_ids_for_file("to_remove.py") == []
# Chunks should be in deleted_chunks
deleted = meta.get_deleted_ids()
for cid in chunk_ids:
assert cid in deleted
# FTS should be cleared
assert fts.get_chunk_ids_by_path("to_remove.py") == []
def test_remove_nonexistent_file(self, workspace):
pipeline = workspace["pipeline"]
# Should not raise
pipeline.remove_file("nonexistent.py")
# ---------------------------------------------------------------------------
# sync tests
# ---------------------------------------------------------------------------
class TestSync:
def test_sync_indexes_new_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "a.py", "a = 1\n")
f2 = _write_file(src_dir, "b.py", "b = 2\n")
stats = pipeline.sync([f1, f2], root=src_dir)
assert stats.files_processed == 2
assert meta.get_file_hash("a.py") is not None
assert meta.get_file_hash("b.py") is not None
def test_sync_removes_missing_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "keep.py", "keep = True\n")
f2 = _write_file(src_dir, "remove.py", "remove = True\n")
pipeline.sync([f1, f2], root=src_dir)
assert meta.get_file_hash("remove.py") is not None
# Sync with only f1 -- f2 should be removed
stats = pipeline.sync([f1], root=src_dir)
assert meta.get_file_hash("remove.py") is None
deleted = meta.get_deleted_ids()
assert len(deleted) > 0
def test_sync_detects_changed_files(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "mutable.py", "v1\n")
pipeline.sync([f], root=src_dir)
old_hash = meta.get_file_hash("mutable.py")
f.write_text("v2\n", encoding="utf-8")
stats = pipeline.sync([f], root=src_dir)
assert stats.files_processed == 1
new_hash = meta.get_file_hash("mutable.py")
assert old_hash != new_hash
def test_sync_skips_unchanged(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "stable.py", "stable = True\n")
pipeline.sync([f], root=src_dir)
# Second sync with same file, unchanged
stats = pipeline.sync([f], root=src_dir)
assert stats.files_processed == 0
assert stats.chunks_created == 0
# ---------------------------------------------------------------------------
# compact tests
# ---------------------------------------------------------------------------
class TestCompact:
def test_compact_removes_tombstoned_from_binary_store(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
binary_store = workspace["binary_store"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "alive.py", "alive = True\n")
f2 = _write_file(src_dir, "dead.py", "dead = True\n")
pipeline.index_file(f1, root=src_dir)
pipeline.index_file(f2, root=src_dir)
count_before = binary_store._count
assert count_before >= 2
pipeline.remove_file("dead.py")
pipeline.compact()
# BinaryStore should have fewer entries
assert binary_store._count < count_before
# deleted_chunks should be cleared
assert meta.get_deleted_ids() == set()
def test_compact_noop_when_no_deletions(self, workspace):
pipeline = workspace["pipeline"]
meta = workspace["metadata"]
binary_store = workspace["binary_store"]
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "solo.py", "solo = True\n")
pipeline.index_file(f, root=src_dir)
count_before = binary_store._count
pipeline.compact()
assert binary_store._count == count_before
# ---------------------------------------------------------------------------
# Backward compatibility: existing batch API still works
# ---------------------------------------------------------------------------
class TestBatchAPIUnchanged:
def test_index_files_still_works(self, workspace):
pipeline = workspace["pipeline"]
src_dir = workspace["src_dir"]
f1 = _write_file(src_dir, "batch1.py", "batch1 = 1\n")
f2 = _write_file(src_dir, "batch2.py", "batch2 = 2\n")
stats = pipeline.index_files([f1, f2], root=src_dir)
assert stats.files_processed == 2
assert stats.chunks_created >= 2
def test_index_files_works_without_metadata(self, workspace):
"""Batch API should work even without MetadataStore."""
cfg = workspace["config"]
pipeline_no_meta = IndexingPipeline(
embedder=FakeEmbedder(),
binary_store=BinaryStore(workspace["store_dir"] / "no_meta", DIM, cfg),
ann_index=ANNIndex(workspace["store_dir"] / "no_meta", DIM, cfg),
fts=FTSEngine(str(workspace["store_dir"] / "no_meta_fts.db")),
config=cfg,
)
src_dir = workspace["src_dir"]
f = _write_file(src_dir, "no_meta_batch.py", "x = 1\n")
stats = pipeline_no_meta.index_files([f], root=src_dir)
assert stats.files_processed == 1