Implement search and reranking functionality with FTS and embedding support

- Added BaseReranker abstract class for defining reranking interfaces.
- Implemented FastEmbedReranker using fastembed's TextCrossEncoder for scoring document-query pairs.
- Introduced FTSEngine for full-text search capabilities using SQLite FTS5.
- Developed SearchPipeline to integrate embedding, binary search, ANN indexing, FTS, and reranking.
- Added fusion methods for combining results from different search strategies using Reciprocal Rank Fusion.
- Created unit and integration tests for the new search and reranking components.
- Established configuration management for search parameters and models.
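The Reciprocal Rank Fusion step can be sketched as follows (an illustrative sketch only, not the code in this commit; the function name `rrfFuse` and the conventional constant k = 60 are assumptions based on the standard RRF formulation):

```javascript
// Reciprocal Rank Fusion: each document scores sum(1 / (k + rank_i)) over
// every ranked list it appears in; k = 60 is the conventional default.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// Fuse hypothetical FTS and ANN result lists.
const fused = rrfFuse([
  ["doc3", "doc1", "doc2"], // full-text search ranking
  ["doc1", "doc2", "doc4"], // vector search ranking
]);
console.log(fused); // → ["doc1", "doc2", "doc3", "doc4"]
```

Documents ranked well by both strategies (doc1, doc2) float above documents seen by only one list.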
This commit is contained in:
catlog22
2026-03-16 23:03:17 +08:00
parent 5a4b18d9b1
commit de4158597b
41 changed files with 2655 additions and 1848 deletions

View File

@@ -1,76 +0,0 @@
## Context Acquisition (MCP Tools Priority)
**For task context gathering and analysis, ALWAYS prefer MCP tools**:
1. **mcp__ace-tool__search_context** - HIGHEST PRIORITY for code discovery
- Semantic search with real-time codebase index
- Use for: finding implementations, understanding architecture, locating patterns
- Example: `mcp__ace-tool__search_context(project_root_path="/path", query="authentication logic")`
2. **smart_search** - Fallback for structured search
- Use `smart_search(query="...")` for keyword/regex search
- Use `smart_search(action="find_files", pattern="*.ts")` for file discovery
- Supports modes: `auto`, `hybrid`, `exact`, `ripgrep`
3. **read_file** - Batch file reading
- Read multiple files in parallel: `read_file(path="file1.ts")`, `read_file(path="file2.ts")`
- Supports glob patterns: `read_file(path="src/**/*.config.ts")`
**Priority Order**:
```
ACE search_context (semantic) → smart_search (structured) → read_file (batch read) → shell commands (fallback)
```
**NEVER** use shell commands (`cat`, `find`, `grep`) when MCP tools are available.
### read_file - Read File Contents
**When**: Read files found by smart_search
**How**:
```javascript
read_file(path="/path/to/file.ts") // Single file
read_file(path="/src/**/*.config.ts") // Pattern matching
```
---
### edit_file - Modify Files
**When**: Built-in Edit tool fails or need advanced features
**How**:
```javascript
edit_file(path="/file.ts", old_string="...", new_string="...", mode="update")
edit_file(path="/file.ts", line=10, content="...", mode="insert_after")
```
**Modes**: `update` (replace text), `insert_after`, `insert_before`, `delete_line`
---
### write_file - Create/Overwrite Files
**When**: Create new files or completely replace content
**How**:
```javascript
write_file(path="/new-file.ts", content="...")
```
---
### Exa - External Search
**When**: Find documentation/examples outside codebase
**How**:
```javascript
mcp__exa__search(query="React hooks 2025 documentation")
mcp__exa__search(query="FastAPI auth example", numResults=10)
mcp__exa__search(query="latest API docs", livecrawl="always")
```
**Parameters**:
- `query` (required): Search query string
- `numResults` (optional): Number of results to return (default: 5)
- `livecrawl` (optional): `"always"` or `"fallback"` for live crawling

View File

@@ -1,64 +0,0 @@
# File Modification
Before modifying files, always:
- Try built-in Edit tool first
- Escalate to MCP tools when built-ins fail
- Use write_file only as last resort
## MCP Tools Usage
### edit_file - Modify Files
**When**: Built-in Edit fails, need dry-run preview, or need line-based operations
**How**:
```javascript
edit_file(path="/file.ts", oldText="old", newText="new") // Replace text
edit_file(path="/file.ts", oldText="old", newText="new", dryRun=true) // Preview diff
edit_file(path="/file.ts", oldText="old", newText="new", replaceAll=true) // Replace all
edit_file(path="/file.ts", mode="line", operation="insert_after", line=10, text="new line")
edit_file(path="/file.ts", mode="line", operation="delete", line=5, end_line=8)
```
**Modes**: `update` (replace text, default), `line` (line-based operations)
**Operations** (line mode): `insert_before`, `insert_after`, `replace`, `delete`
---
### write_file - Create/Overwrite Files
**When**: Create new files, completely replace content, or edit_file still fails
**How**:
```javascript
write_file(path="/new-file.ts", content="file content here")
write_file(path="/existing.ts", content="...", backup=true) // Create backup first
```
---
## Priority Logic
> **Note**: Search priority is defined in `context-tools.md` - smart_search has HIGHEST PRIORITY for all discovery tasks.
**Search & Discovery** (defer to context-tools.md):
1. **smart_search FIRST** for any code/file discovery
2. Built-in Grep only for single-file exact line search (location already confirmed)
3. Exa for external/public knowledge
**File Reading**:
1. Unknown location → **smart_search first**, then Read
2. Known confirmed file → Built-in Read directly
3. Pattern matching → smart_search (action="find_files")
**File Editing**:
1. Always try built-in Edit first
2. Fails 1+ times → edit_file (MCP)
3. Still fails → write_file (MCP)
## Decision Triggers
**Search tasks** → Always start with smart_search (per context-tools.md)
**Known file edits** → Start with built-in Edit, escalate to MCP if fails
**External knowledge** → Use Exa

View File

@@ -1,336 +0,0 @@
# Review Directory Specification
## Overview
Unified directory structure for all review commands (session-based and module-based) within workflow sessions.
## Core Principles
1. **Session-Based**: All reviews run within a workflow session context
2. **Unified Structure**: Same directory layout for all review types
3. **Type Differentiation**: Review type indicated by metadata, not directory structure
4. **Progressive Creation**: Directories created on-demand during review execution
5. **Archive Support**: Reviews archived with their parent session
## Directory Structure
### Base Location
```
.workflow/active/WFS-{session-id}/.review/
```
### Complete Structure
```
.workflow/active/WFS-{session-id}/.review/
├── review-state.json # Review orchestrator state machine
├── review-progress.json # Real-time progress for dashboard polling
├── review-metadata.json # Review configuration and scope
├── dimensions/ # Per-dimension analysis results
│ ├── security.json
│ ├── architecture.json
│ ├── quality.json
│ ├── action-items.json
│ ├── performance.json
│ ├── maintainability.json
│ └── best-practices.json
├── iterations/ # Deep-dive iteration results
│ ├── iteration-1-finding-{uuid}.json
│ ├── iteration-2-finding-{uuid}.json
│ └── ...
├── reports/ # Human-readable reports
│ ├── security-analysis.md
│ ├── security-cli-output.txt
│ ├── architecture-analysis.md
│ ├── architecture-cli-output.txt
│ ├── ...
│ ├── deep-dive-1-{uuid}.md
│ └── deep-dive-2-{uuid}.md
├── REVIEW-SUMMARY.md # Final consolidated summary
└── dashboard.html # Interactive review dashboard
```
## Review Metadata Schema
**File**: `review-metadata.json`
```json
{
"review_id": "review-20250125-143022",
"review_type": "module|session",
"session_id": "WFS-auth-system",
"created_at": "2025-01-25T14:30:22Z",
"scope": {
"type": "module|session",
"module_scope": {
"target_pattern": "src/auth/**",
"resolved_files": [
"src/auth/service.ts",
"src/auth/validator.ts"
],
"file_count": 2
},
"session_scope": {
"commit_range": "abc123..def456",
"changed_files": [
"src/auth/service.ts",
"src/payment/processor.ts"
],
"file_count": 2
}
},
"dimensions": ["security", "architecture", "quality", "action-items", "performance", "maintainability", "best-practices"],
"max_iterations": 3,
"cli_tools": {
"primary": "gemini",
"fallback": ["qwen", "codex"]
}
}
```
## Review State Schema
**File**: `review-state.json`
```json
{
"review_id": "review-20250125-143022",
"phase": "init|parallel|aggregate|iterate|complete",
"current_iteration": 1,
"dimensions_status": {
"security": "pending|in_progress|completed|failed",
"architecture": "completed",
"quality": "in_progress",
"action-items": "pending",
"performance": "pending",
"maintainability": "pending",
"best-practices": "pending"
},
"severity_distribution": {
"critical": 2,
"high": 5,
"medium": 12,
"low": 8
},
"critical_files": [
"src/auth/service.ts",
"src/payment/processor.ts"
],
"iterations": [
{
"iteration": 1,
"findings_selected": ["uuid-1", "uuid-2", "uuid-3"],
"completed_at": "2025-01-25T15:30:00Z"
}
],
"completion_criteria": {
"critical_count": 0,
"high_count_threshold": 5,
"max_iterations": 3
},
"next_action": "execute_parallel_reviews|aggregate_findings|execute_deep_dive|generate_final_report|complete"
}
```
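The `completion_criteria` and `next_action` fields imply a small transition check after each iteration. A sketch of one possible policy (the helper name `shouldIterate` is illustrative, not part of the schema):

```javascript
// Decide whether another deep-dive iteration is warranted, based on the
// review-state.json fields above. One possible policy, for illustration.
function shouldIterate(state) {
  const { critical = 0, high = 0 } = state.severity_distribution;
  const { high_count_threshold, max_iterations } = state.completion_criteria;
  if (state.current_iteration >= max_iterations) return false; // budget spent
  if (critical > 0) return true;            // critical findings always iterate
  return high > high_count_threshold;       // too many high-severity findings
}

const state = {
  current_iteration: 1,
  severity_distribution: { critical: 2, high: 5, medium: 12, low: 8 },
  completion_criteria: { critical_count: 0, high_count_threshold: 5, max_iterations: 3 },
};
console.log(shouldIterate(state) ? "execute_deep_dive" : "generate_final_report");
// → execute_deep_dive (2 critical findings remain)
```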
## Session Integration
### Session Discovery
**review-session-cycle** (auto-discover):
```bash
# Auto-detect active session
/workflow:review-session-cycle
# Or specify session explicitly
/workflow:review-session-cycle WFS-auth-system
```
**review-module-cycle** (require session):
```bash
# Must have active session or specify one
/workflow:review-module-cycle src/auth/** --session WFS-auth-system
# Or use active session
/workflow:review-module-cycle src/auth/**
```
### Session Creation Logic
**For review-module-cycle**:
1. **Check Active Session**: Search `.workflow/active/WFS-*`
2. **If Found**: Use active session's `.review/` directory
3. **If Not Found**:
- **Option A** (Recommended): Prompt user to create session first
- **Option B**: Auto-create review-only session: `WFS-review-{pattern-hash}`
**Recommended Flow**:
```bash
# Step 1: Start session
/workflow:session:start --new "Review auth module"
# Creates: .workflow/active/WFS-review-auth-module/
# Step 2: Run review
/workflow:review-module-cycle src/auth/**
# Creates: .workflow/active/WFS-review-auth-module/.review/
```
## Command Phase 1 Requirements
### Both Commands Must:
1. **Session Discovery**:
```javascript
// Check for active session
const sessions = Glob('.workflow/active/WFS-*');
if (sessions.length === 0) {
// Prompt user to create session first
error("No active session found. Please run /workflow:session:start first");
}
const sessionId = sessions[0].match(/WFS-[^/]+/)[0];
```
2. **Create .review/ Structure**:
```javascript
const reviewDir = `.workflow/active/${sessionId}/.review/`;
// Create directory structure
Bash(`mkdir -p ${reviewDir}/dimensions`);
Bash(`mkdir -p ${reviewDir}/iterations`);
Bash(`mkdir -p ${reviewDir}/reports`);
```
3. **Initialize Metadata**:
```javascript
// Write review-metadata.json
Write(`${reviewDir}/review-metadata.json`, JSON.stringify({
review_id: `review-${timestamp}`,
review_type: "module|session",
session_id: sessionId,
created_at: new Date().toISOString(),
scope: {...},
dimensions: [...],
max_iterations: 3,
cli_tools: {...}
}));
// Write review-state.json
Write(`${reviewDir}/review-state.json`, JSON.stringify({
review_id: `review-${timestamp}`,
phase: "init",
current_iteration: 0,
dimensions_status: {},
severity_distribution: {},
critical_files: [],
iterations: [],
completion_criteria: {},
next_action: "execute_parallel_reviews"
}));
```
4. **Generate Dashboard**:
```javascript
const template = Read('~/.claude/templates/review-cycle-dashboard.html');
const dashboard = template
.replace('{{SESSION_ID}}', sessionId)
.replace('{{REVIEW_TYPE}}', reviewType)
.replace('{{REVIEW_DIR}}', reviewDir);
Write(`${reviewDir}/dashboard.html`, dashboard);
// Output to user
console.log(`📊 Review Dashboard: file://${absolutePath(reviewDir)}/dashboard.html`);
console.log(`📂 Review Output: ${reviewDir}`);
```
## Archive Strategy
### On Session Completion
When `/workflow:session:complete` is called:
1. **Preserve Review Directory**:
```javascript
// Move entire session including .review/
Bash(`mv .workflow/active/${sessionId} .workflow/archives/${sessionId}`);
```
2. **Review Archive Structure**:
```
.workflow/archives/WFS-auth-system/
├── workflow-session.json
├── IMPL_PLAN.md
├── TODO_LIST.md
├── .task/
├── .summaries/
└── .review/ # Review results preserved
├── review-metadata.json
├── REVIEW-SUMMARY.md
└── dashboard.html
```
3. **Access Archived Reviews**:
```bash
# Open archived dashboard
start .workflow/archives/WFS-auth-system/.review/dashboard.html
```
## Benefits
### 1. Unified Structure
- Same directory layout for all review types
- Consistent file naming and schemas
- Easier maintenance and tooling
### 2. Session Integration
- Review history tracked with implementation
- Easy correlation between code changes and reviews
- Simplified archiving and retrieval
### 3. Progressive Creation
- Directories created only when needed
- No upfront overhead
- Clean session initialization
### 4. Type Flexibility
- Module-based and session-based reviews in same structure
- Type indicated by metadata, not directory layout
- Easy to add new review types
### 5. Dashboard Consistency
- Same dashboard template for both types
- Unified progress tracking
- Consistent user experience
## Migration Path
### For Existing Commands
**review-session-cycle**:
1. Change output from `.workflow/.reviews/session-{id}/` to `.workflow/active/{session-id}/.review/`
2. Update Phase 1 to use session discovery
3. Add review-metadata.json creation
**review-module-cycle**:
1. Add session requirement (or auto-create)
2. Change output from `.workflow/.reviews/module-{hash}/` to `.workflow/active/{session-id}/.review/`
3. Update Phase 1 to use session discovery
4. Add review-metadata.json creation
### Backward Compatibility
**For existing standalone reviews** in `.workflow/.reviews/`:
- Keep for reference
- Document migration in README
- Provide migration script if needed
## Implementation Checklist
- [ ] Update workflow-architecture.md with .review/ structure
- [ ] Update review-session-cycle.md command specification
- [ ] Update review-module-cycle.md command specification
- [ ] Update review-cycle-dashboard.html template
- [ ] Create review-metadata.json schema validation
- [ ] Update /workflow:session:complete to preserve .review/
- [ ] Update documentation examples
- [ ] Test both review types with new structure
- [ ] Validate dashboard compatibility
- [ ] Document migration path for existing reviews

View File

@@ -1,214 +0,0 @@
# Task System Core Reference
## Overview
Task commands provide single-execution workflow capabilities with full context awareness, hierarchical organization, and agent orchestration.
## Task JSON Schema
All task files use this simplified 5-field schema:
```json
{
"id": "IMPL-1.2",
"title": "Implement JWT authentication",
"status": "pending|active|completed|blocked|container",
"meta": {
"type": "feature|bugfix|refactor|test-gen|test-fix|docs",
"agent": "@code-developer|@action-planning-agent|@test-fix-agent|@universal-executor"
},
"context": {
"requirements": ["JWT authentication", "OAuth2 support"],
"focus_paths": ["src/auth", "tests/auth", "config/auth.json"],
"acceptance": ["JWT validation works", "OAuth flow complete"],
"parent": "IMPL-1",
"depends_on": ["IMPL-1.1"],
"inherited": {
"from": "IMPL-1",
"context": ["Authentication system design completed"]
},
"shared_context": {
"auth_strategy": "JWT with refresh tokens"
}
},
"flow_control": {
"pre_analysis": [
{
"step": "gather_context",
"action": "Read dependency summaries",
"command": "bash(cat .workflow/*/summaries/IMPL-1.1-summary.md)",
"output_to": "auth_design_context",
"on_error": "skip_optional"
}
],
"implementation_approach": [
{
"step": 1,
"title": "Implement JWT authentication system",
"description": "Implement comprehensive JWT authentication system with token generation, validation, and refresh logic",
"modification_points": ["Add JWT token generation", "Implement token validation middleware", "Create refresh token logic"],
"logic_flow": ["User login request → validate credentials", "Generate JWT access and refresh tokens", "Store refresh token securely", "Return tokens to client"],
"depends_on": [],
"output": "jwt_implementation"
}
],
"target_files": [
"src/auth/login.ts:handleLogin:75-120",
"src/middleware/auth.ts:validateToken",
"src/auth/PasswordReset.ts"
]
}
}
```
## Field Structure Details
### focus_paths Field (within context)
**Purpose**: Specifies concrete project paths relevant to task implementation
**Format**:
- **Array of strings**: `["folder1", "folder2", "specific_file.ts"]`
- **Concrete paths**: Use actual directory/file names without wildcards
- **Mixed types**: Can include both directories and specific files
- **Relative paths**: From project root (e.g., `src/auth`, not `./src/auth`)
**Examples**:
```json
// Authentication system task
"focus_paths": ["src/auth", "tests/auth", "config/auth.json", "src/middleware/auth.ts"]
// UI component task
"focus_paths": ["src/components/Button", "src/styles", "tests/components"]
```
### flow_control Field Structure
**Purpose**: Universal process manager for task execution
**Components**:
- **pre_analysis**: Array of sequential process steps
- **implementation_approach**: Task execution strategy
- **target_files**: Files to modify/create - existing files in `file:function:lines` format, new files as `file` only
**Step Structure**:
```json
{
"step": "gather_context",
"action": "Human-readable description",
"command": "bash(executable command with [variables])",
"output_to": "variable_name",
"on_error": "skip_optional|fail|retry_once|manual_intervention"
}
```
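A `target_files` entry such as `src/auth/login.ts:handleLogin:75-120` can be decomposed with a small parser (an illustrative sketch; `parseTargetFile` is not part of the task system):

```javascript
// Parse "file[:function[:start-end]]" entries from target_files.
function parseTargetFile(entry) {
  const [file, fn, lines] = entry.split(":");
  const result = { file };
  if (fn) result.function = fn;
  if (lines) {
    const [start, end] = lines.split("-").map(Number);
    result.range = { start, end };
  }
  return result;
}

console.log(parseTargetFile("src/auth/login.ts:handleLogin:75-120"));
// → { file: 'src/auth/login.ts', function: 'handleLogin', range: { start: 75, end: 120 } }
console.log(parseTargetFile("src/auth/PasswordReset.ts"));
// → { file: 'src/auth/PasswordReset.ts' }  (new file: path only)
```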
## Hierarchical System
### Task Hierarchy Rules
- **Format**: IMPL-N (main), IMPL-N.M (subtasks) - uppercase required
- **Maximum Depth**: 2 levels only
- **10-Task Limit**: Hard limit enforced across all tasks
- **Container Tasks**: Parents with subtasks (not executable)
- **Leaf Tasks**: No subtasks (executable)
- **File Cohesion**: Related files must stay in same task
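The format and depth rules above reduce to a single pattern check (illustrative sketch; `isValidTaskId` is not part of the system):

```javascript
// IMPL-N (main) or IMPL-N.M (subtask): uppercase prefix, max depth 2.
function isValidTaskId(id) {
  return /^IMPL-\d+(\.\d+)?$/.test(id);
}

console.log(isValidTaskId("IMPL-1"));     // → true  (main task)
console.log(isValidTaskId("IMPL-1.2"));   // → true  (subtask)
console.log(isValidTaskId("IMPL-1.2.3")); // → false (exceeds max depth)
console.log(isValidTaskId("impl-1"));     // → false (lowercase rejected)
```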
### Task Complexity Classifications
- **Simple**: ≤5 tasks, single-level hierarchy, direct execution
- **Medium**: 6-10 tasks, two-level hierarchy, context coordination
- **Over-scope**: >10 tasks requires project re-scoping into iterations
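These thresholds amount to a simple classifier (sketch only; the name `classifyComplexity` is illustrative):

```javascript
// Map total task count to the complexity tiers defined above.
function classifyComplexity(taskCount) {
  if (taskCount > 10) return "over-scope"; // hard limit exceeded: re-scope
  if (taskCount >= 6) return "medium";     // two-level hierarchy expected
  return "simple";                         // direct execution
}

console.log(classifyComplexity(4));  // → simple
console.log(classifyComplexity(8));  // → medium
console.log(classifyComplexity(12)); // → over-scope
```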
### Complexity Assessment Rules
- **Creation**: System evaluates and assigns complexity
- **10-task limit**: Hard limit enforced - exceeding requires re-scoping
- **Execution**: Can upgrade (Simple→Medium→Over-scope), triggers re-scoping
- **Override**: Users can manually specify complexity within 10-task limit
### Status Rules
- **pending**: Ready for execution
- **active**: Currently being executed
- **completed**: Successfully finished
- **blocked**: Waiting for dependencies
- **container**: Has subtasks (parent only)
## Session Integration
### Active Session Detection
```bash
# Check for active session in sessions directory
active_session=$(find .workflow/active/ -name 'WFS-*' -type d 2>/dev/null | head -1)
```
### Workflow Context Inheritance
Tasks inherit from:
1. `workflow-session.json` - Session metadata
2. Parent task context (for subtasks)
3. `IMPL_PLAN.md` - Planning document
### File Locations
- **Task JSON**: `.workflow/active/WFS-[topic]/.task/IMPL-*.json` (uppercase required)
- **Session State**: `.workflow/active/WFS-[topic]/workflow-session.json`
- **Planning Doc**: `.workflow/active/WFS-[topic]/IMPL_PLAN.md`
- **Progress**: `.workflow/active/WFS-[topic]/TODO_LIST.md`
## Agent Mapping
### Automatic Agent Selection
- **@code-developer**: Implementation tasks, coding, test writing
- **@action-planning-agent**: Design, architecture planning
- **@test-fix-agent**: Test execution, failure diagnosis, code fixing
- **@universal-executor**: Optional manual review (only when explicitly requested)
### Agent Context Filtering
Each agent receives tailored context:
- **@code-developer**: Complete implementation details, test requirements
- **@action-planning-agent**: High-level requirements, risks, architecture
- **@test-fix-agent**: Test execution, failure diagnosis, code fixing
- **@universal-executor**: Quality standards, security considerations (when requested)
## Deprecated Fields
### Legacy paths Field
**Deprecated**: The semicolon-separated `paths` field has been replaced by `context.focus_paths` array.
**Old Format** (no longer used):
```json
"paths": "src/auth;tests/auth;config/auth.json;src/middleware/auth.ts"
```
**New Format** (use this instead):
```json
"context": {
"focus_paths": ["src/auth", "tests/auth", "config/auth.json", "src/middleware/auth.ts"]
}
```
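Migrating the legacy field is a one-line transformation (illustrative):

```javascript
// Convert a deprecated semicolon-separated paths string into focus_paths.
const legacy = "src/auth;tests/auth;config/auth.json;src/middleware/auth.ts";
const focus_paths = legacy.split(";").filter(Boolean);
console.log(focus_paths);
// → [ 'src/auth', 'tests/auth', 'config/auth.json', 'src/middleware/auth.ts' ]
```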
## Validation Rules
### Pre-execution Checks
1. Task exists and is valid JSON
2. Task status allows operation
3. Dependencies are met
4. Active workflow session exists
5. All core fields present (id, title, status, meta, context, flow_control)
6. Total task count ≤ 10 (hard limit)
7. File cohesion maintained in focus_paths
### Hierarchy Validation
- Parent-child relationships valid
- Maximum depth not exceeded
- Container tasks have subtasks
- No circular dependencies
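The "no circular dependencies" rule can be checked with a depth-first search over `depends_on` edges (an illustrative sketch; `hasCircularDeps` is not part of the validator):

```javascript
// Detect cycles in depends_on references using DFS with a visitation state:
// 1 = currently on the stack, 2 = fully explored.
function hasCircularDeps(tasks) {
  const deps = new Map(tasks.map(t => [t.id, t.context?.depends_on ?? []]));
  const state = new Map();
  const visit = (id) => {
    if (state.get(id) === 1) return true;  // back edge: cycle found
    if (state.get(id) === 2) return false; // already cleared
    state.set(id, 1);
    for (const dep of deps.get(id) ?? []) {
      if (visit(dep)) return true;
    }
    state.set(id, 2);
    return false;
  };
  return tasks.some(t => visit(t.id));
}

const cyclic = [
  { id: "IMPL-1", context: { depends_on: ["IMPL-2"] } },
  { id: "IMPL-2", context: { depends_on: ["IMPL-1"] } },
];
console.log(hasCircularDeps(cyclic)); // → true
```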
## Error Handling Patterns
### Common Errors
- **Task not found**: Check ID format and session
- **Invalid status**: Verify task can be operated on
- **Missing session**: Ensure active workflow exists
- **Max depth exceeded**: Restructure hierarchy
- **Missing implementation**: Complete required fields
### Recovery Strategies
- Session validation with clear guidance
- Automatic ID correction suggestions
- Implementation field completion prompts
- Hierarchy restructuring options

View File

@@ -1,216 +0,0 @@
# Tool Strategy - When to Use What
> **Focus**: Decision triggers and selection logic, NOT syntax (already registered with Claude)
## Quick Decision Tree
```
Need context?
├─ External/public knowledge? → Exa (docs, examples, APIs)
├─ Large codebase (>500 files)? → codex_lens
├─ Known files (<5)? → Read tool
└─ Unknown files? → smart_search → Read tool
Need to modify files?
├─ Built-in Edit fails? → mcp__ccw-tools__edit_file
└─ Still fails? → mcp__ccw-tools__write_file
Need to search?
├─ Semantic/concept search? → smart_search (mode=semantic)
├─ Exact pattern match? → Grep tool
└─ Multiple search modes needed? → smart_search (mode=auto)
```
---
## 1. Context Gathering Tools
### Exa (`mcp__exa__get_code_context_exa`)
**Use When**:
- ✅ Researching external APIs, libraries, frameworks
- ✅ Need recent documentation (post-cutoff knowledge)
- ✅ Looking for implementation examples in public repos
- ✅ Comparing architectural patterns across projects
**Don't Use When**:
- ❌ Searching internal codebase (use smart_search/codex_lens)
- ❌ Files already in working directory (use Read)
**Trigger Indicators**:
- User mentions specific library/framework names
- Questions about "best practices", "how does X work"
- Need to verify current API signatures
---
### read_file (`mcp__ccw-tools__read_file`)
**Use When**:
- ✅ Reading multiple related files at once (batch reading)
- ✅ Need directory traversal with pattern matching
- ✅ Searching file content with regex (`contentPattern`)
- ✅ Want to limit depth/file count for large directories
**Don't Use When**:
- ❌ Single file read → Use built-in Read tool (faster)
- ❌ Unknown file locations → Use smart_search first
- ❌ Need semantic search → Use smart_search or codex_lens
**Trigger Indicators**:
- Need to read "all TypeScript files in src/"
- Need to find "files containing TODO comments"
- Want to read "up to 20 config files"
**Advantages over Built-in Read**:
- Batch operation (multiple files in one call)
- Pattern-based filtering (glob + content regex)
- Directory traversal with depth control
---
### codex_lens (`mcp__ccw-tools__codex_lens`)
**Use When**:
- ✅ Large codebase (>500 files) requiring repeated searches
- ✅ Need semantic understanding of code relationships
- ✅ Working across multiple sessions (persistent index)
- ✅ Symbol-level navigation needed
**Don't Use When**:
- ❌ Small project (<100 files) → Use smart_search (no indexing overhead)
- ❌ One-time search → Use smart_search or Grep
- ❌ Files change frequently → Indexing overhead not worth it
**Trigger Indicators**:
- "Find all implementations of interface X"
- "What calls this function across the codebase?"
- Multi-session workflow on same codebase
**Action Selection**:
- `init`: First time in new codebase
- `search`: Find code patterns
- `search_files`: Find files by path/name pattern
- `symbol`: Get symbols in specific file
- `status`: Check if index exists/is stale
- `clean`: Remove stale index
---
### smart_search (`mcp__ccw-tools__smart_search`)
**Use When**:
- ✅ Don't know exact file locations
- ✅ Need concept/semantic search ("authentication logic")
- ✅ Medium-sized codebase (100-500 files)
- ✅ One-time or infrequent searches
**Don't Use When**:
- ❌ Known exact file path → Use Read directly
- ❌ Large codebase + repeated searches → Use codex_lens
- ❌ Exact pattern match → Use Grep (faster)
**Mode Selection**:
- `auto`: Let tool decide (default, safest)
- `exact`: Know exact pattern, need fast results
- `fuzzy`: Typo-tolerant file/symbol names
- `semantic`: Concept-based ("error handling", "data validation")
- `graph`: Dependency/relationship analysis
**Trigger Indicators**:
- "Find files related to user authentication"
- "Where is the payment processing logic?"
- "Locate database connection setup"
---
## 2. File Modification Tools
### edit_file (`mcp__ccw-tools__edit_file`)
**Use When**:
- ✅ Built-in Edit tool failed 1+ times
- ✅ Need dry-run preview before applying
- ✅ Need line-based operations (insert_after, insert_before)
- ✅ Need to replace all occurrences
**Don't Use When**:
- ❌ Built-in Edit hasn't failed yet → Try built-in first
- ❌ Need to create new file → Use write_file
**Trigger Indicators**:
- Built-in Edit returns "old_string not found"
- Built-in Edit fails due to whitespace/formatting
- Need to verify changes before applying (dryRun=true)
**Mode Selection**:
- `mode=update`: Replace text (similar to built-in Edit)
- `mode=line`: Line-based operations (insert_after, insert_before, delete)
---
### write_file (`mcp__ccw-tools__write_file`)
**Use When**:
- ✅ Creating brand new files
- ✅ MCP edit_file still fails (last resort)
- ✅ Need to completely replace file content
- ✅ Need backup before overwriting
**Don't Use When**:
- ❌ File exists + small change → Use Edit tools
- ❌ Built-in Edit hasn't been tried → Try built-in Edit first
**Trigger Indicators**:
- All Edit attempts failed
- Need to create new file with specific content
- User explicitly asks to "recreate file"
---
## 3. Decision Logic
### File Reading Priority
```
1. Known single file? → Built-in Read
2. Multiple files OR pattern matching? → mcp__ccw-tools__read_file
3. Unknown location? → smart_search, then Read
4. Large codebase + repeated access? → codex_lens
```
### File Editing Priority
```
1. Always try built-in Edit first
2. Fails 1+ times? → mcp__ccw-tools__edit_file
3. Still fails? → mcp__ccw-tools__write_file (last resort)
```
### Search Tool Priority
```
1. External knowledge? → Exa
2. Exact pattern in small codebase? → Built-in Grep
3. Semantic/unknown location? → smart_search
4. Large codebase + repeated searches? → codex_lens
```
---
## 4. Anti-Patterns
**Don't**:
- Use codex_lens for one-time searches in small projects
- Use smart_search when file path is already known
- Use write_file before trying Edit tools
- Use Exa for internal codebase searches
- Use read_file for single file when Read tool works
**Do**:
- Start with simplest tool (Read, Edit, Grep)
- Escalate to MCP tools when built-ins fail
- Use semantic search (smart_search) for exploratory tasks
- Use indexed search (codex_lens) for large, stable codebases
- Use Exa for external/public knowledge

View File

@@ -1,942 +0,0 @@
# Workflow Architecture
## Overview
This document defines the complete workflow system architecture using a **JSON-only data model**, **marker-based session management**, and **unified file structure** with dynamic task decomposition.
## Core Architecture
### JSON-Only Data Model
**JSON files (.task/IMPL-*.json) are the only authoritative source of task state. All markdown documents are read-only generated views.**
- **Task State**: Stored exclusively in JSON files
- **Documents**: Generated on-demand from JSON data
- **No Synchronization**: Eliminates bidirectional sync complexity
- **Performance**: Direct JSON access without parsing overhead
### Key Design Decisions
- **JSON files are the single source of truth** - All markdown documents are read-only generated views
- **Marker files for session tracking** - Ultra-simple active session management
- **Unified file structure definition** - Same structure template for all workflows, created on-demand
- **Dynamic task decomposition** - Subtasks created as needed during execution
- **On-demand file creation** - Directories and files created only when required
- **Agent-agnostic task definitions** - Complete context preserved for autonomous execution
## Session Management
### Directory-Based Session Management
**Simple Location-Based Tracking**: Sessions in `.workflow/active/` directory
```bash
.workflow/
├── active/
│ ├── WFS-oauth-integration/ # Active session directory
│ ├── WFS-user-profile/ # Active session directory
│ └── WFS-bug-fix-123/ # Active session directory
└── archives/
└── WFS-old-feature/ # Archived session (completed)
```
### Session Operations
#### Detect Active Session(s)
```bash
active_sessions=$(find .workflow/active/ -name "WFS-*" -type d 2>/dev/null)
count=$(echo "$active_sessions" | wc -l)
if [ -z "$active_sessions" ]; then
echo "No active session"
elif [ "$count" -eq 1 ]; then
session_name=$(basename "$active_sessions")
echo "Active session: $session_name"
else
echo "Multiple sessions found:"
echo "$active_sessions" | while read session_dir; do
session=$(basename "$session_dir")
echo " - $session"
done
echo "Please specify which session to work with"
fi
```
#### Archive Session
```bash
mv .workflow/active/WFS-feature .workflow/archives/WFS-feature
```
### Session State Tracking
Each session directory contains `workflow-session.json`:
```json
{
"session_id": "WFS-[topic-slug]",
"project": "feature description",
"type": "simple|medium|complex",
"current_phase": "PLAN|IMPLEMENT|REVIEW",
"status": "active|paused|completed",
"progress": {
"completed_phases": ["PLAN"],
"current_tasks": ["IMPL-1", "IMPL-2"]
}
}
```
## Task System
### Hierarchical Task Structure
**Maximum Depth**: 2 levels (IMPL-N.M format)
```
IMPL-1 # Main task
IMPL-1.1 # Subtask of IMPL-1 (dynamically created)
IMPL-1.2 # Another subtask of IMPL-1
IMPL-2 # Another main task
IMPL-2.1 # Subtask of IMPL-2 (dynamically created)
```
**Task Status Rules**:
- **Container tasks**: Parent tasks with subtasks (cannot be directly executed)
- **Leaf tasks**: Only these can be executed directly
- **Status inheritance**: Parent status derived from subtask completion
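Status inheritance can be sketched as a rollup over subtask statuses (illustrative; in the JSON itself a parent keeps status `container`, and this computes its effective progress):

```javascript
// Derive a container task's effective status from its subtasks
// (one possible rollup policy, for illustration only).
function deriveParentStatus(subtasks) {
  if (subtasks.every(t => t.status === "completed")) return "completed";
  if (subtasks.some(t => t.status === "active")) return "active";
  if (subtasks.some(t => t.status === "blocked")) return "blocked";
  return "pending";
}

console.log(deriveParentStatus([{ status: "completed" }, { status: "active" }]));
// → active
console.log(deriveParentStatus([{ status: "completed" }, { status: "completed" }]));
// → completed
```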
### Enhanced Task JSON Schema
All task files use this unified 6-field schema with optional artifacts enhancement:
```json
{
"id": "IMPL-1.2",
"title": "Implement JWT authentication",
"status": "pending|active|completed|blocked|container",
"context_package_path": ".workflow/WFS-session/.process/context-package.json",
"meta": {
"type": "feature|bugfix|refactor|test-gen|test-fix|docs",
"agent": "@code-developer|@action-planning-agent|@test-fix-agent|@universal-executor"
},
"context": {
"requirements": ["JWT authentication", "OAuth2 support"],
"focus_paths": ["src/auth", "tests/auth", "config/auth.json"],
"acceptance": ["JWT validation works", "OAuth flow complete"],
"parent": "IMPL-1",
"depends_on": ["IMPL-1.1"],
"inherited": {
"from": "IMPL-1",
"context": ["Authentication system design completed"]
},
"shared_context": {
"auth_strategy": "JWT with refresh tokens"
},
"artifacts": [
{
"type": "role_analyses",
"source": "brainstorm_clarification",
"path": ".workflow/WFS-session/.brainstorming/*/analysis*.md",
"priority": "highest",
"contains": "role_specific_requirements_and_design"
}
]
},
"flow_control": {
"pre_analysis": [
{
"step": "check_patterns",
"action": "Analyze existing patterns",
"command": "bash(rg 'auth' [focus_paths] | head -10)",
"output_to": "patterns"
},
{
"step": "analyze_architecture",
"action": "Review system architecture",
"command": "gemini \"analyze patterns: [patterns]\"",
"output_to": "design"
},
{
"step": "check_deps",
"action": "Check dependencies",
"command": "bash(echo [depends_on] | xargs cat)",
"output_to": "context"
}
],
"implementation_approach": [
{
"step": 1,
"title": "Set up authentication infrastructure",
"description": "Install JWT library and create auth config following [design] patterns from [parent]",
"modification_points": [
"Add JWT library dependencies to package.json",
"Create auth configuration file using [parent] patterns"
],
"logic_flow": [
"Install jsonwebtoken library via npm",
"Configure JWT secret and expiration from [inherited]",
"Export auth config for use by [jwt_generator]"
],
"depends_on": [],
"output": "auth_config"
},
{
"step": 2,
"title": "Implement JWT generation",
"description": "Create JWT token generation logic using [auth_config] and [inherited] validation patterns",
"modification_points": [
"Add JWT generation function in auth service",
"Implement token signing with [auth_config]"
],
"logic_flow": [
"User login → validate credentials with [inherited]",
"Generate JWT payload with user data",
"Sign JWT using secret from [auth_config]",
"Return signed token"
],
"depends_on": [1],
"output": "jwt_generator"
},
{
"step": 3,
"title": "Implement JWT validation middleware",
"description": "Create middleware to validate JWT tokens using [auth_config] and [shared] rules",
"modification_points": [
"Create validation middleware using [jwt_generator]",
"Add token verification using [shared] rules",
"Implement user attachment to request object"
],
"logic_flow": [
"Protected route → extract JWT from Authorization header",
"Validate token signature using [auth_config]",
"Check token expiration and [shared] rules",
"Decode payload and attach user to request",
"Call next() or return 401 error"
],
"command": "bash(npm test -- middleware.test.ts)",
"depends_on": [1, 2],
"output": "auth_middleware"
}
],
"target_files": [
"src/auth/login.ts:handleLogin:75-120",
"src/middleware/auth.ts:validateToken",
"src/auth/PasswordReset.ts"
]
}
}
```
### Focus Paths & Context Management
#### Context Package Path (Top-Level Field)
The **context_package_path** field provides the location of the smart context package:
- **Location**: Top-level field (not in `artifacts` array)
- **Path**: `.workflow/WFS-session/.process/context-package.json`
- **Purpose**: References the comprehensive context package containing project structure, dependencies, and brainstorming artifacts catalog
- **Usage**: Loaded in `pre_analysis` steps via `Read({{context_package_path}})`
#### Focus Paths Format
The **focus_paths** field specifies concrete project paths for task implementation:
- **Array of strings**: `["folder1", "folder2", "specific_file.ts"]`
- **Concrete paths**: Use actual directory/file names without wildcards
- **Mixed types**: Can include both directories and specific files
- **Relative paths**: From project root (e.g., `src/auth`, not `./src/auth`)
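These constraints are easy to check mechanically. A minimal sketch — the function name and exact messages are illustrative, not part of the schema:

```python
def validate_focus_paths(paths):
    """Flag focus_paths entries that violate the rules above."""
    errors = []
    for p in paths:
        # Concrete paths only: wildcards are not allowed
        if any(ch in p for ch in "*?["):
            errors.append(f"wildcard not allowed: {p}")
        # Paths are relative to the project root, without a leading ./
        if p.startswith(("./", "/")):
            errors.append(f"use a project-root-relative path: {p}")
    return errors
```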
#### Artifacts Field ⚠️ NEW FIELD
Optional field referencing brainstorming outputs for task execution:
```json
"artifacts": [
{
"type": "role_analyses|topic_framework|individual_role_analysis",
"source": "brainstorm_clarification|brainstorm_framework|brainstorm_roles",
"path": ".workflow/WFS-session/.brainstorming/document.md",
"priority": "highest|high|medium|low"
}
]
```
**Types & Priority**: role_analyses (highest) → topic_framework (medium) → individual_role_analysis (low)
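When several artifacts are attached, they can be consumed highest-priority first. A sketch of that ordering (names are illustrative):

```python
# Lower rank = consumed earlier
PRIORITY_ORDER = {"highest": 0, "high": 1, "medium": 2, "low": 3}

def sort_artifacts(artifacts):
    """Return artifacts ordered highest-priority first."""
    return sorted(artifacts, key=lambda a: PRIORITY_ORDER[a["priority"]])
```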
#### Flow Control Configuration
The **flow_control** field manages task execution through structured sequential steps. For complete format specifications and usage guidelines, see [Flow Control Format Guide](#flow-control-format-guide) below.
**Quick Reference**:
- **pre_analysis**: Context gathering steps (supports multiple command types)
- **implementation_approach**: Implementation steps array with dependency management
- **target_files**: Target files for modification (file:function:lines format)
- **Variable references**: Use `[variable_name]` to reference step outputs
- **Tool integration**: Supports Gemini, Codex, Bash commands, and MCP tools
## Flow Control Format Guide
The `[FLOW_CONTROL]` marker indicates that a task or prompt contains flow control steps for sequential execution. There are **two distinct formats** used in different scenarios:
### Format Comparison Matrix
| Aspect | Inline Format | JSON Format |
|--------|--------------|-------------|
| **Used In** | Brainstorm workflows | Implementation tasks |
| **Agent** | conceptual-planning-agent | code-developer, test-fix-agent, doc-generator |
| **Location** | Task() prompt (markdown) | .task/IMPL-*.json file |
| **Persistence** | Temporary (prompt-only) | Persistent (file storage) |
| **Complexity** | Simple (3-5 steps) | Complex (10+ steps) |
| **Dependencies** | None | Full `depends_on` support |
| **Purpose** | Load brainstorming context | Implement task with preparation |
### Inline Format (Brainstorm)
**Marker**: `[FLOW_CONTROL]` written directly in Task() prompt
**Structure**: Markdown list format
**Used By**: Brainstorm commands (`auto-parallel.md`, role commands)
**Agent**: `conceptual-planning-agent`
**Example**:
```markdown
[FLOW_CONTROL]
### Flow Control Steps
**AGENT RESPONSIBILITY**: Execute these pre_analysis steps sequentially with context accumulation:
1. **load_topic_framework**
- Action: Load structured topic discussion framework
- Command: Read(.workflow/WFS-{session}/.brainstorming/guidance-specification.md)
- Output: topic_framework
2. **load_role_template**
- Action: Load role-specific planning template
- Command: bash($(cat "~/.ccw/workflows/cli-templates/planning-roles/{role}.md"))
- Output: role_template
3. **load_session_metadata**
- Action: Load session metadata and topic description
- Command: bash(cat .workflow/WFS-{session}/workflow-session.json 2>/dev/null || echo '{}')
- Output: session_metadata
```
**Characteristics**:
- 3-5 simple context loading steps
- Written directly in prompt (not persistent)
- No dependency management between steps
- Used for temporary context preparation
- Variables: `[variable_name]` for output references
### JSON Format (Implementation)
**Marker**: `[FLOW_CONTROL]` used in TodoWrite or documentation to indicate task has flow control
**Structure**: Complete JSON structure in task file
**Used By**: Implementation tasks (IMPL-*.json)
**Agents**: `code-developer`, `test-fix-agent`, `doc-generator`
**Example**:
```json
"flow_control": {
"pre_analysis": [
{
"step": "load_role_analyses",
"action": "Load role analysis documents from brainstorming",
"commands": [
"bash(ls .workflow/WFS-{session}/.brainstorming/*/analysis*.md 2>/dev/null || echo 'not found')",
"Glob(.workflow/WFS-{session}/.brainstorming/*/analysis*.md)",
"Read(each discovered role analysis file)"
],
"output_to": "role_analyses",
"on_error": "skip_optional"
},
{
"step": "local_codebase_exploration",
"action": "Explore codebase using local search",
"commands": [
"bash(rg '^(function|class|interface).*auth' --type ts -n --max-count 15)",
"bash(find . -name '*auth*' -type f | grep -v node_modules | head -10)"
],
"output_to": "codebase_structure"
}
],
"implementation_approach": [
{
"step": 1,
"title": "Setup infrastructure",
"description": "Install JWT library and create config following [role_analyses]",
"modification_points": [
"Add JWT library dependencies to package.json",
"Create auth configuration file"
],
"logic_flow": [
"Install jsonwebtoken library via npm",
"Configure JWT secret from [role_analyses]",
"Export auth config for use by [jwt_generator]"
],
"depends_on": [],
"output": "auth_config"
},
{
"step": 2,
"title": "Implement JWT generation",
"description": "Create JWT token generation logic using [auth_config]",
"modification_points": [
"Add JWT generation function in auth service",
"Implement token signing with [auth_config]"
],
"logic_flow": [
"User login → validate credentials",
"Generate JWT payload with user data",
"Sign JWT using secret from [auth_config]",
"Return signed token"
],
"depends_on": [1],
"output": "jwt_generator"
}
],
"target_files": [
"src/auth/login.ts:handleLogin:75-120",
"src/middleware/auth.ts:validateToken"
]
}
```
**Characteristics**:
- Persistent storage in .task/IMPL-*.json files
- Complete dependency management (`depends_on` arrays)
- Two-phase structure: `pre_analysis` + `implementation_approach`
- Error handling strategies (`on_error` field)
- Target file specifications
- Variables: `[variable_name]` for cross-step references
### JSON Format Field Specifications
#### pre_analysis Field
**Purpose**: Context gathering phase before implementation
**Structure**: Array of step objects with sequential execution
**Step Fields**:
- **step**: Step identifier (string, e.g., "load_role_analyses")
- **action**: Human-readable description of the step
- **command** or **commands**: Single command string or array of command strings
- **output_to**: Variable name for storing step output
- **on_error**: Error handling strategy (`skip_optional`, `fail`, `retry_once`, `manual_intervention`)
**Command Types Supported**:
- **Bash commands**: `bash(command)` - Any shell command
- **Tool calls**: `Read(file)`, `Glob(pattern)`, `Grep(pattern)`
- **MCP tools**: `mcp__exa__get_code_context_exa()`, `mcp__exa__web_search_exa()`
- **CLI commands**: `gemini`, `qwen`, `codex --full-auto exec`
**Example**:
```json
{
"step": "load_context",
"action": "Load project context and patterns",
"commands": [
"bash(ccw tool exec get_modules_by_depth '{}')",
"Read(CLAUDE.md)"
],
"output_to": "project_structure",
"on_error": "skip_optional"
}
```
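A runner for this contract reduces to a sequential loop that stores each step's output under its `output_to` name. The sketch below stubs command dispatch behind an `execute` callable supplied by the host — both names are illustrative:

```python
def run_pre_analysis(steps, execute):
    """Run pre_analysis steps in order, accumulating outputs in a context dict.

    `execute` is a callable (command, context) -> str; dispatching
    bash()/Read()/MCP commands is the host's responsibility.
    """
    context = {}
    for step in steps:
        # A step carries either a single "command" or a "commands" array
        commands = step.get("commands") or [step["command"]]
        outputs = []
        for cmd in commands:
            try:
                outputs.append(execute(cmd, context))
            except Exception:
                if step.get("on_error") == "skip_optional":
                    break  # optional step: keep going with what we have
                raise
        context[step["output_to"]] = "\n".join(outputs)
    return context
```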
#### implementation_approach Field
**Purpose**: Define implementation steps with dependency management
**Structure**: Array of step objects (NOT object format)
**Step Fields (All Required)**:
- **step**: Unique step number (1, 2, 3, ...) - serves as step identifier
- **title**: Brief step title
- **description**: Comprehensive implementation description with context variable references
- **modification_points**: Array of specific code modification targets
- **logic_flow**: Array describing business logic execution sequence
- **depends_on**: Array of step numbers this step depends on (e.g., `[1]`, `[1, 2]`) - empty array `[]` for independent steps
- **output**: Output variable name that can be referenced by subsequent steps via `[output_name]`
**Optional Fields**:
- **command**: Command for step execution (supports any shell command or CLI tool)
- When omitted: Agent interprets modification_points and logic_flow to execute
- When specified: Command executes the step directly
**Execution Modes**:
- **Default (without command)**: Agent executes based on modification_points and logic_flow
- **With command**: Specified command handles execution
**Command Field Usage**:
- **Default approach**: Omit command field - let agent execute autonomously
- **CLI tools (codex/gemini/qwen)**: Add ONLY when user explicitly requests CLI tool usage
- **Simple commands**: Can include bash commands, test commands, validation scripts
- **Complex workflows**: Use command for multi-step operations or tool coordination
**Command Format Examples** (only when explicitly needed):
```json
// Simple Bash
"command": "bash(npm install package)"
"command": "bash(npm test)"
// Validation
"command": "bash(test -f config.ts && grep -q 'JWT_SECRET' config.ts)"
// Codex (user requested)
"command": "codex -C path --full-auto exec \"task\" --skip-git-repo-check -s danger-full-access"
// Codex Resume (user requested, maintains context)
"command": "codex --full-auto exec \"task\" resume --last --skip-git-repo-check -s danger-full-access"
// Gemini (user requested)
"command": "gemini \"analyze [context]\""
// Qwen (fallback for Gemini)
"command": "qwen \"analyze [context]\""
```
**Example Step**:
```json
{
"step": 2,
"title": "Implement JWT generation",
"description": "Create JWT token generation logic using [auth_config]",
"modification_points": [
"Add JWT generation function in auth service",
"Implement token signing with [auth_config]"
],
"logic_flow": [
"User login → validate credentials",
"Generate JWT payload with user data",
"Sign JWT using secret from [auth_config]",
"Return signed token"
],
"depends_on": [1],
"output": "jwt_generator"
}
```
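Because `depends_on` is explicit, a valid execution order for the step array can be derived mechanically. A minimal sketch, including cycle detection (the function name is illustrative):

```python
def execution_order(steps):
    """Order steps so each runs only after all of its depends_on."""
    done, order = set(), []
    pending = {s["step"]: s for s in steps}
    while pending:
        ready = [n for n, s in pending.items()
                 if all(d in done for d in s["depends_on"])]
        if not ready:
            raise ValueError("circular or missing step dependency")
        for n in sorted(ready):  # prefer lower step numbers when independent
            order.append(pending.pop(n))
            done.add(n)
    return order
```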
#### target_files Field
**Purpose**: Specify files to be modified or created
**Format**: Array of strings
- **Existing files**: `"file:function:lines"` (e.g., `"src/auth/login.ts:handleLogin:75-120"`)
- **New files**: `"path/to/NewFile.ts"` (file path only)
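Both entry forms can be split with the same parser, since the function and line-range parts are simply absent for new files. A sketch (name is illustrative):

```python
def parse_target(entry):
    """Split a target_files entry into (path, function, line_range).

    Missing parts come back as None, so new-file entries parse too.
    """
    path, _, rest = entry.partition(":")
    func, _, lines = rest.partition(":")
    return path, func or None, lines or None
```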
### Tool Reference
**Available Command Types**:
**Gemini CLI**:
```bash
gemini "prompt"
gemini --approval-mode yolo "prompt" # For write mode
```
**Qwen CLI** (Gemini fallback):
```bash
qwen "prompt"
qwen --approval-mode yolo "prompt" # For write mode
```
**Codex CLI**:
```bash
codex -C directory --full-auto exec "task" --skip-git-repo-check -s danger-full-access
codex --full-auto exec "task" resume --last --skip-git-repo-check -s danger-full-access
```
**Built-in Tools**:
- `Read(file_path)` - Read file contents
- `Glob(pattern)` - Find files by pattern
- `Grep(pattern)` - Search content with regex
- `bash(command)` - Execute bash command
**MCP Tools**:
- `mcp__exa__get_code_context_exa(query="...")` - Get code context from Exa
- `mcp__exa__web_search_exa(query="...")` - Web search via Exa
**Bash Commands**:
```bash
bash(rg 'pattern' src/)
bash(find . -name "*.ts")
bash(npm test)
bash(git log --oneline | head -5)
```
### Variable System & Context Flow
**Variable Reference Syntax**:
Both formats use `[variable_name]` syntax for referencing outputs from previous steps.
**Variable Types**:
- **Step outputs**: `[step_output_name]` - Reference any pre_analysis step output
- **Task properties**: `[task_property]` - Reference any task context field
- **Previous results**: `[analysis_result]` - Reference accumulated context
- **Implementation outputs**: Reference outputs from previous implementation steps
**Examples**:
```json
// Reference pre_analysis output
"description": "Install JWT library following [role_analyses]"
// Reference previous step output
"description": "Create middleware using [auth_config] and [jwt_generator]"
// Reference task context
"command": "bash(cd [focus_paths] && npm test)"
```
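The `[variable_name]` syntax is a plain textual substitution over the accumulated context. A sketch that leaves unknown markers intact rather than failing (the function name is illustrative):

```python
import re

def substitute(text, context):
    """Replace [name] markers with context values; keep unknown markers as-is."""
    return re.sub(
        r"\[(\w+)\]",
        lambda m: str(context.get(m.group(1), m.group(0))),
        text,
    )
```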
**Context Accumulation Process**:
1. **Structure Analysis**: `get_modules_by_depth.sh` → project hierarchy
2. **Pattern Analysis**: Tool-specific commands → existing patterns
3. **Dependency Mapping**: Previous task summaries → inheritance context
4. **Task Context Generation**: Combined analysis → task.context fields
**Context Inheritance Rules**:
- **Parent → Child**: Container tasks pass context via `context.inherited`
- **Dependency → Dependent**: Previous task summaries via `context.depends_on`
- **Session → Task**: Global session context included in all tasks
- **Module → Feature**: Module patterns inform feature implementation
### Agent Processing Rules
**conceptual-planning-agent** (Inline Format):
- Parses markdown list from prompt
- Executes 3-5 simple loading steps
- No dependency resolution needed
- Accumulates context in variables
- Used only in brainstorm workflows
**code-developer, test-fix-agent** (JSON Format):
- Loads complete task JSON from file
- Executes `pre_analysis` steps sequentially
- Processes `implementation_approach` with dependency resolution
- Handles complex variable substitution
- Updates task status in JSON file
### Usage Guidelines
**Use Inline Format When**:
- Running brainstorm workflows
- Need 3-5 simple context loading steps
- No persistence required
- No dependencies between steps
- Temporary context preparation
**Use JSON Format When**:
- Implementing features or tasks
- Need 10+ complex execution steps
- Require dependency management
- Need persistent task definitions
- Complex variable flow between steps
- Error handling strategies needed
### Variable Reference Syntax
Both formats use `[variable_name]` syntax for referencing outputs:
**Inline Format**:
```markdown
2. **analyze_context**
- Action: Analyze using [topic_framework] and [role_template]
- Output: analysis_results
```
**JSON Format**:
```json
{
"step": 2,
"description": "Implement following [role_analyses] and [codebase_structure]",
"depends_on": [1],
"output": "implementation"
}
```
### Task Validation Rules
1. **ID Uniqueness**: All task IDs must be unique
2. **Hierarchical Format**: Must follow IMPL-N[.M] pattern (maximum 2 levels)
3. **Parent References**: All parent IDs must exist as JSON files
4. **Status Consistency**: Status values from defined enumeration
5. **Required Fields**: All 6 core fields must be present (id, title, status, meta, context, flow_control)
6. **Focus Paths Structure**: context.focus_paths must contain concrete paths (no wildcards)
7. **Flow Control Format**: pre_analysis must be array with required fields
8. **Dependency Integrity**: All task-level depends_on references must exist as JSON files
9. **Artifacts Structure**: context.artifacts (optional) must use valid type, priority, and path format
10. **Implementation Steps Array**: implementation_approach must be array of step objects
11. **Step Number Uniqueness**: All step numbers within a task must be unique and sequential (1, 2, 3, ...)
12. **Step Dependencies**: All step-level depends_on numbers must reference valid steps within same task
13. **Step Sequence**: Step numbers should match array order (first item step=1, second item step=2, etc.)
14. **Step Required Fields**: Each step must have step, title, description, modification_points, logic_flow, depends_on, output
15. **Step Optional Fields**: command field is optional - when omitted, agent executes based on modification_points and logic_flow
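Several of these rules can be expressed directly in code. The sketch below checks only required fields (rule 5), step sequencing (rules 11/13), and step dependencies (rule 12); it is not an exhaustive validator, and the names are illustrative:

```python
def validate_task(task):
    """Return a list of rule violations found in one task JSON object."""
    problems = []
    for field in ("id", "title", "status", "meta", "context", "flow_control"):
        if field not in task:
            problems.append(f"missing field: {field}")
    steps = task.get("flow_control", {}).get("implementation_approach", [])
    for i, step in enumerate(steps, start=1):
        # Step numbers must be sequential and match array order
        if step.get("step") != i:
            problems.append(f"step {step.get('step')} out of sequence (expected {i})")
        # Step-level depends_on must reference steps in the same task
        for dep in step.get("depends_on", []):
            if not any(s.get("step") == dep for s in steps):
                problems.append(f"step {step.get('step')} depends on unknown step {dep}")
    return problems
```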
## Workflow Structure
### Unified File Structure
All workflows use the same file structure definition regardless of complexity. **Directories and files are created on-demand as needed**, not all at once during initialization.
#### Complete Structure Reference
```
.workflow/
├── [.scratchpad/] # Non-session-specific outputs (created when needed)
│ ├── analyze-*-[timestamp].md # One-off analysis results
│ ├── chat-*-[timestamp].md # Standalone chat sessions
│ ├── plan-*-[timestamp].md # Ad-hoc planning notes
│ ├── bug-index-*-[timestamp].md # Quick bug analyses
│ ├── code-analysis-*-[timestamp].md # Standalone code analysis
│ ├── execute-*-[timestamp].md # Ad-hoc implementation logs
│ └── codex-execute-*-[timestamp].md # Multi-stage execution logs
├── [design-run-*/] # Standalone UI design outputs (created when needed)
│ └── (timestamped)/ # Timestamped design runs without session
│ ├── .intermediates/ # Intermediate analysis files
│ │ ├── style-analysis/ # Style analysis data
│ │ │ ├── computed-styles.json # Extracted CSS values
│ │ │ └── design-space-analysis.json # Design directions
│ │ └── layout-analysis/ # Layout analysis data
│ │ ├── dom-structure-{target}.json # DOM extraction
│ │ └── inspirations/ # Layout research
│ │ └── {target}-layout-ideas.txt
│ ├── style-extraction/ # Final design systems
│ │ ├── style-1/ # design-tokens.json, style-guide.md
│ │ └── style-N/
│ ├── layout-extraction/ # Layout templates
│ │ └── layout-templates.json
│ ├── prototypes/ # Generated HTML/CSS prototypes
│ │ ├── {target}-style-{s}-layout-{l}.html # Final prototypes
│ │ ├── compare.html # Interactive matrix view
│ │ └── index.html # Navigation page
│ └── .run-metadata.json # Run configuration
├── active/ # Active workflow sessions
│ └── WFS-[topic-slug]/
│ ├── workflow-session.json # Session metadata and state (REQUIRED)
│ ├── [.brainstorming/] # Optional brainstorming phase (created when needed)
│ ├── [.chat/] # CLI interaction sessions (created when analysis is run)
│ │ ├── chat-*.md # Saved chat sessions
│ │ └── analysis-*.md # Analysis results
│ ├── [.process/] # Planning analysis results (created by /workflow-plan)
│ │ └── ANALYSIS_RESULTS.md # Analysis results and planning artifacts
│ ├── IMPL_PLAN.md # Planning document (REQUIRED)
│ ├── TODO_LIST.md # Progress tracking (REQUIRED)
│ ├── [.summaries/] # Task completion summaries (created when tasks complete)
│ │ ├── IMPL-*-summary.md # Main task summaries
│ │ └── IMPL-*.*-summary.md # Subtask summaries
│ ├── [.review/] # Code review results (created by review commands)
│ │ ├── review-metadata.json # Review configuration and scope
│ │ ├── review-state.json # Review state machine
│ │ ├── review-progress.json # Real-time progress tracking
│ │ ├── dimensions/ # Per-dimension analysis results
│ │ ├── iterations/ # Deep-dive iteration results
│ │ ├── reports/ # Human-readable reports and CLI outputs
│ │ ├── REVIEW-SUMMARY.md # Final consolidated summary
│ │ └── dashboard.html # Interactive review dashboard
│ ├── [design-*/] # UI design outputs (created by ui-design workflows)
│ │ ├── .intermediates/ # Intermediate analysis files
│ │ │ ├── style-analysis/ # Style analysis data
│ │ │ │ ├── computed-styles.json # Extracted CSS values
│ │ │ │ └── design-space-analysis.json # Design directions
│ │ │ └── layout-analysis/ # Layout analysis data
│ │ │ ├── dom-structure-{target}.json # DOM extraction
│ │ │ └── inspirations/ # Layout research
│ │ │ └── {target}-layout-ideas.txt
│ │ ├── style-extraction/ # Final design systems
│ │ │ ├── style-1/ # design-tokens.json, style-guide.md
│ │ │ └── style-N/
│ │ ├── layout-extraction/ # Layout templates
│ │ │ └── layout-templates.json
│ │ ├── prototypes/ # Generated HTML/CSS prototypes
│ │ │ ├── {target}-style-{s}-layout-{l}.html # Final prototypes
│ │ │ ├── compare.html # Interactive matrix view
│ │ │ └── index.html # Navigation page
│ │ └── .run-metadata.json # Run configuration
│ └── .task/ # Task definitions (REQUIRED)
│ ├── IMPL-*.json # Main task definitions
│ └── IMPL-*.*.json # Subtask definitions (created dynamically)
└── archives/ # Completed workflow sessions
└── WFS-[completed-topic]/ # Archived session directories
```
#### Creation Strategy
- **Initial Setup**: Create only `workflow-session.json`, `IMPL_PLAN.md`, `TODO_LIST.md`, and `.task/` directory
- **On-Demand Creation**: Other directories created when first needed
- **Dynamic Files**: Subtask JSON files created during task decomposition
- **Scratchpad Usage**: `.scratchpad/` created when CLI commands run without active session
- **Design Usage**: `design-{timestamp}/` created by UI design workflows in `.workflow/` directly for standalone design runs
- **Review Usage**: `.review/` created by review commands (`/workflow:review-module-cycle`, `/workflow:review-session-cycle`) for comprehensive code quality analysis
- **Intermediate Files**: `.intermediates/` contains analysis data (style/layout) separate from final deliverables
- **Layout Templates**: `layout-extraction/layout-templates.json` contains structural templates for UI assembly
#### Scratchpad Directory (.scratchpad/)
**Purpose**: Centralized location for non-session-specific CLI outputs
**When to Use**:
1. **No Active Session**: CLI analysis/chat commands run without an active workflow session
2. **Unrelated Analysis**: Quick analysis not related to current active session
3. **Exploratory Work**: Ad-hoc investigation before creating formal workflow
4. **One-Off Queries**: Standalone questions or debugging without workflow context
**Output Routing Logic**:
- **IF** active session exists in `.workflow/active/` AND command is session-relevant:
- Save to `.workflow/active/WFS-[id]/.chat/[command]-[timestamp].md`
- **ELSE** (no session OR one-off analysis):
- Save to `.workflow/.scratchpad/[command]-[description]-[timestamp].md`
**File Naming Pattern**: `[command-type]-[brief-description]-[timestamp].md`
**Examples**:
*Workflow Commands (lightweight):*
- `/workflow-lite-plan "feature idea"` (exploratory) → `.scratchpad/lite-plan-feature-idea-20250105-143110.md`
- `/workflow:lite-fix "bug description"` (bug fixing) → `.scratchpad/lite-fix-bug-20250105-143130.md`
> **Note**: Direct CLI commands (`/cli:analyze`, `/cli:execute`, etc.) have been replaced by semantic invocation and workflow commands.
**Maintenance**:
- Periodically review and clean up old scratchpad files
- Promote useful analyses to formal workflow sessions if needed
- No automatic cleanup - manual management recommended
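The routing logic and naming pattern above fit in a single function. A sketch — the signature and parameter names are illustrative:

```python
from datetime import datetime
from pathlib import Path

def output_path(command, description, active_session=None, session_relevant=False):
    """Route a CLI output file per the IF/ELSE rule above."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    if active_session and session_relevant:
        # Session-relevant output goes into the active session's .chat/
        return Path(f".workflow/active/{active_session}/.chat/{command}-{stamp}.md")
    # No session, or a one-off analysis: route to the scratchpad
    return Path(f".workflow/.scratchpad/{command}-{description}-{stamp}.md")
```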
### File Naming Conventions
#### Session Identifiers
**Format**: `WFS-[topic-slug]`
**WFS Prefix Meaning**:
- `WFS` = **W**ork**F**low **S**ession
- Identifies directories as workflow session containers
- Distinguishes workflow sessions from other project directories
**Naming Rules**:
- Convert topic to lowercase with hyphens (e.g., "User Auth System" → `WFS-user-auth-system`)
- Add `-NNN` suffix only if conflicts exist (e.g., `WFS-payment-integration-002`)
- Maximum length: 50 characters including WFS- prefix
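These rules amount to a short slug transform. A sketch (the conflict-suffix rule is left out; the function name is illustrative):

```python
import re

def session_id(topic):
    """Derive a WFS-[topic-slug] identifier from a free-form topic string."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"WFS-{slug}"[:50]  # cap at 50 chars including the prefix
```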
#### Document Naming
- `workflow-session.json` - Session state (required)
- `IMPL_PLAN.md` - Planning document (required)
- `TODO_LIST.md` - Progress tracking (auto-generated when needed)
- Chat sessions: `chat-analysis-*.md`
- Task summaries: `IMPL-[task-id]-summary.md`
### Document Templates
#### TODO_LIST.md Template
```markdown
# Tasks: [Session Topic]
## Task Progress
**IMPL-001**: [Main Task Group] → [📋](./.task/IMPL-001.json)
- [ ] **IMPL-001.1**: [Subtask] → [📋](./.task/IMPL-001.1.json)
- [x] **IMPL-001.2**: [Subtask] → [📋](./.task/IMPL-001.2.json) | [](./.summaries/IMPL-001.2-summary.md)
- [x] **IMPL-002**: [Simple Task] → [📋](./.task/IMPL-002.json) | [](./.summaries/IMPL-002-summary.md)
## Status Legend
- `▸` = Container task (has subtasks)
- `- [ ]` = Pending leaf task
- `- [x]` = Completed leaf task
- Maximum 2 levels: Main tasks and subtasks only
```
## Operations Guide
### Session Management
```bash
# Create minimal required structure
mkdir -p .workflow/active/WFS-topic-slug/.task
echo '{"session_id":"WFS-topic-slug",...}' > .workflow/active/WFS-topic-slug/workflow-session.json
echo '# Implementation Plan' > .workflow/active/WFS-topic-slug/IMPL_PLAN.md
echo '# Tasks' > .workflow/active/WFS-topic-slug/TODO_LIST.md
```
### Task Operations
```bash
# Create task
echo '{"id":"IMPL-1","title":"New task",...}' > .task/IMPL-1.json
# Update task status
jq '.status = "active"' .task/IMPL-1.json > temp && mv temp .task/IMPL-1.json
# Generate TODO list from JSON state
generate_todo_list_from_json .task/
```
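`generate_todo_list_from_json` is not specified here; the sketch below shows what such a renderer might emit, operating on already-loaded task objects and omitting the summary-link column (all names are illustrative):

```python
def render_todo_lines(tasks):
    """Render TODO_LIST.md checklist lines from task JSON objects."""
    lines = []
    for task in sorted(tasks, key=lambda t: t["id"]):
        mark = "x" if task["status"] == "completed" else " "
        link = f"[📋](./.task/{task['id']}.json)"
        lines.append(f"- [{mark}] **{task['id']}**: {task['title']} → {link}")
    return "\n".join(lines)
```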
### Directory Creation (On-Demand)
```bash
mkdir -p .brainstorming # When brainstorming is initiated
mkdir -p .chat # When analysis commands are run
mkdir -p .summaries # When first task completes
```
### Session Consistency Checks & Recovery
```bash
# Validate session directory structure
if [ -d ".workflow/active/" ]; then
for session_dir in .workflow/active/WFS-*; do
if [ ! -f "$session_dir/workflow-session.json" ]; then
echo "⚠️ Missing workflow-session.json in $session_dir"
fi
done
fi
```
**Recovery Strategies**:
- **Missing Session File**: Recreate workflow-session.json from template
- **Corrupted Session File**: Restore from template with basic metadata
- **Broken Task Hierarchy**: Reconstruct parent-child relationships from task JSON files
- **Orphaned Sessions**: Move incomplete sessions to archives/
## Complexity Classification
### Task Complexity Rules
**Complexity is determined by task count and decomposition needs:**
| Complexity | Task Count | Hierarchy Depth | Decomposition Behavior |
|------------|------------|----------------|----------------------|
| **Simple** | <5 tasks | 1 level (IMPL-N) | Direct execution, minimal decomposition |
| **Medium** | 5-15 tasks | 2 levels (IMPL-N.M) | Moderate decomposition, context coordination |
| **Complex** | >15 tasks | 2 levels (IMPL-N.M) | Frequent decomposition, multi-agent orchestration |
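The task-count thresholds in the table map directly to a classifier. A sketch (the function name is illustrative):

```python
def classify(task_count):
    """Map a task count to the complexity tier defined in the table above."""
    if task_count < 5:
        return "Simple"
    if task_count <= 15:
        return "Medium"
    return "Complex"
```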
### Workflow Characteristics & Tool Guidance
#### Simple Workflows
- **Examples**: Bug fixes, small feature additions, configuration changes
- **Task Decomposition**: Usually single-level tasks, minimal breakdown needed
- **Agent Coordination**: Direct execution without complex orchestration
- **Tool Strategy**: `bash()` commands, `Grep()` for pattern matching
#### Medium Workflows
- **Examples**: New features, API endpoints with integration, database schema changes
- **Task Decomposition**: Two-level hierarchy when decomposition is needed
- **Agent Coordination**: Context coordination between related tasks
- **Tool Strategy**: `gemini` for pattern analysis, `codex --full-auto` for implementation
#### Complex Workflows
- **Examples**: Major features, architecture refactoring, security implementations, multi-service deployments
- **Task Decomposition**: Frequent use of two-level hierarchy with dynamic subtask creation
- **Agent Coordination**: Multi-agent orchestration with deep context analysis
- **Tool Strategy**: `gemini` for architecture analysis, `codex --full-auto` for complex problem solving, `bash()` commands for flexible analysis
### Assessment & Upgrades
- **During Creation**: System evaluates requirements and assigns complexity
- **During Execution**: Can upgrade (Simple→Medium→Complex) but never downgrade
- **Override Allowed**: Users can specify higher complexity manually
## Agent Integration
### Agent Assignment
Based on task type and title keywords:
- **Planning tasks** → @action-planning-agent
- **Implementation** → @code-developer (code + tests)
- **Test execution/fixing** → @test-fix-agent
- **Review** → @universal-executor (optional, only when explicitly requested)
### Execution Context
Agents receive complete task JSON plus workflow context:
```json
{
"task": { /* complete task JSON */ },
"workflow": {
"session": "WFS-user-auth",
"phase": "IMPLEMENT"
}
}
```
.gitignore
@@ -143,3 +143,21 @@ ccw/.tmp-ccw-auth-home/
docs/node_modules/
docs/.vitepress/dist/
docs/.vitepress/cache/
codex-lens/.cache/huggingface/hub/models--Xenova--ms-marco-MiniLM-L-6-v2/refs/main
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/.gitattributes
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/config.json
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/quantize_config.json
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/README.md
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/special_tokens_map.json
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/tokenizer_config.json
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/tokenizer.json
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/vocab.txt
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_bnb4.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_fp16.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_int8.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_q4.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_q4f16.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_quantized.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_uint8.onnx
codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model.onnx
codex-lens/data/registry.db
@@ -0,0 +1,5 @@
import sys
import os
# Ensure the local src directory takes precedence over any installed codexlens package
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "src"))
@@ -0,0 +1,36 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "codex-lens-v2"
version = "0.1.0"
description = "Minimal code semantic search library with 2-stage pipeline"
requires-python = ">=3.10"
dependencies = []
[project.optional-dependencies]
semantic = [
"hnswlib>=0.8.0",
"numpy>=1.26",
"fastembed>=0.4.0,<2.0",
]
gpu = [
"onnxruntime-gpu>=1.16",
]
faiss-cpu = [
"faiss-cpu>=1.7.4",
]
faiss-gpu = [
"faiss-gpu>=1.7.4",
]
reranker-api = [
"httpx>=0.25",
]
dev = [
"pytest>=7.0",
"pytest-cov",
]
[tool.hatch.build.targets.wheel]
packages = ["src/codexlens"]
@@ -0,0 +1,128 @@
"""
对 D:/Claude_dms3 仓库进行索引并测试搜索。
用法: python scripts/index_and_search.py
"""
import sys
import time
from pathlib import Path
# 确保 src 可被导入
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
from codexlens.config import Config
from codexlens.core.factory import create_ann_index, create_binary_index
from codexlens.embed.local import FastEmbedEmbedder
from codexlens.indexing import IndexingPipeline
from codexlens.rerank.local import FastEmbedReranker
from codexlens.search.fts import FTSEngine
from codexlens.search.pipeline import SearchPipeline
# ─── Configuration ─────────────────────────────────────────────────────────
REPO_ROOT = Path("D:/Claude_dms3")
INDEX_DIR = Path("D:/Claude_dms3/codex-lens-v2/.index_cache")
EXTENSIONS = {".py", ".ts", ".js", ".md"}
MAX_FILE_SIZE = 50_000 # bytes
MAX_CHUNK_CHARS = 800 # max characters per chunk
CHUNK_OVERLAP = 100
# ─── File collection ───────────────────────────────────────────────────────
SKIP_DIRS = {
".git", "node_modules", "__pycache__", ".pytest_cache",
"dist", "build", ".venv", "venv", ".cache", ".index_cache",
"codex-lens-v2", # 不索引自身
}
def collect_files(root: Path) -> list[Path]:
files = []
for p in root.rglob("*"):
if any(part in SKIP_DIRS for part in p.parts):
continue
if p.is_file() and p.suffix in EXTENSIONS:
if p.stat().st_size <= MAX_FILE_SIZE:
files.append(p)
return files
# ─── Main flow ─────────────────────────────────────────────────────────────
def main():
INDEX_DIR.mkdir(parents=True, exist_ok=True)
# 1. Use a small profile for speed
config = Config(
embed_model="BAAI/bge-small-en-v1.5",
embed_dim=384,
embed_batch_size=32,
hnsw_ef=100,
hnsw_M=16,
binary_top_k=100,
ann_top_k=30,
reranker_top_k=10,
)
print("=== codex-lens-v2 索引测试 ===\n")
# 2. 收集文件
print(f"[1/4] 扫描 {REPO_ROOT} ...")
files = collect_files(REPO_ROOT)
print(f" 找到 {len(files)} 个文件")
# 3. 初始化组件
print(f"\n[2/4] 加载嵌入模型 (bge-small-en-v1.5, dim=384) ...")
embedder = FastEmbedEmbedder(config)
binary_store = create_binary_index(INDEX_DIR, config.embed_dim, config)
ann_index = create_ann_index(INDEX_DIR, config.embed_dim, config)
fts = FTSEngine(":memory:") # in-memory FTS, not persisted
# 4. Index in parallel with IndexingPipeline (chunk -> embed -> index)
print(f"[3/4] Indexing {len(files)} files in parallel ...")
pipeline = IndexingPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
fts=fts,
config=config,
)
stats = pipeline.index_files(
files,
root=REPO_ROOT,
max_chunk_chars=MAX_CHUNK_CHARS,
chunk_overlap=CHUNK_OVERLAP,
max_file_size=MAX_FILE_SIZE,
)
print(f" 索引完成: {stats.files_processed} 文件, {stats.chunks_created} chunks ({stats.duration_seconds:.1f}s)")
# 5. 搜索测试
print(f"\n[4/4] 构建 SearchPipeline ...")
reranker = FastEmbedReranker(config)
pipeline = SearchPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
reranker=reranker,
fts=fts,
config=config,
)
queries = [
"authentication middleware function",
"def embed_single",
"RRF fusion weights",
"fastembed TextCrossEncoder reranker",
"how to search code semantic",
]
print("\n" + "=" * 60)
for query in queries:
t0 = time.time()
results = pipeline.search(query, top_k=5)
elapsed = time.time() - t0
print(f"\nQuery: {query!r} ({elapsed*1000:.0f}ms)")
if results:
for r in results:
print(f" [{r.score:.3f}] {r.path}")
else:
print(" (无结果)")
print("=" * 60)
print("\n测试完成 ✓")
if __name__ == "__main__":
main()

View File

View File

@@ -0,0 +1,99 @@
from __future__ import annotations
import logging
from dataclasses import dataclass, field
log = logging.getLogger(__name__)
@dataclass
class Config:
# Embedding
embed_model: str = "jinaai/jina-embeddings-v2-base-code"
embed_dim: int = 768
embed_batch_size: int = 64
# GPU / execution providers
device: str = "auto" # 'auto', 'cuda', 'cpu'
embed_providers: list[str] | None = None # explicit ONNX providers override
# Backend selection: 'auto', 'faiss', 'hnswlib'
ann_backend: str = "auto"
binary_backend: str = "auto"
# Indexing pipeline
index_workers: int = 2 # number of parallel indexing workers
# HNSW index (ANNIndex)
hnsw_ef: int = 150
hnsw_M: int = 32
hnsw_ef_construction: int = 200
# Binary coarse search (BinaryStore)
binary_top_k: int = 200
# ANN fine search
ann_top_k: int = 50
# Reranker
reranker_model: str = "BAAI/bge-reranker-v2-m3"
reranker_top_k: int = 20
reranker_batch_size: int = 32
# API reranker (optional)
reranker_api_url: str = ""
reranker_api_key: str = ""
reranker_api_model: str = ""
reranker_api_max_tokens_per_batch: int = 2048
# FTS
fts_top_k: int = 50
# Fusion
fusion_k: int = 60 # RRF k parameter
fusion_weights: dict = field(default_factory=lambda: {
"exact": 0.25,
"fuzzy": 0.10,
"vector": 0.50,
"graph": 0.15,
})
def resolve_embed_providers(self) -> list[str]:
"""Return ONNX execution providers based on device config.
Priority: explicit embed_providers > device setting > auto-detect.
"""
if self.embed_providers is not None:
return list(self.embed_providers)
if self.device == "cuda":
return ["CUDAExecutionProvider", "CPUExecutionProvider"]
if self.device == "cpu":
return ["CPUExecutionProvider"]
# auto-detect
try:
import onnxruntime
available = onnxruntime.get_available_providers()
if "CUDAExecutionProvider" in available:
log.info("CUDA detected via onnxruntime, using GPU for embedding")
return ["CUDAExecutionProvider", "CPUExecutionProvider"]
except ImportError:
pass
return ["CPUExecutionProvider"]
@classmethod
def defaults(cls) -> "Config":
return cls()
@classmethod
def small(cls) -> "Config":
"""Smaller config for testing or small corpora."""
return cls(
hnsw_ef=50,
hnsw_M=16,
binary_top_k=50,
ann_top_k=20,
reranker_top_k=10,
)
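
The `fusion_k` and `fusion_weights` fields above feed a weighted Reciprocal Rank Fusion step. A minimal standalone sketch of weighted RRF, where each source contributes `weight / (k + rank)` per document (the function name and argument shapes here are illustrative, not the pipeline's actual API):

```python
def weighted_rrf(
    ranked_lists: dict[str, list[int]],
    weights: dict[str, float],
    k: int = 60,
) -> list[int]:
    """Fuse per-source rankings: score(id) = sum over sources of weight / (k + rank)."""
    scores: dict[int, float] = {}
    for source, doc_ids in ranked_lists.items():
        w = weights.get(source, 0.0)
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank)
    # Higher fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

With `k=60`, a document ranked highly by a heavily weighted source (e.g. `vector` at 0.50) dominates, but agreement across sources can still promote a document neither source ranked first.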

View File

@@ -0,0 +1,13 @@
from .base import BaseANNIndex, BaseBinaryIndex
from .binary import BinaryStore
from .factory import create_ann_index, create_binary_index
from .index import ANNIndex
__all__ = [
"BaseANNIndex",
"BaseBinaryIndex",
"ANNIndex",
"BinaryStore",
"create_ann_index",
"create_binary_index",
]

View File

@@ -0,0 +1,83 @@
from __future__ import annotations
from abc import ABC, abstractmethod
import numpy as np
class BaseANNIndex(ABC):
"""Abstract base class for approximate nearest neighbor indexes."""
@abstractmethod
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add float32 vectors with corresponding IDs.
Args:
ids: shape (N,) int64
vectors: shape (N, dim) float32
"""
@abstractmethod
def fine_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search for nearest neighbors.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results
Returns:
(ids, distances) as numpy arrays
"""
@abstractmethod
def save(self) -> None:
"""Persist index to disk."""
@abstractmethod
def load(self) -> None:
"""Load index from disk."""
@abstractmethod
def __len__(self) -> int:
"""Return the number of indexed items."""
class BaseBinaryIndex(ABC):
"""Abstract base class for binary vector indexes (Hamming distance)."""
@abstractmethod
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add float32 vectors (will be binary-quantized internally).
Args:
ids: shape (N,) int64
vectors: shape (N, dim) float32
"""
@abstractmethod
def coarse_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search by Hamming distance.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results
Returns:
(ids, distances) sorted ascending by distance
"""
@abstractmethod
def save(self) -> None:
"""Persist store to disk."""
@abstractmethod
def load(self) -> None:
"""Load store from disk."""
@abstractmethod
def __len__(self) -> int:
"""Return the number of stored items."""

View File

@@ -0,0 +1,173 @@
from __future__ import annotations
import logging
import math
from pathlib import Path
import numpy as np
from codexlens.config import Config
from codexlens.core.base import BaseBinaryIndex
logger = logging.getLogger(__name__)
class BinaryStore(BaseBinaryIndex):
"""Persistent binary vector store using numpy memmap.
Stores binary-quantized float32 vectors as packed uint8 arrays on disk.
Supports fast coarse search via XOR + popcount Hamming distance.
"""
def __init__(self, path: str | Path, dim: int, config: Config) -> None:
self._dir = Path(path)
self._dim = dim
self._config = config
self._packed_bytes = math.ceil(dim / 8)
self._bin_path = self._dir / "binary_store.bin"
self._ids_path = self._dir / "binary_store_ids.npy"
self._matrix: np.ndarray | None = None # shape (N, packed_bytes), uint8
self._ids: np.ndarray | None = None # shape (N,), int64
self._count: int = 0
if self._bin_path.exists() and self._ids_path.exists():
self.load()
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _quantize(self, vectors: np.ndarray) -> np.ndarray:
"""Convert float32 vectors (N, dim) to packed uint8 (N, packed_bytes)."""
binary = (vectors > 0).astype(np.uint8)
packed = np.packbits(binary, axis=1)
return packed
def _quantize_single(self, vec: np.ndarray) -> np.ndarray:
"""Convert a single float32 vector (dim,) to packed uint8 (packed_bytes,)."""
binary = (vec > 0).astype(np.uint8)
return np.packbits(binary)
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def _ensure_capacity(self, needed: int) -> None:
"""Grow pre-allocated matrix/ids arrays to fit *needed* total items."""
if self._matrix is not None and self._matrix.shape[0] >= needed:
return
new_cap = max(1024, needed)
# Double until large enough
if self._matrix is not None:
cur_cap = self._matrix.shape[0]
new_cap = max(cur_cap, 1024)
while new_cap < needed:
new_cap *= 2
new_matrix = np.zeros((new_cap, self._packed_bytes), dtype=np.uint8)
new_ids = np.zeros(new_cap, dtype=np.int64)
if self._matrix is not None and self._count > 0:
new_matrix[: self._count] = self._matrix[: self._count]
new_ids[: self._count] = self._ids[: self._count]
self._matrix = new_matrix
self._ids = new_ids
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add float32 vectors and their ids.
Does NOT call save() internally -- callers must call save()
explicitly after batch indexing.
Args:
ids: shape (N,) int64
vectors: shape (N, dim) float32
"""
if len(ids) == 0:
return
packed = self._quantize(vectors) # (N, packed_bytes)
n = len(ids)
self._ensure_capacity(self._count + n)
self._matrix[self._count : self._count + n] = packed
self._ids[self._count : self._count + n] = ids.astype(np.int64)
self._count += n
def coarse_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search by Hamming distance.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results; defaults to config.binary_top_k
Returns:
(ids, distances) sorted ascending by Hamming distance
"""
if self._matrix is None or self._count == 0:
return np.array([], dtype=np.int64), np.array([], dtype=np.int32)
k = top_k if top_k is not None else self._config.binary_top_k
k = min(k, self._count)
query_bin = self._quantize_single(query_vec) # (packed_bytes,)
# Slice to active region (matrix may be pre-allocated larger)
active_matrix = self._matrix[: self._count]
active_ids = self._ids[: self._count]
# XOR then popcount via unpackbits
xor = np.bitwise_xor(active_matrix, query_bin[np.newaxis, :]) # (N, packed_bytes)
dists = np.unpackbits(xor, axis=1).sum(axis=1).astype(np.int32) # (N,)
if k >= self._count:
order = np.argsort(dists)
else:
part = np.argpartition(dists, k)[:k]
order = part[np.argsort(dists[part])]
return active_ids[order], dists[order]
def save(self) -> None:
"""Flush binary store to disk."""
if self._matrix is None or self._count == 0:
return
self._dir.mkdir(parents=True, exist_ok=True)
# Write only the occupied portion of the pre-allocated matrix
active_matrix = self._matrix[: self._count]
mm = np.memmap(
str(self._bin_path),
dtype=np.uint8,
mode="w+",
shape=active_matrix.shape,
)
mm[:] = active_matrix
mm.flush()
del mm
np.save(str(self._ids_path), self._ids[: self._count])
def load(self) -> None:
"""Reload binary store from disk."""
ids = np.load(str(self._ids_path))
n = len(ids)
if n == 0:
return
mm = np.memmap(
str(self._bin_path),
dtype=np.uint8,
mode="r",
shape=(n, self._packed_bytes),
)
self._matrix = np.array(mm) # copy into RAM for mutation support
del mm
self._ids = ids.astype(np.int64)
self._count = n
def __len__(self) -> int:
return self._count
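
The quantize-then-popcount trick used by `coarse_search` can be demonstrated with plain NumPy. This is a standalone sketch (the function name is illustrative): sign-quantize float vectors into packed bits, XOR against the query, and count differing bits via `unpackbits`.

```python
import numpy as np

def hamming_topk(db: np.ndarray, query: np.ndarray, k: int) -> tuple[np.ndarray, np.ndarray]:
    """Rank rows of `db` by Hamming distance to `query` after sign quantization."""
    db_bits = np.packbits((db > 0).astype(np.uint8), axis=1)   # (N, dim/8)
    q_bits = np.packbits((query > 0).astype(np.uint8))         # (dim/8,)
    xor = np.bitwise_xor(db_bits, q_bits[np.newaxis, :])
    dists = np.unpackbits(xor, axis=1).sum(axis=1).astype(np.int32)
    order = np.argsort(dists)[:k]                              # ascending distance
    return order, dists[order]
```

Because the distance is computed over packed `uint8` rows, the scan touches only `dim/8` bytes per vector, which is what makes the coarse stage cheap relative to float search.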

View File

@@ -0,0 +1,116 @@
from __future__ import annotations
import logging
from pathlib import Path
from codexlens.config import Config
from codexlens.core.base import BaseANNIndex, BaseBinaryIndex
logger = logging.getLogger(__name__)
try:
import faiss as _faiss # noqa: F401
_FAISS_AVAILABLE = True
except ImportError:
_FAISS_AVAILABLE = False
try:
import hnswlib as _hnswlib # noqa: F401
_HNSWLIB_AVAILABLE = True
except ImportError:
_HNSWLIB_AVAILABLE = False
def _has_faiss_gpu() -> bool:
"""Check whether faiss-gpu is available (has GPU resources)."""
if not _FAISS_AVAILABLE:
return False
try:
import faiss
res = faiss.StandardGpuResources() # noqa: F841
return True
except (AttributeError, RuntimeError):
return False
def create_ann_index(path: str | Path, dim: int, config: Config) -> BaseANNIndex:
"""Create an ANN index based on config.ann_backend.
Fallback chain for 'auto': faiss-gpu -> faiss-cpu -> hnswlib.
Args:
path: directory for index persistence
dim: vector dimensionality
config: project configuration
Returns:
A BaseANNIndex implementation
Raises:
ImportError: if no suitable backend is available
"""
backend = config.ann_backend
if backend == "faiss":
from codexlens.core.faiss_index import FAISSANNIndex
return FAISSANNIndex(path, dim, config)
if backend == "hnswlib":
from codexlens.core.index import ANNIndex
return ANNIndex(path, dim, config)
# auto: try faiss first, then hnswlib
if _FAISS_AVAILABLE:
from codexlens.core.faiss_index import FAISSANNIndex
gpu_tag = " (GPU available)" if _has_faiss_gpu() else " (CPU)"
logger.info("Auto-selected FAISS ANN backend%s", gpu_tag)
return FAISSANNIndex(path, dim, config)
if _HNSWLIB_AVAILABLE:
from codexlens.core.index import ANNIndex
logger.info("Auto-selected hnswlib ANN backend")
return ANNIndex(path, dim, config)
raise ImportError(
"No ANN backend available. Install faiss-cpu, faiss-gpu, or hnswlib."
)
def create_binary_index(
path: str | Path, dim: int, config: Config
) -> BaseBinaryIndex:
"""Create a binary index based on config.binary_backend.
Fallback chain for 'auto': faiss -> numpy BinaryStore.
Args:
path: directory for index persistence
dim: vector dimensionality
config: project configuration
Returns:
A BaseBinaryIndex implementation
Raises:
ImportError: if no suitable backend is available
"""
backend = config.binary_backend
if backend == "faiss":
from codexlens.core.faiss_index import FAISSBinaryIndex
return FAISSBinaryIndex(path, dim, config)
if backend == "hnswlib":
from codexlens.core.binary import BinaryStore
return BinaryStore(path, dim, config)
# auto: try faiss first, then numpy-based BinaryStore
if _FAISS_AVAILABLE:
from codexlens.core.faiss_index import FAISSBinaryIndex
logger.info("Auto-selected FAISS binary backend")
return FAISSBinaryIndex(path, dim, config)
# numpy BinaryStore is always available (no extra deps)
from codexlens.core.binary import BinaryStore
logger.info("Auto-selected numpy BinaryStore backend")
return BinaryStore(path, dim, config)
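
The `auto` branches above follow a common try-import fallback pattern. A generic standalone sketch of that pattern (the function and candidate labels are placeholders, not part of this module's API):

```python
import importlib

def pick_backend(candidates: list[tuple[str, str]]) -> str:
    """Return the label of the first candidate whose module imports cleanly."""
    for label, module_name in candidates:
        try:
            importlib.import_module(module_name)
            return label
        except ImportError:
            continue
    raise ImportError(f"No backend available; tried: {[m for _, m in candidates]}")
```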

View File

@@ -0,0 +1,275 @@
from __future__ import annotations
import logging
import math
import threading
from pathlib import Path
import numpy as np
from codexlens.config import Config
from codexlens.core.base import BaseANNIndex, BaseBinaryIndex
logger = logging.getLogger(__name__)
try:
import faiss
_FAISS_AVAILABLE = True
except ImportError:
faiss = None # type: ignore[assignment]
_FAISS_AVAILABLE = False
def _try_gpu_index(index: "faiss.Index") -> "faiss.Index":
"""Transfer a FAISS index to GPU if faiss-gpu is available.
Returns the GPU index on success, or the original CPU index on failure.
"""
try:
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, index)
logger.info("FAISS index transferred to GPU 0")
return gpu_index
except (AttributeError, RuntimeError) as exc:
logger.debug("GPU transfer unavailable, staying on CPU: %s", exc)
return index
def _to_cpu_for_save(index: "faiss.Index") -> "faiss.Index":
"""Convert a GPU index back to CPU for serialization."""
try:
return faiss.index_gpu_to_cpu(index)
except (AttributeError, RuntimeError):
return index
class FAISSANNIndex(BaseANNIndex):
"""FAISS-based ANN index using IndexHNSWFlat with optional GPU.
Uses Inner Product space with L2-normalized vectors for cosine similarity.
Thread-safe via RLock.
"""
def __init__(self, path: str | Path, dim: int, config: Config) -> None:
if not _FAISS_AVAILABLE:
raise ImportError(
"faiss is required. Install with: pip install faiss-cpu "
"or pip install faiss-gpu"
)
self._path = Path(path)
self._index_path = self._path / "faiss_ann.index"
self._dim = dim
self._config = config
self._lock = threading.RLock()
self._index: faiss.Index | None = None
def _ensure_loaded(self) -> None:
"""Load or initialize the index (caller holds lock)."""
if self._index is not None:
return
self.load()
def load(self) -> None:
"""Load index from disk or initialize a fresh one."""
with self._lock:
if self._index_path.exists():
idx = faiss.read_index(str(self._index_path))
logger.debug(
"Loaded FAISS ANN index from %s (%d items)",
self._index_path, idx.ntotal,
)
else:
# HNSW with flat storage, M=32 by default
m = self._config.hnsw_M
idx = faiss.IndexHNSWFlat(self._dim, m, faiss.METRIC_INNER_PRODUCT)
idx.hnsw.efConstruction = self._config.hnsw_ef_construction
idx.hnsw.efSearch = self._config.hnsw_ef
logger.debug(
"Initialized fresh FAISS HNSW index (dim=%d, M=%d)",
self._dim, m,
)
self._index = _try_gpu_index(idx)
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add L2-normalized float32 vectors.
Vectors are normalized before insertion so that Inner Product
distance equals cosine similarity.
Args:
ids: shape (N,) int64 -- currently unused by FAISS flat index
but kept for API compatibility. FAISS uses sequential IDs.
vectors: shape (N, dim) float32
"""
if len(ids) == 0:
return
vecs = np.ascontiguousarray(vectors, dtype=np.float32)
# Normalize for cosine similarity via Inner Product
faiss.normalize_L2(vecs)
with self._lock:
self._ensure_loaded()
self._index.add(vecs)
def fine_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search for nearest neighbors.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results; defaults to config.ann_top_k
Returns:
(ids, distances) as numpy arrays. For IP space, higher = more
similar, but distances are returned as-is for consumer handling.
"""
k = top_k if top_k is not None else self._config.ann_top_k
with self._lock:
self._ensure_loaded()
count = self._index.ntotal
if count == 0:
return np.array([], dtype=np.int64), np.array([], dtype=np.float32)
k = min(k, count)
# Set efSearch for HNSW accuracy
try:
self._index.hnsw.efSearch = max(self._config.hnsw_ef, k)
except AttributeError:
pass # GPU index may not expose hnsw attribute directly
q = np.ascontiguousarray(query_vec, dtype=np.float32).reshape(1, -1)
faiss.normalize_L2(q)
distances, labels = self._index.search(q, k)
return labels[0].astype(np.int64), distances[0].astype(np.float32)
def save(self) -> None:
"""Save index to disk."""
with self._lock:
if self._index is None:
return
self._path.mkdir(parents=True, exist_ok=True)
cpu_index = _to_cpu_for_save(self._index)
faiss.write_index(cpu_index, str(self._index_path))
def __len__(self) -> int:
with self._lock:
if self._index is None:
return 0
return self._index.ntotal
class FAISSBinaryIndex(BaseBinaryIndex):
"""FAISS-based binary index using IndexBinaryFlat for Hamming distance.
Vectors are binary-quantized (sign bit) before insertion.
Thread-safe via RLock.
"""
def __init__(self, path: str | Path, dim: int, config: Config) -> None:
if not _FAISS_AVAILABLE:
raise ImportError(
"faiss is required. Install with: pip install faiss-cpu "
"or pip install faiss-gpu"
)
self._path = Path(path)
self._index_path = self._path / "faiss_binary.index"
self._dim = dim
self._config = config
self._packed_bytes = math.ceil(dim / 8)
self._lock = threading.RLock()
self._index: faiss.IndexBinary | None = None
def _ensure_loaded(self) -> None:
if self._index is not None:
return
self.load()
def _quantize(self, vectors: np.ndarray) -> np.ndarray:
"""Convert float32 vectors (N, dim) to packed uint8 (N, packed_bytes)."""
binary = (vectors > 0).astype(np.uint8)
return np.packbits(binary, axis=1)
def _quantize_single(self, vec: np.ndarray) -> np.ndarray:
"""Convert a single float32 vector (dim,) to packed uint8 (1, packed_bytes)."""
binary = (vec > 0).astype(np.uint8)
return np.packbits(binary).reshape(1, -1)
def load(self) -> None:
"""Load binary index from disk or initialize a fresh one."""
with self._lock:
if self._index_path.exists():
idx = faiss.read_index_binary(str(self._index_path))
logger.debug(
"Loaded FAISS binary index from %s (%d items)",
self._index_path, idx.ntotal,
)
else:
# IndexBinaryFlat takes dimension in bits
idx = faiss.IndexBinaryFlat(self._dim)
logger.debug(
"Initialized fresh FAISS binary index (dim_bits=%d)", self._dim,
)
self._index = idx
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add float32 vectors (binary-quantized internally).
Args:
ids: shape (N,) int64 -- kept for API compatibility
vectors: shape (N, dim) float32
"""
if len(ids) == 0:
return
packed = self._quantize(vectors)
packed = np.ascontiguousarray(packed, dtype=np.uint8)
with self._lock:
self._ensure_loaded()
self._index.add(packed)
def coarse_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search by Hamming distance.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results; defaults to config.binary_top_k
Returns:
(ids, distances) sorted ascending by Hamming distance
"""
with self._lock:
self._ensure_loaded()
if self._index.ntotal == 0:
return np.array([], dtype=np.int64), np.array([], dtype=np.int32)
k = top_k if top_k is not None else self._config.binary_top_k
k = min(k, self._index.ntotal)
q = self._quantize_single(query_vec)
q = np.ascontiguousarray(q, dtype=np.uint8)
distances, labels = self._index.search(q, k)
return labels[0].astype(np.int64), distances[0].astype(np.int32)
def save(self) -> None:
"""Save binary index to disk."""
with self._lock:
if self._index is None:
return
self._path.mkdir(parents=True, exist_ok=True)
faiss.write_index_binary(self._index, str(self._index_path))
def __len__(self) -> int:
with self._lock:
if self._index is None:
return 0
return self._index.ntotal

View File

@@ -0,0 +1,136 @@
from __future__ import annotations
import logging
import threading
from pathlib import Path
import numpy as np
from codexlens.config import Config
from codexlens.core.base import BaseANNIndex
logger = logging.getLogger(__name__)
try:
import hnswlib
_HNSWLIB_AVAILABLE = True
except ImportError:
_HNSWLIB_AVAILABLE = False
class ANNIndex(BaseANNIndex):
"""HNSW-based approximate nearest neighbor index.
Lazy-loads on first use, thread-safe via RLock.
"""
def __init__(self, path: str | Path, dim: int, config: Config) -> None:
if not _HNSWLIB_AVAILABLE:
raise ImportError("hnswlib is required. Install with: pip install hnswlib")
self._path = Path(path)
self._hnsw_path = self._path / "ann_index.hnsw"
self._dim = dim
self._config = config
self._lock = threading.RLock()
self._index: hnswlib.Index | None = None
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _ensure_loaded(self) -> None:
"""Load or initialize the index (caller holds lock)."""
if self._index is not None:
return
self.load()
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def load(self) -> None:
"""Load index from disk or initialize a fresh one."""
with self._lock:
idx = hnswlib.Index(space="cosine", dim=self._dim)
if self._hnsw_path.exists():
idx.load_index(str(self._hnsw_path), max_elements=0)
idx.set_ef(self._config.hnsw_ef)
logger.debug("Loaded HNSW index from %s (%d items)", self._hnsw_path, idx.get_current_count())
else:
idx.init_index(
max_elements=1000,
ef_construction=self._config.hnsw_ef_construction,
M=self._config.hnsw_M,
)
idx.set_ef(self._config.hnsw_ef)
logger.debug("Initialized fresh HNSW index (dim=%d)", self._dim)
self._index = idx
def add(self, ids: np.ndarray, vectors: np.ndarray) -> None:
"""Add float32 vectors.
Does NOT call save() internally -- callers must call save()
explicitly after batch indexing.
Args:
ids: shape (N,) int64
vectors: shape (N, dim) float32
"""
if len(ids) == 0:
return
vecs = np.ascontiguousarray(vectors, dtype=np.float32)
with self._lock:
self._ensure_loaded()
# Expand capacity if needed
current = self._index.get_current_count()
max_el = self._index.get_max_elements()
needed = current + len(ids)
if needed > max_el:
new_cap = max(max_el * 2, needed + 100)
self._index.resize_index(new_cap)
self._index.add_items(vecs, ids.astype(np.int64))
def fine_search(
self, query_vec: np.ndarray, top_k: int | None = None
) -> tuple[np.ndarray, np.ndarray]:
"""Search for nearest neighbors.
Args:
query_vec: float32 vector of shape (dim,)
top_k: number of results; defaults to config.ann_top_k
Returns:
(ids, distances) as numpy arrays
"""
k = top_k if top_k is not None else self._config.ann_top_k
with self._lock:
self._ensure_loaded()
count = self._index.get_current_count()
if count == 0:
return np.array([], dtype=np.int64), np.array([], dtype=np.float32)
k = min(k, count)
self._index.set_ef(max(self._config.hnsw_ef, k))
q = np.ascontiguousarray(query_vec, dtype=np.float32).reshape(1, -1)
labels, distances = self._index.knn_query(q, k=k)
return labels[0].astype(np.int64), distances[0].astype(np.float32)
def save(self) -> None:
"""Save index to disk (caller may or may not hold lock)."""
with self._lock:
if self._index is None:
return
self._path.mkdir(parents=True, exist_ok=True)
self._index.save_index(str(self._hnsw_path))
def __len__(self) -> int:
with self._lock:
if self._index is None:
return 0
return self._index.get_current_count()

View File

@@ -0,0 +1,4 @@
from .base import BaseEmbedder
from .local import FastEmbedEmbedder, EMBED_PROFILES
__all__ = ["BaseEmbedder", "FastEmbedEmbedder", "EMBED_PROFILES"]

View File

@@ -0,0 +1,13 @@
from __future__ import annotations
from abc import ABC, abstractmethod
import numpy as np
class BaseEmbedder(ABC):
@abstractmethod
def embed_single(self, text: str) -> np.ndarray:
"""Embed a single text, returns float32 ndarray shape (dim,)."""
@abstractmethod
def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
"""Embed a list of texts, returns list of float32 ndarrays."""

View File

@@ -0,0 +1,53 @@
from __future__ import annotations
import numpy as np
from ..config import Config
from .base import BaseEmbedder
EMBED_PROFILES = {
"small": "BAAI/bge-small-en-v1.5", # 384d
"base": "BAAI/bge-base-en-v1.5", # 768d
"large": "BAAI/bge-large-en-v1.5", # 1024d
"code": "jinaai/jina-embeddings-v2-base-code", # 768d
}
class FastEmbedEmbedder(BaseEmbedder):
"""Embedder backed by fastembed.TextEmbedding with lazy model loading."""
def __init__(self, config: Config) -> None:
self._config = config
self._model = None
def _load(self) -> None:
"""Lazy-load the fastembed TextEmbedding model on first use."""
if self._model is not None:
return
from fastembed import TextEmbedding
providers = self._config.resolve_embed_providers()
try:
self._model = TextEmbedding(
model_name=self._config.embed_model,
providers=providers,
)
except TypeError:
# Older fastembed versions may not accept providers kwarg
self._model = TextEmbedding(model_name=self._config.embed_model)
def embed_single(self, text: str) -> np.ndarray:
"""Embed a single text, returns float32 ndarray of shape (dim,)."""
self._load()
result = list(self._model.embed([text]))
return result[0].astype(np.float32)
def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
"""Embed a list of texts in batches, returns list of float32 ndarrays."""
self._load()
batch_size = self._config.embed_batch_size
results: list[np.ndarray] = []
for start in range(0, len(texts), batch_size):
batch = texts[start : start + batch_size]
for vec in self._model.embed(batch):
results.append(vec.astype(np.float32))
return results

View File

@@ -0,0 +1,5 @@
from __future__ import annotations
from .pipeline import IndexingPipeline, IndexStats
__all__ = ["IndexingPipeline", "IndexStats"]

View File

@@ -0,0 +1,277 @@
"""Three-stage parallel indexing pipeline: chunk -> embed -> index.
Uses threading.Thread with queue.Queue for producer-consumer handoff.
The GIL is acceptable because embedding (onnxruntime) releases it in C extensions.
"""
from __future__ import annotations
import logging
import queue
import threading
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Callable
import numpy as np
from codexlens.config import Config
from codexlens.core.binary import BinaryStore
from codexlens.core.index import ANNIndex
from codexlens.embed.base import BaseEmbedder
from codexlens.search.fts import FTSEngine
logger = logging.getLogger(__name__)
# Sentinel value to signal worker shutdown
_SENTINEL = None
# Defaults for chunking (can be overridden via index_files kwargs)
_DEFAULT_MAX_CHUNK_CHARS = 800
_DEFAULT_CHUNK_OVERLAP = 100
@dataclass
class IndexStats:
"""Statistics returned after indexing completes."""
files_processed: int = 0
chunks_created: int = 0
duration_seconds: float = 0.0
class IndexingPipeline:
"""Parallel 3-stage indexing pipeline with queue-based handoff.
Stage 1 (main thread): Read files, chunk text, push to embed_queue.
Stage 2 (embed worker): Pull text batches, call embed_batch(), push vectors to index_queue.
Stage 3 (index worker): Pull vectors+ids, call BinaryStore.add(), ANNIndex.add(), FTS.add_documents().
After all stages complete, save() is called on BinaryStore and ANNIndex exactly once.
"""
def __init__(
self,
embedder: BaseEmbedder,
binary_store: BinaryStore,
ann_index: ANNIndex,
fts: FTSEngine,
config: Config,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
self._ann_index = ann_index
self._fts = fts
self._config = config
def index_files(
self,
files: list[Path],
*,
root: Path | None = None,
max_chunk_chars: int = _DEFAULT_MAX_CHUNK_CHARS,
chunk_overlap: int = _DEFAULT_CHUNK_OVERLAP,
max_file_size: int = 50_000,
) -> IndexStats:
"""Run the 3-stage pipeline on the given files.
Args:
files: List of file paths to index.
root: Optional root for computing relative paths. If None, uses
each file's absolute path as its identifier.
max_chunk_chars: Maximum characters per chunk.
chunk_overlap: Character overlap between consecutive chunks.
max_file_size: Skip files larger than this (bytes).
Returns:
IndexStats with counts and timing.
"""
if not files:
return IndexStats()
t0 = time.monotonic()
embed_queue: queue.Queue = queue.Queue(maxsize=4)
index_queue: queue.Queue = queue.Queue(maxsize=4)
# Track errors from workers
worker_errors: list[Exception] = []
error_lock = threading.Lock()
def _record_error(exc: Exception) -> None:
with error_lock:
worker_errors.append(exc)
# --- Start workers ---
embed_thread = threading.Thread(
target=self._embed_worker,
args=(embed_queue, index_queue, _record_error),
daemon=True,
name="indexing-embed",
)
index_thread = threading.Thread(
target=self._index_worker,
args=(index_queue, _record_error),
daemon=True,
name="indexing-index",
)
embed_thread.start()
index_thread.start()
# --- Stage 1: chunk files (main thread) ---
chunk_id = 0
files_processed = 0
chunks_created = 0
for fpath in files:
try:
if fpath.stat().st_size > max_file_size:
continue
text = fpath.read_text(encoding="utf-8", errors="replace")
except Exception as exc:
logger.debug("Skipping %s: %s", fpath, exc)
continue
rel_path = str(fpath.relative_to(root)) if root else str(fpath)
file_chunks = self._chunk_text(text, rel_path, max_chunk_chars, chunk_overlap)
if not file_chunks:
continue
files_processed += 1
# Assign sequential IDs and push batch to embed queue
batch_ids = []
batch_texts = []
batch_paths = []
for chunk_text, path in file_chunks:
batch_ids.append(chunk_id)
batch_texts.append(chunk_text)
batch_paths.append(path)
chunk_id += 1
chunks_created += len(batch_ids)
embed_queue.put((batch_ids, batch_texts, batch_paths))
# Signal embed worker: no more data
embed_queue.put(_SENTINEL)
# Wait for workers to finish
embed_thread.join()
index_thread.join()
# --- Final flush ---
self._binary_store.save()
self._ann_index.save()
duration = time.monotonic() - t0
stats = IndexStats(
files_processed=files_processed,
chunks_created=chunks_created,
duration_seconds=round(duration, 2),
)
logger.info(
"Indexing complete: %d files, %d chunks in %.1fs",
stats.files_processed,
stats.chunks_created,
stats.duration_seconds,
)
# Raise first worker error if any occurred
if worker_errors:
raise worker_errors[0]
return stats
# ------------------------------------------------------------------
# Workers
# ------------------------------------------------------------------
def _embed_worker(
self,
in_q: queue.Queue,
out_q: queue.Queue,
on_error: callable,
) -> None:
"""Stage 2: Pull chunk batches, embed, push (ids, vecs, docs) to index queue."""
try:
while True:
item = in_q.get()
if item is _SENTINEL:
break
batch_ids, batch_texts, batch_paths = item
try:
vecs = self._embedder.embed_batch(batch_texts)
vec_array = np.array(vecs, dtype=np.float32)
id_array = np.array(batch_ids, dtype=np.int64)
out_q.put((id_array, vec_array, batch_texts, batch_paths))
except Exception as exc:
logger.error("Embed worker error: %s", exc)
on_error(exc)
finally:
# Signal index worker: no more data
out_q.put(_SENTINEL)
def _index_worker(
self,
in_q: queue.Queue,
on_error: callable,
) -> None:
"""Stage 3: Pull (ids, vecs, texts, paths), write to stores."""
while True:
item = in_q.get()
if item is _SENTINEL:
break
id_array, vec_array, texts, paths = item
try:
self._binary_store.add(id_array, vec_array)
self._ann_index.add(id_array, vec_array)
fts_docs = [
(int(id_array[i]), paths[i], texts[i])
for i in range(len(id_array))
]
self._fts.add_documents(fts_docs)
except Exception as exc:
logger.error("Index worker error: %s", exc)
on_error(exc)
# ------------------------------------------------------------------
# Chunking
# ------------------------------------------------------------------
@staticmethod
def _chunk_text(
text: str,
path: str,
max_chars: int,
overlap: int,
) -> list[tuple[str, str]]:
"""Split file text into overlapping chunks.
Returns list of (chunk_text, path) tuples.
"""
if not text.strip():
return []
chunks: list[tuple[str, str]] = []
lines = text.splitlines(keepends=True)
current: list[str] = []
current_len = 0
for line in lines:
if current_len + len(line) > max_chars and current:
chunk = "".join(current)
chunks.append((chunk, path))
# overlap: keep last N characters; guard overlap == 0, since s[-0:] returns the whole string
tail = "".join(current)[-overlap:] if overlap > 0 else ""
current = [tail] if tail else []
current_len = len(tail)
current.append(line)
current_len += len(line)
if current:
chunks.append(("".join(current), path))
return chunks
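The overlap logic above is easiest to verify in isolation. Below is a minimal standalone sketch of the same line-based chunker (with an explicit guard for `overlap == 0`, since `s[-0:]` returns the whole string):

```python
def chunk_text(text: str, max_chars: int, overlap: int) -> list[str]:
    if not text.strip():
        return []
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for line in text.splitlines(keepends=True):
        if current_len + len(line) > max_chars and current:
            chunks.append("".join(current))
            # guard overlap == 0: s[-0:] would return the whole string
            tail = "".join(current)[-overlap:] if overlap > 0 else ""
            current = [tail] if tail else []
            current_len = len(tail)
        current.append(line)
        current_len += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

lines = "".join(f"line {i}\n" for i in range(10))  # ten 7-char lines
chunks = chunk_text(lines, max_chars=20, overlap=5)
# each chunk after the first starts with the 5-char tail of its predecessor
```

With `overlap=0` the chunks concatenate back to the original text exactly; with a positive overlap, adjacent chunks share a character tail so no context is lost at chunk boundaries.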

View File

@@ -0,0 +1,5 @@
from .base import BaseReranker
from .local import FastEmbedReranker
from .api import APIReranker
__all__ = ["BaseReranker", "FastEmbedReranker", "APIReranker"]

View File

@@ -0,0 +1,103 @@
from __future__ import annotations
import logging
import time
import httpx
from codexlens.config import Config
from .base import BaseReranker
logger = logging.getLogger(__name__)
class APIReranker(BaseReranker):
"""Reranker backed by a remote HTTP API (SiliconFlow/Cohere/Jina format)."""
def __init__(self, config: Config) -> None:
self._config = config
self._client = httpx.Client(
headers={
"Authorization": f"Bearer {config.reranker_api_key}",
"Content-Type": "application/json",
},
)
def score_pairs(self, query: str, documents: list[str]) -> list[float]:
if not documents:
return []
max_tokens = self._config.reranker_api_max_tokens_per_batch
batches = self._split_batches(documents, max_tokens)
scores = [0.0] * len(documents)
for batch in batches:
batch_scores = self._call_api_with_retry(query, batch)
for orig_idx, score in batch_scores.items():
scores[orig_idx] = score
return scores
def _split_batches(
self, documents: list[str], max_tokens: int
) -> list[list[tuple[int, str]]]:
batches: list[list[tuple[int, str]]] = []
current_batch: list[tuple[int, str]] = []
current_tokens = 0
for idx, text in enumerate(documents):
doc_tokens = len(text) // 4
if current_tokens + doc_tokens > max_tokens and current_batch:
batches.append(current_batch)
current_batch = []
current_tokens = 0
current_batch.append((idx, text))
current_tokens += doc_tokens
if current_batch:
batches.append(current_batch)
return batches
def _call_api_with_retry(
self,
query: str,
docs: list[tuple[int, str]],
max_retries: int = 3,
) -> dict[int, float]:
url = self._config.reranker_api_url.rstrip("/") + "/rerank"
payload = {
"model": self._config.reranker_api_model,
"query": query,
"documents": [t for _, t in docs],
}
last_exc: Exception | None = None
for attempt in range(max_retries):
try:
response = self._client.post(url, json=payload)
except Exception as exc:
last_exc = exc
time.sleep((2 ** attempt) * 0.5)
continue
if response.status_code in (429, 503):
logger.warning(
"API reranker returned HTTP %s (attempt %d/%d), retrying...",
response.status_code,
attempt + 1,
max_retries,
)
time.sleep((2 ** attempt) * 0.5)
continue
response.raise_for_status()
data = response.json()
results = data.get("results", [])
scores: dict[int, float] = {}
for item in results:
local_idx = int(item["index"])
orig_idx = docs[local_idx][0]
scores[orig_idx] = float(item["relevance_score"])
return scores
raise RuntimeError(
f"API reranker failed after {max_retries} attempts. Last error: {last_exc}"
)
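The batch splitter in `_split_batches` can be sketched standalone. It relies on the same rough heuristic of ~4 characters per token, which is an estimate, not a real tokenizer:

```python
def split_batches(documents: list[str], max_tokens: int) -> list[list[tuple[int, str]]]:
    batches: list[list[tuple[int, str]]] = []
    current: list[tuple[int, str]] = []
    current_tokens = 0
    for idx, text in enumerate(documents):
        doc_tokens = len(text) // 4  # crude ~4-chars-per-token estimate
        if current_tokens + doc_tokens > max_tokens and current:
            batches.append(current)
            current, current_tokens = [], 0
        current.append((idx, text))  # keep the original index for score merging
        current_tokens += doc_tokens
    if current:
        batches.append(current)
    return batches

docs = ["x" * 800] * 5              # each ~200 estimated tokens
batches = split_batches(docs, max_tokens=512)
# 200 + 200 fits the 512 budget, a third doc would not: batches of 2, 2, 1
```

Carrying the original index through each batch is what lets the caller write every batch's scores back into one flat list in document order.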

View File

@@ -0,0 +1,8 @@
from __future__ import annotations
from abc import ABC, abstractmethod
class BaseReranker(ABC):
@abstractmethod
def score_pairs(self, query: str, documents: list[str]) -> list[float]:
"""Score (query, doc) pairs. Returns list of floats same length as documents."""

View File

@@ -0,0 +1,25 @@
from __future__ import annotations
from codexlens.config import Config
from .base import BaseReranker
class FastEmbedReranker(BaseReranker):
"""Local reranker backed by fastembed TextCrossEncoder."""
def __init__(self, config: Config) -> None:
self._config = config
self._model = None
def _load(self) -> None:
if self._model is None:
from fastembed.rerank.cross_encoder import TextCrossEncoder
self._model = TextCrossEncoder(model_name=self._config.reranker_model)
def score_pairs(self, query: str, documents: list[str]) -> list[float]:
self._load()
results = list(self._model.rerank(query, documents))
scores = [0.0] * len(documents)
for r in results:
scores[r.index] = float(r.score)
return scores

View File

@@ -0,0 +1,8 @@
from .fts import FTSEngine
from .fusion import reciprocal_rank_fusion, detect_query_intent, QueryIntent, DEFAULT_WEIGHTS
from .pipeline import SearchPipeline, SearchResult
__all__ = [
"FTSEngine", "reciprocal_rank_fusion", "detect_query_intent",
"QueryIntent", "DEFAULT_WEIGHTS", "SearchPipeline", "SearchResult",
]

View File

@@ -0,0 +1,69 @@
from __future__ import annotations
import sqlite3
from pathlib import Path
class FTSEngine:
def __init__(self, db_path: str | Path) -> None:
self._conn = sqlite3.connect(str(db_path), check_same_thread=False)
self._conn.execute(
"CREATE VIRTUAL TABLE IF NOT EXISTS docs "
"USING fts5(content, tokenize='porter unicode61')"
)
self._conn.execute(
"CREATE TABLE IF NOT EXISTS docs_meta "
"(id INTEGER PRIMARY KEY, path TEXT)"
)
self._conn.commit()
def add_documents(self, docs: list[tuple[int, str, str]]) -> None:
"""Add documents in batch. docs: list of (id, path, content)."""
if not docs:
return
self._conn.executemany(
"INSERT OR REPLACE INTO docs_meta (id, path) VALUES (?, ?)",
[(doc_id, path) for doc_id, path, content in docs],
)
self._conn.executemany(
"INSERT OR REPLACE INTO docs (rowid, content) VALUES (?, ?)",
[(doc_id, content) for doc_id, path, content in docs],
)
self._conn.commit()
def exact_search(self, query: str, top_k: int = 50) -> list[tuple[int, float]]:
"""FTS5 MATCH query, return (id, bm25_score) sorted by score descending."""
try:
rows = self._conn.execute(
"SELECT rowid, bm25(docs) AS score FROM docs "
"WHERE docs MATCH ? ORDER BY score LIMIT ?",
(query, top_k),
).fetchall()
except sqlite3.OperationalError:
return []
# bm25 in SQLite FTS5 returns negative values (lower = better match)
# Negate so higher is better
return [(int(row[0]), -float(row[1])) for row in rows]
def fuzzy_search(self, query: str, top_k: int = 50) -> list[tuple[int, float]]:
"""Prefix search: each token + '*', return (id, score) sorted descending."""
tokens = query.strip().split()
if not tokens:
return []
prefix_query = " ".join(t + "*" for t in tokens)
try:
rows = self._conn.execute(
"SELECT rowid, bm25(docs) AS score FROM docs "
"WHERE docs MATCH ? ORDER BY score LIMIT ?",
(prefix_query, top_k),
).fetchall()
except sqlite3.OperationalError:
return []
return [(int(row[0]), -float(row[1])) for row in rows]
def get_content(self, doc_id: int) -> str:
"""Retrieve content for a doc_id."""
row = self._conn.execute(
"SELECT content FROM docs WHERE rowid = ?", (doc_id,)
).fetchone()
return row[0] if row else ""
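The FTS5 behaviour the engine relies on, `bm25()` returning negative scores where lower means a better match, can be demonstrated with an in-memory database (assuming the local `sqlite3` build ships with FTS5, as standard CPython binaries do):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='porter unicode61')"
)
conn.executemany(
    "INSERT INTO docs (rowid, content) VALUES (?, ?)",
    [
        (1, "def authenticate(user, password)"),
        (2, "class User with name and email"),
        (3, "cache_set stores a value in redis"),
    ],
)
rows = conn.execute(
    "SELECT rowid, bm25(docs) AS score FROM docs "
    "WHERE docs MATCH ? ORDER BY score",
    ("authenticate",),
).fetchall()
# bm25 is negative for matches (lower = better), so negate as FTSEngine does
hits = [(int(r), -float(s)) for r, s in rows]
```

Only document 1 matches, and its negated score is positive, which is why `exact_search` can hand the results straight to fusion with "higher is better" semantics.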

View File

@@ -0,0 +1,106 @@
from __future__ import annotations
import re
from enum import Enum
DEFAULT_WEIGHTS: dict[str, float] = {
"exact": 0.25,
"fuzzy": 0.10,
"vector": 0.50,
"graph": 0.15,
}
_CODE_CAMEL_RE = re.compile(r"[a-z][A-Z]")
_CODE_SNAKE_RE = re.compile(r"\b[a-z_]+_[a-z_]+\b")
_CODE_SYMBOLS_RE = re.compile(r"[.\[\](){}]|->|::")
_CODE_KEYWORDS_RE = re.compile(r"\b(import|def|class|return|from|async|await|lambda|yield)\b")
_QUESTION_WORDS_RE = re.compile(r"\b(how|what|why|when|where|which|who|does|do|is|are|can|should)\b", re.IGNORECASE)
class QueryIntent(Enum):
CODE_SYMBOL = "code_symbol"
NATURAL_LANGUAGE = "natural"
MIXED = "mixed"
def detect_query_intent(query: str) -> QueryIntent:
"""Detect whether query is a code symbol, natural language, or mixed."""
words = query.strip().split()
word_count = len(words)
code_signals = 0
natural_signals = 0
if _CODE_CAMEL_RE.search(query):
code_signals += 2
if _CODE_SNAKE_RE.search(query):
code_signals += 2
if _CODE_SYMBOLS_RE.search(query):
code_signals += 2
if _CODE_KEYWORDS_RE.search(query):
code_signals += 2
if "`" in query:
code_signals += 1
if word_count < 4:
code_signals += 1
if _QUESTION_WORDS_RE.search(query):
natural_signals += 2
if word_count > 5:
natural_signals += 2
if code_signals == 0 and word_count >= 3:
natural_signals += 1
if code_signals >= 2 and natural_signals == 0:
return QueryIntent.CODE_SYMBOL
if natural_signals >= 2 and code_signals == 0:
return QueryIntent.NATURAL_LANGUAGE
if natural_signals > code_signals:
return QueryIntent.NATURAL_LANGUAGE
if code_signals > natural_signals:
return QueryIntent.CODE_SYMBOL
return QueryIntent.MIXED
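A stripped-down illustration of the signal counting above, using only a subset of the real patterns (the regexes and thresholds here are deliberately simplified for the sketch):

```python
import re

# camelCase pushes toward CODE_SYMBOL, question words toward NATURAL_LANGUAGE
camel = re.compile(r"[a-z][A-Z]")
question = re.compile(r"\b(how|what|why)\b", re.IGNORECASE)

def classify(query: str) -> str:
    code = 2 if camel.search(query) else 0
    natural = 2 if question.search(query) else 0
    if code > natural:
        return "code_symbol"
    if natural > code:
        return "natural"
    return "mixed"
```

The full detector accumulates several such weighted signals before comparing the two counters, but the decision structure is the same.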
def get_adaptive_weights(intent: QueryIntent, base: dict | None = None) -> dict[str, float]:
"""Return weights adapted to query intent."""
weights = dict(base or DEFAULT_WEIGHTS)
if intent == QueryIntent.CODE_SYMBOL:
weights["exact"] = 0.45
weights["vector"] = 0.35
elif intent == QueryIntent.NATURAL_LANGUAGE:
weights["vector"] = 0.65
weights["exact"] = 0.15
# MIXED: use weights as-is
return weights
def reciprocal_rank_fusion(
results: dict[str, list[tuple[int, float]]],
weights: dict[str, float] | None = None,
k: int = 60,
) -> list[tuple[int, float]]:
"""Fuse ranked result lists using Reciprocal Rank Fusion.
results: {source_name: [(doc_id, score), ...]} each list sorted desc by score.
weights: weight per source (defaults to equal weight across all sources).
k: RRF constant (default 60).
Returns sorted list of (doc_id, fused_score) descending.
"""
if not results:
return []
sources = list(results.keys())
if weights is None:
equal_w = 1.0 / len(sources)
weights = {s: equal_w for s in sources}
scores: dict[int, float] = {}
for source, ranked_list in results.items():
w = weights.get(source, 0.0)
for rank, (doc_id, _) in enumerate(ranked_list, start=1):
scores[doc_id] = scores.get(doc_id, 0.0) + w * (1.0 / (k + rank))
return sorted(scores.items(), key=lambda x: x[1], reverse=True)
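A worked example of the fusion: with `k=60` and equal weights, a document that appears in two ranked lists outranks one that appears in only one, even if it is never ranked first. This standalone version mirrors the function above with the equal-weight default inlined:

```python
def rrf(results: dict[str, list[tuple[int, float]]], k: int = 60) -> list[tuple[int, float]]:
    weight = 1.0 / len(results)  # equal weight per source
    scores: dict[int, float] = {}
    for ranked in results.values():
        for rank, (doc_id, _) in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

fused = rrf({
    "vector": [(1, 0.95), (2, 0.80)],  # doc 1 best by cosine
    "exact":  [(2, 12.3), (3, 9.1)],   # doc 2 best by bm25
})
# doc 2 contributes 0.5/62 + 0.5/61, beating doc 1's single 0.5/61
```

Note that only ranks matter: the raw scores in each list are discarded, which is what makes RRF robust to incomparable score scales across sources.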

View File

@@ -0,0 +1,163 @@
from __future__ import annotations
import logging
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import numpy as np
from ..config import Config
from ..core import ANNIndex, BinaryStore
from ..embed import BaseEmbedder
from ..rerank import BaseReranker
from .fts import FTSEngine
from .fusion import (
DEFAULT_WEIGHTS,
detect_query_intent,
get_adaptive_weights,
reciprocal_rank_fusion,
)
_log = logging.getLogger(__name__)
@dataclass
class SearchResult:
id: int
path: str
score: float
snippet: str = ""
class SearchPipeline:
def __init__(
self,
embedder: BaseEmbedder,
binary_store: BinaryStore,
ann_index: ANNIndex,
reranker: BaseReranker,
fts: FTSEngine,
config: Config,
) -> None:
self._embedder = embedder
self._binary_store = binary_store
self._ann_index = ann_index
self._reranker = reranker
self._fts = fts
self._config = config
# -- Helper: vector search (binary coarse + ANN fine) -----------------
def _vector_search(
self, query_vec: np.ndarray
) -> list[tuple[int, float]]:
"""Run binary coarse search then ANN fine search and intersect."""
cfg = self._config
# Binary coarse search -> candidate_ids set
candidate_ids_list, _ = self._binary_store.coarse_search(
query_vec, top_k=cfg.binary_top_k
)
candidate_ids = set(candidate_ids_list)
# ANN fine search on full index, then intersect with binary candidates
ann_ids, ann_scores = self._ann_index.fine_search(
query_vec, top_k=cfg.ann_top_k
)
# Keep only results that appear in binary candidates (2-stage funnel)
vector_results: list[tuple[int, float]] = [
(int(doc_id), float(score))
for doc_id, score in zip(ann_ids, ann_scores)
if int(doc_id) in candidate_ids
]
# Fall back to full ANN results if intersection is empty
if not vector_results:
vector_results = [
(int(doc_id), float(score))
for doc_id, score in zip(ann_ids, ann_scores)
]
return vector_results
# -- Helper: FTS search (exact + fuzzy) ------------------------------
def _fts_search(
self, query: str
) -> tuple[list[tuple[int, float]], list[tuple[int, float]]]:
"""Run exact and fuzzy full-text search."""
cfg = self._config
exact_results = self._fts.exact_search(query, top_k=cfg.fts_top_k)
fuzzy_results = self._fts.fuzzy_search(query, top_k=cfg.fts_top_k)
return exact_results, fuzzy_results
# -- Main search entry point -----------------------------------------
def search(self, query: str, top_k: int | None = None) -> list[SearchResult]:
cfg = self._config
final_top_k = top_k if top_k is not None else cfg.reranker_top_k
# 1. Detect intent -> adaptive weights
intent = detect_query_intent(query)
weights = get_adaptive_weights(intent, cfg.fusion_weights)
# 2. Embed query
query_vec = self._embedder.embed([query])[0]
# 3. Parallel vector + FTS search
vector_results: list[tuple[int, float]] = []
exact_results: list[tuple[int, float]] = []
fuzzy_results: list[tuple[int, float]] = []
with ThreadPoolExecutor(max_workers=2) as pool:
vec_future = pool.submit(self._vector_search, query_vec)
fts_future = pool.submit(self._fts_search, query)
# Collect vector results
try:
vector_results = vec_future.result()
except Exception:
_log.warning("Vector search failed, using empty results", exc_info=True)
# Collect FTS results
try:
exact_results, fuzzy_results = fts_future.result()
except Exception:
_log.warning("FTS search failed, using empty results", exc_info=True)
# 4. RRF fusion
fusion_input: dict[str, list[tuple[int, float]]] = {}
if vector_results:
fusion_input["vector"] = vector_results
if exact_results:
fusion_input["exact"] = exact_results
if fuzzy_results:
fusion_input["fuzzy"] = fuzzy_results
if not fusion_input:
return []
fused = reciprocal_rank_fusion(fusion_input, weights=weights, k=cfg.fusion_k)
# 5. Rerank top candidates
rerank_ids = [doc_id for doc_id, _ in fused[:50]]
contents = [self._fts.get_content(doc_id) for doc_id in rerank_ids]
rerank_scores = self._reranker.score_pairs(query, contents)
# 6. Sort by rerank score, build SearchResult list
ranked = sorted(
zip(rerank_ids, rerank_scores), key=lambda x: x[1], reverse=True
)
results: list[SearchResult] = []
for doc_id, score in ranked[:final_top_k]:
# NOTE: reaches into FTSEngine internals; a public path-lookup method would be cleaner
path = self._fts._conn.execute(
"SELECT path FROM docs_meta WHERE id = ?", (doc_id,)
).fetchone()
results.append(
SearchResult(
id=doc_id,
path=path[0] if path else "",
score=float(score),
snippet=self._fts.get_content(doc_id)[:200],
)
)
return results
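The coarse-then-fine funnel in `_vector_search` can be sketched with plain NumPy: binarize vectors by sign, rank by Hamming distance on the packed bits, and keep only survivors for the fine pass. The actual `BinaryStore` layout may differ; the names and shapes here are illustrative:

```python
import numpy as np

def binarize(vecs: np.ndarray) -> np.ndarray:
    # sign-binarize then pack 8 bits per byte: (n, d) float32 -> (n, d // 8) uint8
    return np.packbits((vecs > 0).astype(np.uint8), axis=-1)

def hamming_topk(query_bits: np.ndarray, db_bits: np.ndarray, top_k: int):
    # XOR then per-row popcount = Hamming distance to every stored code
    dists = np.unpackbits(np.bitwise_xor(db_bits, query_bits), axis=-1).sum(axis=-1)
    order = np.argsort(dists)[:top_k]
    return order, dists[order]

rng = np.random.default_rng(0)
db = rng.standard_normal((100, 64)).astype(np.float32)
codes = binarize(db)
query = db[42]                       # query with a stored vector...
ids, dists = hamming_topk(binarize(query[None, :])[0], codes, top_k=5)
# ...so row 42 must come back first with Hamming distance 0
```

The coarse pass is cheap (bitwise ops on 1/32 of the float storage), which is why the pipeline can afford a generous `binary_top_k` before the ANN fine search intersects it down.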

View File

@@ -0,0 +1,108 @@
import pytest
import numpy as np
import tempfile
from pathlib import Path
from codexlens.config import Config
from codexlens.core import ANNIndex, BinaryStore
from codexlens.embed.base import BaseEmbedder
from codexlens.rerank.base import BaseReranker
from codexlens.search.fts import FTSEngine
from codexlens.search.pipeline import SearchPipeline
# Test documents: 20 code snippets with id, path, content
TEST_DOCS = [
(0, "auth.py", "def authenticate(user, password): return check_hash(password, user.hash)"),
(1, "auth.py", "def authorize(user, permission): return permission in user.roles"),
(2, "models.py", "class User: def __init__(self, name, email): self.name = name; self.email = email"),
(3, "models.py", "class Session: token = None; expires_at = None"),
(4, "middleware.py", "def auth_middleware(request): token = request.headers.get('Authorization')"),
(5, "utils.py", "def hash_password(password): import bcrypt; return bcrypt.hashpw(password)"),
(6, "config.py", "DATABASE_URL = os.environ.get('DATABASE_URL', 'sqlite:///db.sqlite3')"),
(7, "search.py", "def search_users(query): return User.objects.filter(name__icontains=query)"),
(8, "api.py", "def get_user(request, user_id): user = User.objects.get(id=user_id)"),
(9, "api.py", "def create_user(request): data = request.json(); user = User(**data)"),
(10, "tests.py", "def test_authenticate(): assert authenticate('admin', 'pass') is not None"),
(11, "tests.py", "def test_search(): results = search_users('alice'); assert len(results) > 0"),
(12, "router.py", "app.route('/users', methods=['GET'])(list_users)"),
(13, "router.py", "app.route('/login', methods=['POST'])(login_handler)"),
(14, "db.py", "def get_connection(): return sqlite3.connect(DATABASE_URL)"),
(15, "cache.py", "def cache_get(key): return redis_client.get(key)"),
(16, "cache.py", "def cache_set(key, value, ttl=3600): redis_client.setex(key, ttl, value)"),
(17, "errors.py", "class AuthError(Exception): status_code = 401"),
(18, "errors.py", "class NotFoundError(Exception): status_code = 404"),
(19, "validators.py", "def validate_email(email): return '@' in email and '.' in email.split('@')[1]"),
]
DIM = 32 # Use small dim for fast tests
def make_stable_vec(doc_id: int, dim: int = DIM) -> np.ndarray:
"""Generate a deterministic float32 vector for a given doc_id."""
rng = np.random.default_rng(seed=doc_id)
vec = rng.standard_normal(dim).astype(np.float32)
vec /= np.linalg.norm(vec)
return vec
class MockEmbedder(BaseEmbedder):
"""Returns stable deterministic vectors based on content hash."""
def embed_single(self, text: str) -> np.ndarray:
seed = hash(text) % (2**31)
rng = np.random.default_rng(seed=seed)
vec = rng.standard_normal(DIM).astype(np.float32)
vec /= np.linalg.norm(vec)
return vec
def embed_batch(self, texts: list[str]) -> list[np.ndarray]:
return [self.embed_single(t) for t in texts]
def embed(self, texts: list[str]) -> list[np.ndarray]:
"""Called by SearchPipeline as self._embedder.embed([query])[0]."""
return self.embed_batch(texts)
class MockReranker(BaseReranker):
"""Returns score based on simple keyword overlap."""
def score_pairs(self, query: str, documents: list[str]) -> list[float]:
query_words = set(query.lower().split())
scores = []
for doc in documents:
doc_words = set(doc.lower().split())
overlap = len(query_words & doc_words)
scores.append(float(overlap) / max(len(query_words), 1))
return scores
@pytest.fixture
def config():
return Config.small()  # hnsw_ef=50, hnsw_M=16, binary_top_k=50, ann_top_k=20, reranker_top_k=10
@pytest.fixture
def search_pipeline(tmp_path, config):
"""Build a full SearchPipeline with 20 test docs indexed."""
embedder = MockEmbedder()
binary_store = BinaryStore(tmp_path / "binary", dim=DIM, config=config)
ann_index = ANNIndex(tmp_path / "ann.hnsw", dim=DIM, config=config)
fts = FTSEngine(tmp_path / "fts.db")
reranker = MockReranker()
# Index all test docs
ids = np.array([d[0] for d in TEST_DOCS], dtype=np.int64)
vectors = np.array([embedder.embed_single(d[2]) for d in TEST_DOCS], dtype=np.float32)
binary_store.add(ids, vectors)
ann_index.add(ids, vectors)
fts.add_documents(TEST_DOCS)
return SearchPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
reranker=reranker,
fts=fts,
config=config,
)

View File

@@ -0,0 +1,44 @@
"""Integration tests for SearchPipeline using real components and mock embedder/reranker."""
from __future__ import annotations
def test_vector_search_returns_results(search_pipeline):
results = search_pipeline.search("authentication middleware")
assert len(results) > 0
assert all(isinstance(r.score, float) for r in results)
def test_exact_keyword_search(search_pipeline):
results = search_pipeline.search("authenticate")
assert len(results) > 0
result_ids = {r.id for r in results}
# Doc 0 and 10 both contain "authenticate"
assert result_ids & {0, 10}, f"Expected doc 0 or 10 in results, got {result_ids}"
def test_pipeline_top_k_limit(search_pipeline):
results = search_pipeline.search("user", top_k=5)
assert len(results) <= 5
def test_search_result_fields_populated(search_pipeline):
results = search_pipeline.search("password")
assert len(results) > 0
for r in results:
assert r.id >= 0
assert r.score >= 0
assert isinstance(r.path, str)
def test_empty_query_handled(search_pipeline):
results = search_pipeline.search("")
assert isinstance(results, list) # no exception
def test_different_queries_give_different_results(search_pipeline):
r1 = search_pipeline.search("authenticate user")
r2 = search_pipeline.search("cache redis")
# Results should differ (different top IDs or scores), unless both are empty
ids1 = [r.id for r in r1]
ids2 = [r.id for r in r2]
assert ids1 != ids2 or len(r1) == 0

View File

@@ -0,0 +1,31 @@
from codexlens.config import Config
def test_config_instantiates_no_args():
cfg = Config()
assert cfg is not None
def test_defaults_hnsw_ef():
cfg = Config.defaults()
assert cfg.hnsw_ef == 150
def test_defaults_hnsw_M():
cfg = Config.defaults()
assert cfg.hnsw_M == 32
def test_small_hnsw_ef():
cfg = Config.small()
assert cfg.hnsw_ef == 50
def test_custom_instantiation():
cfg = Config(hnsw_ef=100)
assert cfg.hnsw_ef == 100
def test_fusion_weights_keys():
cfg = Config()
assert set(cfg.fusion_weights.keys()) == {"exact", "fuzzy", "vector", "graph"}

View File

@@ -0,0 +1,136 @@
"""Unit tests for BinaryStore and ANNIndex (no fastembed required)."""
from __future__ import annotations
import concurrent.futures
import tempfile
from pathlib import Path
import numpy as np
import pytest
from codexlens.config import Config
from codexlens.core import ANNIndex, BinaryStore
DIM = 32
RNG = np.random.default_rng(42)
def make_vectors(n: int, dim: int = DIM) -> np.ndarray:
return RNG.standard_normal((n, dim)).astype(np.float32)
def make_ids(n: int, start: int = 0) -> np.ndarray:
return np.arange(start, start + n, dtype=np.int64)
# ---------------------------------------------------------------------------
# BinaryStore tests
# ---------------------------------------------------------------------------
class TestBinaryStore:
def test_binary_store_add_and_search(self, tmp_path: Path) -> None:
cfg = Config.small()
store = BinaryStore(tmp_path, DIM, cfg)
vecs = make_vectors(10)
ids = make_ids(10)
store.add(ids, vecs)
assert len(store) == 10
top_k = 5
ret_ids, ret_dists = store.coarse_search(vecs[0], top_k=top_k)
assert ret_ids.shape == (top_k,)
assert ret_dists.shape == (top_k,)
# distances are non-negative integers
assert (ret_dists >= 0).all()
def test_binary_hamming_correctness(self, tmp_path: Path) -> None:
cfg = Config.small()
store = BinaryStore(tmp_path, DIM, cfg)
vecs = make_vectors(20)
ids = make_ids(20)
store.add(ids, vecs)
# Query with the exact stored vector; it must be the top-1 result
query = vecs[7]
ret_ids, ret_dists = store.coarse_search(query, top_k=1)
assert ret_ids[0] == 7
assert ret_dists[0] == 0 # Hamming distance to itself is 0
def test_binary_store_persist(self, tmp_path: Path) -> None:
cfg = Config.small()
store = BinaryStore(tmp_path, DIM, cfg)
vecs = make_vectors(15)
ids = make_ids(15)
store.add(ids, vecs)
store.save()
# Load into a fresh instance
store2 = BinaryStore(tmp_path, DIM, cfg)
assert len(store2) == 15
query = vecs[3]
ret_ids, ret_dists = store2.coarse_search(query, top_k=1)
assert ret_ids[0] == 3
assert ret_dists[0] == 0
# ---------------------------------------------------------------------------
# ANNIndex tests
# ---------------------------------------------------------------------------
class TestANNIndex:
def test_ann_index_add_and_search(self, tmp_path: Path) -> None:
cfg = Config.small()
idx = ANNIndex(tmp_path, DIM, cfg)
vecs = make_vectors(50)
ids = make_ids(50)
idx.add(ids, vecs)
assert len(idx) == 50
ret_ids, ret_dists = idx.fine_search(vecs[0], top_k=5)
assert len(ret_ids) == 5
assert len(ret_dists) == 5
def test_ann_index_thread_safety(self, tmp_path: Path) -> None:
cfg = Config.small()
idx = ANNIndex(tmp_path, DIM, cfg)
vecs = make_vectors(50)
ids = make_ids(50)
idx.add(ids, vecs)
query = vecs[0]
errors: list[Exception] = []
def search() -> None:
try:
idx.fine_search(query, top_k=3)
except Exception as exc:
errors.append(exc)
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
futures = [pool.submit(search) for _ in range(5)]
concurrent.futures.wait(futures)
assert errors == [], f"Thread safety errors: {errors}"
def test_ann_index_save_load(self, tmp_path: Path) -> None:
cfg = Config.small()
idx = ANNIndex(tmp_path, DIM, cfg)
vecs = make_vectors(30)
ids = make_ids(30)
idx.add(ids, vecs)
idx.save()
# Load into a fresh instance
idx2 = ANNIndex(tmp_path, DIM, cfg)
idx2.load()
assert len(idx2) == 30
ret_ids, ret_dists = idx2.fine_search(vecs[10], top_k=1)
assert len(ret_ids) == 1
assert ret_ids[0] == 10

View File

@@ -0,0 +1,80 @@
from __future__ import annotations
import sys
import types
import unittest
from unittest.mock import MagicMock, patch
import numpy as np
def _make_fastembed_mock():
"""Build a minimal fastembed stub so imports succeed without the real package."""
fastembed_mod = types.ModuleType("fastembed")
fastembed_mod.TextEmbedding = MagicMock()
sys.modules.setdefault("fastembed", fastembed_mod)
return fastembed_mod
_make_fastembed_mock()
from codexlens.config import Config # noqa: E402
from codexlens.embed.base import BaseEmbedder # noqa: E402
from codexlens.embed.local import EMBED_PROFILES, FastEmbedEmbedder # noqa: E402
class TestEmbedSingle(unittest.TestCase):
def test_embed_single_returns_float32_ndarray(self):
config = Config()
embedder = FastEmbedEmbedder(config)
mock_model = MagicMock()
mock_model.embed.return_value = iter([np.ones(384, dtype=np.float64)])
# Inject mock model directly to bypass lazy load (no real fastembed needed)
embedder._model = mock_model
result = embedder.embed_single("hello world")
self.assertIsInstance(result, np.ndarray)
self.assertEqual(result.dtype, np.float32)
self.assertEqual(result.shape, (384,))
class TestEmbedBatch(unittest.TestCase):
def test_embed_batch_returns_list(self):
config = Config()
embedder = FastEmbedEmbedder(config)
vecs = [np.ones(384, dtype=np.float64) * i for i in range(3)]
mock_model = MagicMock()
mock_model.embed.return_value = iter(vecs)
embedder._model = mock_model
result = embedder.embed_batch(["a", "b", "c"])
self.assertIsInstance(result, list)
self.assertEqual(len(result), 3)
for arr in result:
self.assertIsInstance(arr, np.ndarray)
self.assertEqual(arr.dtype, np.float32)
class TestEmbedProfiles(unittest.TestCase):
def test_embed_profiles_all_have_valid_keys(self):
expected_keys = {"small", "base", "large", "code"}
self.assertEqual(set(EMBED_PROFILES.keys()), expected_keys)
def test_embed_profiles_model_ids_non_empty(self):
for key, model_id in EMBED_PROFILES.items():
self.assertIsInstance(model_id, str, msg=f"{key} model id should be str")
self.assertTrue(len(model_id) > 0, msg=f"{key} model id should be non-empty")
class TestBaseEmbedderAbstract(unittest.TestCase):
def test_base_embedder_is_abstract(self):
with self.assertRaises(TypeError):
BaseEmbedder() # type: ignore[abstract]
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,179 @@
from __future__ import annotations
import types
from unittest.mock import MagicMock, patch
import pytest
from codexlens.config import Config
from codexlens.rerank.base import BaseReranker
from codexlens.rerank.local import FastEmbedReranker
from codexlens.rerank.api import APIReranker
# ---------------------------------------------------------------------------
# BaseReranker
# ---------------------------------------------------------------------------
def test_base_reranker_is_abstract():
with pytest.raises(TypeError):
BaseReranker() # type: ignore[abstract]
# ---------------------------------------------------------------------------
# FastEmbedReranker
# ---------------------------------------------------------------------------
def _make_rerank_result(index: int, score: float) -> object:
obj = types.SimpleNamespace(index=index, score=score)
return obj
def test_local_reranker_score_pairs_length():
config = Config()
reranker = FastEmbedReranker(config)
mock_results = [
_make_rerank_result(0, 0.9),
_make_rerank_result(1, 0.5),
_make_rerank_result(2, 0.1),
]
mock_model = MagicMock()
mock_model.rerank.return_value = iter(mock_results)
reranker._model = mock_model
docs = ["doc0", "doc1", "doc2"]
scores = reranker.score_pairs("query", docs)
assert len(scores) == 3
def test_local_reranker_preserves_order():
config = Config()
reranker = FastEmbedReranker(config)
# rerank returns results in reverse order (index 2, 1, 0)
mock_results = [
_make_rerank_result(2, 0.1),
_make_rerank_result(1, 0.5),
_make_rerank_result(0, 0.9),
]
mock_model = MagicMock()
mock_model.rerank.return_value = iter(mock_results)
reranker._model = mock_model
docs = ["doc0", "doc1", "doc2"]
scores = reranker.score_pairs("query", docs)
assert scores[0] == pytest.approx(0.9)
assert scores[1] == pytest.approx(0.5)
assert scores[2] == pytest.approx(0.1)
# ---------------------------------------------------------------------------
# APIReranker
# ---------------------------------------------------------------------------
def _make_config(max_tokens_per_batch: int = 512) -> Config:
return Config(
reranker_api_url="https://api.example.com",
reranker_api_key="test-key",
reranker_api_model="test-model",
reranker_api_max_tokens_per_batch=max_tokens_per_batch,
)
def test_api_reranker_batch_splitting():
config = _make_config(max_tokens_per_batch=512)
with patch("httpx.Client"):
reranker = APIReranker(config)
# 10 docs, each ~200 tokens (800 chars)
docs = ["x" * 800] * 10
batches = reranker._split_batches(docs, max_tokens=512)
# Each doc is 200 tokens; batches should have at most 2 docs (200+200=400 <= 512, 400+200=600 > 512)
assert len(batches) > 1
for batch in batches:
total = sum(len(text) // 4 for _, text in batch)
assert total <= 512 or len(batch) == 1
def test_api_reranker_retry_on_429():
config = _make_config()
mock_429 = MagicMock()
mock_429.status_code = 429
mock_200 = MagicMock()
mock_200.status_code = 200
mock_200.json.return_value = {
"results": [
{"index": 0, "relevance_score": 0.8},
{"index": 1, "relevance_score": 0.3},
]
}
mock_200.raise_for_status = MagicMock()
with patch("httpx.Client") as mock_client_cls:
mock_client = MagicMock()
mock_client_cls.return_value = mock_client
mock_client.post.side_effect = [mock_429, mock_429, mock_200]
reranker = APIReranker(config)
with patch("time.sleep"):
result = reranker._call_api_with_retry(
"query",
[(0, "doc0"), (1, "doc1")],
max_retries=3,
)
assert mock_client.post.call_count == 3
assert 0 in result
assert 1 in result
def test_api_reranker_merge_batches():
config = _make_config(max_tokens_per_batch=100)
    # 4 docs of 200 chars (~50 tokens) each: 50+50=100 <= 100 but 100+50=150 > 100,
    # so the splitter should produce 2 batches of 2 docs each
    docs = ["x" * 200] * 4
batch0_response = MagicMock()
batch0_response.status_code = 200
batch0_response.json.return_value = {
"results": [
{"index": 0, "relevance_score": 0.9},
{"index": 1, "relevance_score": 0.8},
]
}
batch0_response.raise_for_status = MagicMock()
batch1_response = MagicMock()
batch1_response.status_code = 200
batch1_response.json.return_value = {
"results": [
{"index": 0, "relevance_score": 0.7},
{"index": 1, "relevance_score": 0.6},
]
}
batch1_response.raise_for_status = MagicMock()
with patch("httpx.Client") as mock_client_cls:
mock_client = MagicMock()
mock_client_cls.return_value = mock_client
mock_client.post.side_effect = [batch0_response, batch1_response]
reranker = APIReranker(config)
with patch("time.sleep"):
scores = reranker.score_pairs("query", docs)
assert len(scores) == 4
# All original indices should have scores
assert all(s > 0 for s in scores)

@@ -0,0 +1,156 @@
"""Unit tests for search layer: FTSEngine, fusion, and SearchPipeline."""
from __future__ import annotations
from unittest.mock import MagicMock
import pytest
from codexlens.search.fts import FTSEngine
from codexlens.search.fusion import (
DEFAULT_WEIGHTS,
QueryIntent,
detect_query_intent,
get_adaptive_weights,
reciprocal_rank_fusion,
)
from codexlens.search.pipeline import SearchPipeline, SearchResult
from codexlens.config import Config
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def make_fts(docs: list[tuple[int, str, str]] | None = None) -> FTSEngine:
"""Create an in-memory FTSEngine and optionally add documents."""
engine = FTSEngine(":memory:")
if docs:
engine.add_documents(docs)
return engine
# ---------------------------------------------------------------------------
# FTSEngine tests
# ---------------------------------------------------------------------------
def test_fts_add_and_exact_search():
docs = [
(1, "a.py", "def authenticate user password login"),
(2, "b.py", "connect to database with credentials"),
(3, "c.py", "render template html response"),
]
engine = make_fts(docs)
results = engine.exact_search("authenticate", top_k=10)
ids = [r[0] for r in results]
assert 1 in ids, "doc 1 should match 'authenticate'"
    assert 2 not in ids or results[0][0] == 1  # doc 2 has no match; even if it appears, doc 1 must rank first
def test_fts_fuzzy_search_prefix():
docs = [
(10, "auth.py", "authentication token refresh"),
(11, "db.py", "database connection pool"),
(12, "ui.py", "render button click handler"),
]
engine = make_fts(docs)
# Prefix 'auth' should match 'authentication' in doc 10
results = engine.fuzzy_search("auth", top_k=10)
ids = [r[0] for r in results]
assert 10 in ids, "prefix 'auth' should match doc 10 with 'authentication'"
# ---------------------------------------------------------------------------
# RRF fusion tests
# ---------------------------------------------------------------------------
def test_rrf_fusion_ordering():
"""When two sources agree on top-1, it should rank first in fused result."""
source_a = [(1, 0.9), (2, 0.5), (3, 0.2)]
source_b = [(1, 0.8), (3, 0.6), (2, 0.1)]
fused = reciprocal_rank_fusion({"a": source_a, "b": source_b})
    assert fused[0][0] == 1, "doc 1, ranked top by both sources, must come first"
def test_rrf_equal_weight_default():
"""Calling with None weights should use DEFAULT_WEIGHTS shape (not crash)."""
source_exact = [(5, 1.0), (6, 0.8)]
source_vector = [(6, 0.9), (5, 0.7)]
# Should not raise and should return results
fused = reciprocal_rank_fusion(
{"exact": source_exact, "vector": source_vector},
weights=None,
)
assert len(fused) == 2
ids = [r[0] for r in fused]
assert 5 in ids and 6 in ids
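For reference, RRF scores each document as the sum of `weight / (k + rank)` over the sources that ranked it; a minimal sketch assuming k=60 and equal weights (the real `reciprocal_rank_fusion` signature may differ):

```python
def rrf_sketch(
    sources: dict[str, list[tuple[int, float]]], k: int = 60
) -> list[tuple[int, float]]:
    """Fuse (doc_id, score) rankings by summing 1 / (k + rank) per source."""
    fused: dict[int, float] = {}
    for ranking in sources.values():
        # Raw scores are ignored; only each document's rank position matters.
        for rank, (doc_id, _score) in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Because only ranks contribute, a document that both sources place first always wins, which is exactly what `test_rrf_fusion_ordering` asserts.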
# ---------------------------------------------------------------------------
# detect_query_intent tests
# ---------------------------------------------------------------------------
def test_detect_intent_code_symbol():
assert detect_query_intent("def authenticate()") == QueryIntent.CODE_SYMBOL
def test_detect_intent_natural():
assert detect_query_intent("how do I authenticate users") == QueryIntent.NATURAL_LANGUAGE
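One plausible heuristic behind `detect_query_intent` (an assumption — the shipped rules may be richer): treat code-like punctuation or keywords as a symbol query and everything else as natural language:

```python
import re

# Hypothetical string labels standing in for the QueryIntent enum members
CODE_SYMBOL = "code_symbol"
NATURAL_LANGUAGE = "natural_language"


def intent_sketch(query: str) -> str:
    """Classify a query as code-symbol-like or natural language."""
    # Brackets, snake_case underscores, scope/arrow operators, or definition
    # keywords all suggest the user pasted a code symbol.
    code_pattern = re.compile(r"[(){}\[\]_]|::|->|\b(def|class|fn|func)\b")
    return CODE_SYMBOL if code_pattern.search(query) else NATURAL_LANGUAGE
```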
# ---------------------------------------------------------------------------
# SearchPipeline tests
# ---------------------------------------------------------------------------
def _make_pipeline(fts: FTSEngine, top_k: int = 5) -> SearchPipeline:
"""Build a SearchPipeline with mocked heavy components."""
cfg = Config.small()
cfg.reranker_top_k = top_k
embedder = MagicMock()
embedder.embed.return_value = [[0.1] * cfg.embed_dim]
binary_store = MagicMock()
binary_store.coarse_search.return_value = ([1, 2, 3], None)
ann_index = MagicMock()
ann_index.fine_search.return_value = ([1, 2, 3], [0.9, 0.8, 0.7])
reranker = MagicMock()
# Return a score for each content string passed
reranker.score_pairs.side_effect = lambda q, contents: [0.9 - i * 0.1 for i in range(len(contents))]
return SearchPipeline(
embedder=embedder,
binary_store=binary_store,
ann_index=ann_index,
reranker=reranker,
fts=fts,
config=cfg,
)
def test_pipeline_search_returns_results():
docs = [
(1, "a.py", "test content alpha"),
(2, "b.py", "test content beta"),
(3, "c.py", "test content gamma"),
]
fts = make_fts(docs)
pipeline = _make_pipeline(fts)
results = pipeline.search("test")
assert len(results) > 0
assert all(isinstance(r, SearchResult) for r in results)
def test_pipeline_top_k_limit():
docs = [
(1, "a.py", "hello world one"),
(2, "b.py", "hello world two"),
(3, "c.py", "hello world three"),
(4, "d.py", "hello world four"),
(5, "e.py", "hello world five"),
]
fts = make_fts(docs)
pipeline = _make_pipeline(fts, top_k=2)
results = pipeline.search("hello", top_k=2)
assert len(results) <= 2, "pipeline must respect top_k limit"