diff --git a/.claude/skills/command-generator/SKILL.md b/.claude/skills/command-generator/SKILL.md
index c9e068d1..28e084f7 100644
--- a/.claude/skills/command-generator/SKILL.md
+++ b/.claude/skills/command-generator/SKILL.md
@@ -1,190 +1,307 @@
 ---
 name: command-generator
 description: Command file generator - 5 phase workflow for creating Claude Code command files with YAML frontmatter. Generates .md command files for project or user scope. Triggers on "create command", "new command", "command generator".
-allowed-tools: Read, Write, Edit, Bash, Glob
+allowed-tools: Read, Write, Edit, Bash, Glob, AskUserQuestion
 ---
-# Command Generator
+
+Generate Claude Code command .md files with concrete, domain-specific content in GSD style. Produces command files at project-level (`.claude/commands/`) or user-level (`~/.claude/commands/`) with YAML frontmatter, XML semantic tags (`<objective>`, `<required_reading>`, `<execution>`, `<error_codes>`, `<success_criteria>`), and actionable execution logic — NOT empty placeholders.
-CLI-based command file generator producing Claude Code command .md files through a structured 5-phase workflow. Supports both project-level (`.claude/commands/`) and user-level (`~/.claude/commands/`) command locations.
+Invoked when user requests "create command", "new command", or "command generator". 
+ -## Architecture Overview + +- @.claude/skills/command-generator/specs/command-design-spec.md +- @.claude/skills/command-generator/templates/command-md.md + + + + + +**Parse and validate all input parameters.** + +Extract from `$ARGUMENTS` or skill args: + +| Parameter | Required | Validation | Example | +|-----------|----------|------------|---------| +| `$SKILL_NAME` | Yes | `/^[a-z][a-z0-9-]*$/`, min 1 char | `deploy`, `create` | +| `$DESCRIPTION` | Yes | min 10 chars | `"Deploy application to production"` | +| `$LOCATION` | Yes | `"project"` or `"user"` | `project` | +| `$GROUP` | No | `/^[a-z][a-z0-9-]*$/` or null | `issue`, `workflow` | +| `$ARGUMENT_HINT` | No | any string or empty | `" [--priority 1-5]"` | + +**Validation rules:** +- Missing required param → Error with specific message (e.g., `"skillName is required"`) +- Invalid `$SKILL_NAME` pattern → Error: `"skillName must be lowercase alphanumeric with hyphens, starting with a letter"` +- Invalid `$LOCATION` → Error: `"location must be 'project' or 'user'"` +- Invalid `$GROUP` pattern → Warning, continue + +**Normalize:** trim + lowercase for `$SKILL_NAME`, `$LOCATION`, `$GROUP`. 
+ + + +**Resolve target file path based on location and group.** + +Path mapping: + +| Location | Base Directory | +|----------|---------------| +| `project` | `.claude/commands` | +| `user` | `~/.claude/commands` (expand `~` to `$HOME`) | + +Path construction: ``` -+-----------------------------------------------------------+ -| Command Generator | -| | -| Input: skillName, description, location, [group], [hint] | -| | | -| +-------------------------------------------------+ | -| | Phase 1-5: Sequential Pipeline | | -| | | | -| | [P1] --> [P2] --> [P3] --> [P4] --> [P5] | | -| | Param Target Template Content File | | -| | Valid Path Loading Format Gen | | -| +-------------------------------------------------+ | -| | | -| Output: {scope}/.claude/commands/{group}/{name}.md | -| | -+-----------------------------------------------------------+ +If $GROUP: + $TARGET_DIR = {base}/{$GROUP} + $TARGET_PATH = {base}/{$GROUP}/{$SKILL_NAME}.md +Else: + $TARGET_DIR = {base} + $TARGET_PATH = {base}/{$SKILL_NAME}.md ``` -## Key Design Principles +Check if `$TARGET_PATH` already exists → store as `$FILE_EXISTS` (true/false). + -1. **Single Responsibility**: Generates one command file per invocation -2. **Scope Awareness**: Supports project and user-level command locations -3. **Template-Driven**: Uses consistent template for all generated commands -4. **Validation First**: Validates all required parameters before file operations -5. 
**Non-Destructive**: Warns if command file already exists + +**Gather domain-specific requirements to generate concrete content.** +Infer the command's domain from `$SKILL_NAME`, `$DESCRIPTION`, and `$ARGUMENT_HINT`: + +| Signal | Extract | +|--------|---------| +| `$SKILL_NAME` | Action verb (deploy, create, analyze, sync) → step naming | +| `$DESCRIPTION` | Domain keywords → execution logic, error scenarios | +| `$ARGUMENT_HINT` | Flags/args → parse_input step details, validation rules | +| `$GROUP` | Command family → related commands, shared patterns | + +**Determine command complexity:** + +| Complexity | Criteria | Steps to Generate | +|------------|----------|-------------------| +| Simple | Single action, no flags | 2-3 steps | +| Standard | 1-2 flags, clear workflow | 3-4 steps | +| Complex | Multiple flags, multi-phase | 4-6 steps | + +**If complexity is unclear**, ask user: + +``` +AskUserQuestion( + header: "Command Scope", + question: "What are the main execution steps for this command?", + options: [ + { label: "Simple", description: "Single action: validate → execute → report" }, + { label: "Standard", description: "Multi-step: parse → process → verify → report" }, + { label: "Complex", description: "Full workflow: parse → explore → execute → verify → report" }, + { label: "I'll describe", description: "Let me specify the steps" } + ] +) +``` + +Store as `$COMMAND_STEPS`, `$ERROR_SCENARIOS`, `$SUCCESS_CONDITIONS`. + + + +**Generate concrete, domain-specific command content in GSD style.** + +This is the core generation step. Draft the COMPLETE command file — not a template with placeholders — using the gathered requirements. 
+ +**YAML Frontmatter:** +```yaml --- - -## Execution Flow - -``` -Phase 1: Parameter Validation - - Ref: phases/01-parameter-validation.md - - Validate: skillName (required), description (required), location (required) - - Optional: group, argumentHint - - Output: validated params object - -Phase 2: Target Path Resolution - - Ref: phases/02-target-path-resolution.md - - Resolve: location -> target commands directory - - Support: project (.claude/commands/) vs user (~/.claude/commands/) - - Handle: group subdirectory if provided - - Output: targetPath string - -Phase 3: Template Loading - - Ref: phases/03-template-loading.md - - Load: templates/command-md.md - - Template contains YAML frontmatter with placeholders - - Output: templateContent string - -Phase 4: Content Formatting - - Ref: phases/04-content-formatting.md - - Substitute: {{name}}, {{description}}, {{group}}, {{argumentHint}} - - Handle: optional fields (group, argumentHint) - - Output: formattedContent string - -Phase 5: File Generation - - Ref: phases/05-file-generation.md - - Check: file existence (warn if exists) - - Write: formatted content to target path - - Output: success confirmation with file path -``` - -## Usage Examples - -### Basic Command (Project Scope) -```javascript -Skill(skill="command-generator", args={ - skillName: "deploy", - description: "Deploy application to production environment", - location: "project" -}) -// Output: .claude/commands/deploy.md -``` - -### Grouped Command with Argument Hint -```javascript -Skill(skill="command-generator", args={ - skillName: "create", - description: "Create new issue from GitHub URL or text", - location: "project", - group: "issue", - argumentHint: "[-y|--yes] [--priority 1-5]" -}) -// Output: .claude/commands/issue/create.md -``` - -### User-Level Command -```javascript -Skill(skill="command-generator", args={ - skillName: "global-status", - description: "Show global Claude Code status", - location: "user" -}) -// Output: 
~/.claude/commands/global-status.md
-```
-
+name: $SKILL_NAME
+description: $DESCRIPTION
+argument-hint: $ARGUMENT_HINT  # only if provided
+---
+```

-## Reference Documents by Phase
+**`<objective>` section:** Write 2-3 sentences describing:
+- What the command does (action + target)
+- When it's invoked (trigger conditions)
+- What it produces (output artifacts or effects)

-### Phase 1: Parameter Validation
-| Document | Purpose | When to Use |
-|----------|---------|-------------|
-| [phases/01-parameter-validation.md](phases/01-parameter-validation.md) | Validate required parameters | Phase 1 execution |
+**`<required_reading>` section:** Infer from domain:
+- If command reads config → `@.claude/CLAUDE.md` or relevant config files
+- If command modifies code → relevant source directories
+- If command is part of a group → other commands in the same group

-### Phase 2: Target Path Resolution
-| Document | Purpose | When to Use |
-|----------|---------|-------------|
-| [phases/02-target-path-resolution.md](phases/02-target-path-resolution.md) | Resolve target directory | Phase 2 execution |
+**`<execution>` section with `<step>` blocks:**

-### Phase 3: Template Loading
-| Document | Purpose | When to Use |
-|----------|---------|-------------|
-| [phases/03-template-loading.md](phases/03-template-loading.md) | Load command template | Phase 3 execution |
-| [templates/command-md.md](templates/command-md.md) | Command file template | Template reference |
+For each step in `$COMMAND_STEPS`, generate a `<step>` block containing:

-### Phase 4: Content Formatting
-| Document | Purpose | When to Use |
-|----------|---------|-------------|
-| [phases/04-content-formatting.md](phases/04-content-formatting.md) | Format content with params | Phase 4 execution |

+1. 
**`parse_input`** (always first, `priority="first"`): + - Parse `$ARGUMENTS` for flags and positional args derived from `$ARGUMENT_HINT` + - Include specific flag detection logic (e.g., `if arguments contain "--env"`) + - Include validation with specific error messages + - Include decision routing table if multiple modes exist -### Phase 5: File Generation -| Document | Purpose | When to Use | -|----------|---------|-------------| -| [phases/05-file-generation.md](phases/05-file-generation.md) | Write final file | Phase 5 execution | +2. **Domain-specific execution steps** (2-4 steps): + - Each step has a **bold action description** + - Include concrete shell commands, file operations, or tool calls + - Use `$UPPER_CASE` variables for user input, `${computed}` for derived values + - Include conditional logic with specific conditions (not generic) + - Reference actual file paths and tool names -### Design Specifications -| Document | Purpose | When to Use | -|----------|---------|-------------| -| [specs/command-design-spec.md](specs/command-design-spec.md) | Command design guidelines | Understanding best practices | +3. **`report`** (always last): + - Format output with banner and status + - Include file paths, timestamps, next step suggestions ---- +**Shell Correctness Checklist (MANDATORY for every shell block):** -## Output Structure +| Rule | Wrong | Correct | +|------|-------|---------| +| Multi-line output | `echo "{ ... }"` (unquoted multi-line) | `cat <<'EOF' > file`...`EOF` (heredoc) | +| Variable init | Use `$VAR` after conditional | `VAR="default"` BEFORE any conditional that sets it | +| Error exit | `echo "Error: ..."` (no exit) | `echo "Error: ..." 
# (see code: E00X)` + `exit 1` |
+| Quoting | `$VAR` in commands | `"$VAR"` (double-quoted in all expansions) |
+| Exit on fail | Command chain without checks | `set -e` or explicit `\|\| { echo "Failed"; exit 1; }` |
+| Command from var | `$CMD --flag` (word-split fragile) | `eval "$CMD" --flag` or use array: `cmd=(...); "${cmd[@]}"` |
+| Prerequisites | Implicit `git`/`curl` usage | Declare in `<prerequisites>` section |

-### Generated Command File
+**Golden Example — a correctly-written execution step:**
```markdown
----
-name: {skillName}
-description: {description}
-{group} {argumentHint}
----
+
+**Execute deployment to target environment.**

-# {skillName} Command
+DEPLOY_STATUS="pending"  # Initialize before conditional

-## Overview
-{Auto-generated placeholder for command overview}
+```bash
+set -o pipefail  # so the if-check below sees deploy_cmd's status, not tee's
+# Save current state for rollback
+cp .deploy/latest.json .deploy/previous.json 2>/dev/null || true

-## Usage
-{Auto-generated placeholder for usage examples}
+# Write deployment manifest via heredoc
+cat <<EOF > .deploy/latest.json
+{
+  "env": "$ENV",
+  "tag": "$DEPLOY_TAG",
+  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
+  "commit": "$(git rev-parse --short HEAD)",
+  "status": "deploying"
+}
+EOF

-## Execution Flow
-{Auto-generated placeholder for execution steps}
+# Execute deployment
+if ! 
deploy_cmd --env "$ENV" --tag "$DEPLOY_TAG" 2>&1 | tee .deploy/latest.log; then
+  echo "Error: Deployment to $ENV failed"  # (see code: E004)
+  exit 1
+fi
+
+DEPLOY_STATUS="success"
```

---
+| Condition | Action |
+|-----------|--------|
+| Deploy succeeds | Update status → `"deployed"`, continue to verify |
+| Deploy fails | Log error `# (see code: E004)`, exit 1 |
+| `$ROLLBACK_MODE` | Load `.deploy/previous.json`, redeploy prior version |
+
+```

-## Error Handling
+**Optional `<prerequisites>` section** (include when command uses external tools):

-| Error | Stage | Action |
-|-------|-------|--------|
-| Missing skillName | Phase 1 | Error: "skillName is required" |
-| Missing description | Phase 1 | Error: "description is required" |
-| Missing location | Phase 1 | Error: "location is required (project or user)" |
-| Invalid location | Phase 2 | Error: "location must be 'project' or 'user'" |
-| Template not found | Phase 3 | Error: "Command template not found" |
-| File exists | Phase 5 | Warning: "Command file already exists, will overwrite" |
-| Write failure | Phase 5 | Error: "Failed to write command file" |
+```markdown
+<prerequisites>
+- git (2.20+) — version control operations
+- curl — health check endpoints
+- jq — JSON processing (optional)
+</prerequisites>
+```

---
+**`<error_codes>` table:** Generate 3-6 specific error codes:
+- Derive from `$ARGUMENT_HINT` validation failures (E001-E003)
+- Derive from domain-specific failure modes (E004+)
+- Include 1-2 warnings (W001+)
+- Each code has: Code, Severity, Description, **Stage** (which step triggers it)
+- **Cross-reference rule**: Every `# (see code: E00X)` comment in `<execution>` MUST have a matching row in `<error_codes>`, and every error code row MUST be referenced by at least one inline comment

-## Related Skills
+**`<success_criteria>` checkboxes:** Generate 4-8 verifiable conditions:
+- Input validation passed
+- Each execution step completed its action
+- Output artifacts exist / effects applied
+- Report displayed

-- **skill-generator**: Create complete skills with phases, templates, 
and specs
-- **flow-coordinator**: Orchestrate multi-step command workflows
+**Quality rules for generated content:**
+- NO bracket placeholders (`[Describe...]`, `[List...]`) — all content must be concrete
+- Steps must contain actionable logic, not descriptions of what to do
+- Error codes must reference specific failure conditions from this command's domain
+- Success criteria must be verifiable (not "command works correctly")
+- Every shell block must pass the Shell Correctness Checklist above
+- Follow patterns from @.claude/skills/command-generator/templates/command-md.md for structural reference only
+
+
+
+**Write generated content to target path.**
+
+**If `$FILE_EXISTS`:** Warn: `"Command file already exists at {path}. Will overwrite."`
+
+```bash
+mkdir -p "$TARGET_DIR"
+```
+
+Write the drafted content to `$TARGET_PATH` using Write tool.
+
+**Verify:** Read back the file and confirm:
+- File exists and is non-empty
+- Contains `<objective>` tag with concrete content (no placeholders)
+- Contains at least 2 `<step>` blocks
+- Contains an `<error_codes>` table with at least 3 rows including Stage column
+- Contains `<success_criteria>` with at least 4 checkboxes
+- No unresolved `{{...}}` or `[...]` placeholders remain
+- Every `# (see code: E0XX)` has a matching `<error_codes>` row (cross-ref check)
+- Every shell block uses heredoc for multi-line output (no bare multi-line echo)
+- All state variables initialized before conditional use
+- All error paths include `exit 1` after error message
+
+**If verification fails:** Fix the content in-place using Edit tool.
+
+**Report completion:**
+
+```
+Command generated successfully!
+
+File: {$TARGET_PATH}
+Name: {$SKILL_NAME}
+Description: {$DESCRIPTION}
+Location: {$LOCATION}
+Group: {$GROUP or "(none)"}
+Steps: {number of <step> blocks generated}
+Error codes: {number of error codes}
+
+Next Steps:
+1. Review and customize {$TARGET_PATH}
+2. 
Test: /{$GROUP}:{$SKILL_NAME} or /{$SKILL_NAME}
+```
+
+
+
+
+
+| Code | Severity | Description | Stage |
+|------|----------|-------------|-------|
+| E001 | error | skillName is required | validate_params |
+| E002 | error | description is required (min 10 chars) | validate_params |
+| E003 | error | location is required ("project" or "user") | validate_params |
+| E004 | error | skillName must be lowercase alphanumeric with hyphens | validate_params |
+| E005 | error | Failed to infer command domain from description | gather_requirements |
+| E006 | error | Failed to write command file | write_file |
+| E007 | error | Generated content contains unresolved placeholders | write_file |
+| W001 | warning | group must be lowercase alphanumeric with hyphens | validate_params |
+| W002 | warning | Command file already exists, will overwrite | write_file |
+| W003 | warning | Could not infer required_reading, using defaults | draft_content |
+
+
+
+
+- [ ] All required parameters validated ($SKILL_NAME, $DESCRIPTION, $LOCATION)
+- [ ] Target path resolved with correct scope (project vs user) and group
+- [ ] Command domain inferred from description and argument hint
+- [ ] Concrete `<objective>` drafted (no placeholders)
+- [ ] 2-6 `<step>` blocks generated with domain-specific logic
+- [ ] `<error_codes>` table generated with 3+ specific codes
+- [ ] `<success_criteria>` generated with 4+ verifiable checkboxes
+- [ ] File written to $TARGET_PATH and verified
+- [ ] Zero bracket placeholders in final output
+- [ ] Completion report displayed
+
diff --git a/.claude/skills/command-generator/phases/01-parameter-validation.md b/.claude/skills/command-generator/phases/01-parameter-validation.md
deleted file mode 100644
index 7c22ccfa..00000000
--- a/.claude/skills/command-generator/phases/01-parameter-validation.md
+++ /dev/null
@@ -1,174 +0,0 @@
-# Phase 1: Parameter Validation
-
-Validate all required parameters for command generation. 
- -## Objective - -Ensure all required parameters are provided before proceeding with command generation: -- **skillName**: Command identifier (required) -- **description**: Command description (required) -- **location**: Target scope - "project" or "user" (required) -- **group**: Optional grouping subdirectory -- **argumentHint**: Optional argument hint string - -## Input - -Parameters received from skill invocation: -- `skillName`: string (required) -- `description`: string (required) -- `location`: "project" | "user" (required) -- `group`: string (optional) -- `argumentHint`: string (optional) - -## Validation Rules - -### Required Parameters - -```javascript -const requiredParams = { - skillName: { - type: 'string', - minLength: 1, - pattern: /^[a-z][a-z0-9-]*$/, // lowercase, alphanumeric, hyphens - error: 'skillName must be lowercase alphanumeric with hyphens, starting with a letter' - }, - description: { - type: 'string', - minLength: 10, - error: 'description must be at least 10 characters' - }, - location: { - type: 'string', - enum: ['project', 'user'], - error: 'location must be "project" or "user"' - } -}; -``` - -### Optional Parameters - -```javascript -const optionalParams = { - group: { - type: 'string', - pattern: /^[a-z][a-z0-9-]*$/, - default: null, - error: 'group must be lowercase alphanumeric with hyphens' - }, - argumentHint: { - type: 'string', - default: '', - error: 'argumentHint must be a string' - } -}; -``` - -## Execution Steps - -### Step 1: Extract Parameters - -```javascript -// Extract from skill args -const params = { - skillName: args.skillName, - description: args.description, - location: args.location, - group: args.group || null, - argumentHint: args.argumentHint || '' -}; -``` - -### Step 2: Validate Required Parameters - -```javascript -function validateRequired(params, rules) { - const errors = []; - - for (const [key, rule] of Object.entries(rules)) { - const value = params[key]; - - // Check existence - if (value === 
undefined || value === null || value === '') { - errors.push(`${key} is required`); - continue; - } - - // Check type - if (typeof value !== rule.type) { - errors.push(`${key} must be a ${rule.type}`); - continue; - } - - // Check minLength - if (rule.minLength && value.length < rule.minLength) { - errors.push(`${key} must be at least ${rule.minLength} characters`); - } - - // Check pattern - if (rule.pattern && !rule.pattern.test(value)) { - errors.push(rule.error); - } - - // Check enum - if (rule.enum && !rule.enum.includes(value)) { - errors.push(`${key} must be one of: ${rule.enum.join(', ')}`); - } - } - - return errors; -} - -const requiredErrors = validateRequired(params, requiredParams); -if (requiredErrors.length > 0) { - throw new Error(`Validation failed:\n${requiredErrors.join('\n')}`); -} -``` - -### Step 3: Validate Optional Parameters - -```javascript -function validateOptional(params, rules) { - const warnings = []; - - for (const [key, rule] of Object.entries(rules)) { - const value = params[key]; - - if (value !== null && value !== undefined && value !== '') { - if (rule.pattern && !rule.pattern.test(value)) { - warnings.push(`${key}: ${rule.error}`); - } - } - } - - return warnings; -} - -const optionalWarnings = validateOptional(params, optionalParams); -// Log warnings but continue -``` - -### Step 4: Normalize Parameters - -```javascript -const validatedParams = { - skillName: params.skillName.trim().toLowerCase(), - description: params.description.trim(), - location: params.location.trim().toLowerCase(), - group: params.group ? params.group.trim().toLowerCase() : null, - argumentHint: params.argumentHint ? params.argumentHint.trim() : '' -}; -``` - -## Output - -```javascript -{ - status: 'validated', - params: validatedParams, - warnings: optionalWarnings -} -``` - -## Next Phase - -Proceed to [Phase 2: Target Path Resolution](02-target-path-resolution.md) with `validatedParams`. 
diff --git a/.claude/skills/command-generator/phases/02-target-path-resolution.md b/.claude/skills/command-generator/phases/02-target-path-resolution.md deleted file mode 100644 index 259660e5..00000000 --- a/.claude/skills/command-generator/phases/02-target-path-resolution.md +++ /dev/null @@ -1,171 +0,0 @@ -# Phase 2: Target Path Resolution - -Resolve the target commands directory based on location parameter. - -## Objective - -Determine the correct target path for the command file based on: -- **location**: "project" or "user" scope -- **group**: Optional subdirectory for command organization -- **skillName**: Command filename (with .md extension) - -## Input - -From Phase 1 validation: -```javascript -{ - skillName: string, // e.g., "create" - description: string, - location: "project" | "user", - group: string | null, // e.g., "issue" - argumentHint: string -} -``` - -## Path Resolution Rules - -### Location Mapping - -```javascript -const locationMap = { - project: '.claude/commands', - user: '~/.claude/commands' // Expands to user home directory -}; -``` - -### Path Construction - -```javascript -function resolveTargetPath(params) { - const baseDir = locationMap[params.location]; - - if (!baseDir) { - throw new Error(`Invalid location: ${params.location}. Must be "project" or "user".`); - } - - // Expand ~ to user home if present - const expandedBase = baseDir.startsWith('~') - ? 
path.join(os.homedir(), baseDir.slice(1)) - : baseDir; - - // Build full path - let targetPath; - if (params.group) { - // Grouped command: .claude/commands/{group}/{skillName}.md - targetPath = path.join(expandedBase, params.group, `${params.skillName}.md`); - } else { - // Top-level command: .claude/commands/{skillName}.md - targetPath = path.join(expandedBase, `${params.skillName}.md`); - } - - return targetPath; -} -``` - -## Execution Steps - -### Step 1: Get Base Directory - -```javascript -const location = validatedParams.location; -const baseDir = locationMap[location]; - -if (!baseDir) { - throw new Error(`Invalid location: ${location}. Must be "project" or "user".`); -} -``` - -### Step 2: Expand User Path (if applicable) - -```javascript -const os = require('os'); -const path = require('path'); - -let expandedBase = baseDir; -if (baseDir.startsWith('~')) { - expandedBase = path.join(os.homedir(), baseDir.slice(1)); -} -``` - -### Step 3: Construct Full Path - -```javascript -let targetPath; -let targetDir; - -if (validatedParams.group) { - // Command with group subdirectory - targetDir = path.join(expandedBase, validatedParams.group); - targetPath = path.join(targetDir, `${validatedParams.skillName}.md`); -} else { - // Top-level command - targetDir = expandedBase; - targetPath = path.join(targetDir, `${validatedParams.skillName}.md`); -} -``` - -### Step 4: Ensure Target Directory Exists - -```javascript -// Check and create directory if needed -Bash(`mkdir -p "${targetDir}"`); -``` - -### Step 5: Check File Existence - -```javascript -const fileExists = Bash(`test -f "${targetPath}" && echo "EXISTS" || echo "NOT_FOUND"`); - -if (fileExists.includes('EXISTS')) { - console.warn(`Warning: Command file already exists at ${targetPath}. 
Will overwrite.`); -} -``` - -## Output - -```javascript -{ - status: 'resolved', - targetPath: targetPath, // Full path to command file - targetDir: targetDir, // Directory containing command - fileName: `${skillName}.md`, - fileExists: fileExists.includes('EXISTS'), - params: validatedParams // Pass through to next phase -} -``` - -## Path Examples - -### Project Scope (No Group) -``` -location: "project" -skillName: "deploy" --> .claude/commands/deploy.md -``` - -### Project Scope (With Group) -``` -location: "project" -skillName: "create" -group: "issue" --> .claude/commands/issue/create.md -``` - -### User Scope (No Group) -``` -location: "user" -skillName: "global-status" --> ~/.claude/commands/global-status.md -``` - -### User Scope (With Group) -``` -location: "user" -skillName: "sync" -group: "session" --> ~/.claude/commands/session/sync.md -``` - -## Next Phase - -Proceed to [Phase 3: Template Loading](03-template-loading.md) with `targetPath` and `params`. diff --git a/.claude/skills/command-generator/phases/03-template-loading.md b/.claude/skills/command-generator/phases/03-template-loading.md deleted file mode 100644 index dd2fb43c..00000000 --- a/.claude/skills/command-generator/phases/03-template-loading.md +++ /dev/null @@ -1,123 +0,0 @@ -# Phase 3: Template Loading - -Load the command template file for content generation. - -## Objective - -Load the command template from the skill's templates directory. 
The template provides: -- YAML frontmatter structure -- Placeholder variables for substitution -- Standard command file sections - -## Input - -From Phase 2: -```javascript -{ - targetPath: string, - targetDir: string, - fileName: string, - fileExists: boolean, - params: { - skillName: string, - description: string, - location: string, - group: string | null, - argumentHint: string - } -} -``` - -## Template Location - -``` -.claude/skills/command-generator/templates/command-md.md -``` - -## Execution Steps - -### Step 1: Locate Template File - -```javascript -// Template is located in the skill's templates directory -const skillDir = '.claude/skills/command-generator'; -const templatePath = `${skillDir}/templates/command-md.md`; -``` - -### Step 2: Read Template Content - -```javascript -const templateContent = Read(templatePath); - -if (!templateContent) { - throw new Error(`Command template not found at ${templatePath}`); -} -``` - -### Step 3: Validate Template Structure - -```javascript -// Verify template contains expected placeholders -const requiredPlaceholders = ['{{name}}', '{{description}}']; -const optionalPlaceholders = ['{{group}}', '{{argumentHint}}']; - -for (const placeholder of requiredPlaceholders) { - if (!templateContent.includes(placeholder)) { - throw new Error(`Template missing required placeholder: ${placeholder}`); - } -} -``` - -### Step 4: Store Template for Next Phase - -```javascript -const template = { - content: templateContent, - requiredPlaceholders: requiredPlaceholders, - optionalPlaceholders: optionalPlaceholders -}; -``` - -## Template Format Reference - -The template should follow this structure: - -```markdown ---- -name: {{name}} -description: {{description}} -{{#if group}}group: {{group}}{{/if}} -{{#if argumentHint}}argument-hint: {{argumentHint}}{{/if}} ---- - -# {{name}} Command - -[Template content with placeholders] -``` - -## Output - -```javascript -{ - status: 'loaded', - template: { - content: templateContent, - 
requiredPlaceholders: requiredPlaceholders, - optionalPlaceholders: optionalPlaceholders - }, - targetPath: targetPath, - params: params -} -``` - -## Error Handling - -| Error | Action | -|-------|--------| -| Template file not found | Throw error with path | -| Missing required placeholder | Throw error with missing placeholder name | -| Empty template | Throw error | - -## Next Phase - -Proceed to [Phase 4: Content Formatting](04-content-formatting.md) with `template`, `targetPath`, and `params`. diff --git a/.claude/skills/command-generator/phases/04-content-formatting.md b/.claude/skills/command-generator/phases/04-content-formatting.md deleted file mode 100644 index 1f5a4bb3..00000000 --- a/.claude/skills/command-generator/phases/04-content-formatting.md +++ /dev/null @@ -1,184 +0,0 @@ -# Phase 4: Content Formatting - -Format template content by substituting placeholders with parameter values. - -## Objective - -Replace all placeholder variables in the template with validated parameter values: -- `{{name}}` -> skillName -- `{{description}}` -> description -- `{{group}}` -> group (if provided) -- `{{argumentHint}}` -> argumentHint (if provided) - -## Input - -From Phase 3: -```javascript -{ - template: { - content: string, - requiredPlaceholders: string[], - optionalPlaceholders: string[] - }, - targetPath: string, - params: { - skillName: string, - description: string, - location: string, - group: string | null, - argumentHint: string - } -} -``` - -## Placeholder Mapping - -```javascript -const placeholderMap = { - '{{name}}': params.skillName, - '{{description}}': params.description, - '{{group}}': params.group || '', - '{{argumentHint}}': params.argumentHint || '' -}; -``` - -## Execution Steps - -### Step 1: Initialize Content - -```javascript -let formattedContent = template.content; -``` - -### Step 2: Substitute Required Placeholders - -```javascript -// These must always be replaced -formattedContent = formattedContent.replace(/\{\{name\}\}/g, 
params.skillName); -formattedContent = formattedContent.replace(/\{\{description\}\}/g, params.description); -``` - -### Step 3: Handle Optional Placeholders - -```javascript -// Group placeholder -if (params.group) { - formattedContent = formattedContent.replace(/\{\{group\}\}/g, params.group); -} else { - // Remove group line if not provided - formattedContent = formattedContent.replace(/^group: \{\{group\}\}\n?/gm, ''); - formattedContent = formattedContent.replace(/\{\{group\}\}/g, ''); -} - -// Argument hint placeholder -if (params.argumentHint) { - formattedContent = formattedContent.replace(/\{\{argumentHint\}\}/g, params.argumentHint); -} else { - // Remove argument-hint line if not provided - formattedContent = formattedContent.replace(/^argument-hint: \{\{argumentHint\}\}\n?/gm, ''); - formattedContent = formattedContent.replace(/\{\{argumentHint\}\}/g, ''); -} -``` - -### Step 4: Handle Conditional Sections - -```javascript -// Remove empty frontmatter lines (caused by missing optional fields) -formattedContent = formattedContent.replace(/\n{3,}/g, '\n\n'); - -// Handle {{#if group}} style conditionals -if (formattedContent.includes('{{#if')) { - // Process group conditional - if (params.group) { - formattedContent = formattedContent.replace(/\{\{#if group\}\}([\s\S]*?)\{\{\/if\}\}/g, '$1'); - } else { - formattedContent = formattedContent.replace(/\{\{#if group\}\}[\s\S]*?\{\{\/if\}\}/g, ''); - } - - // Process argumentHint conditional - if (params.argumentHint) { - formattedContent = formattedContent.replace(/\{\{#if argumentHint\}\}([\s\S]*?)\{\{\/if\}\}/g, '$1'); - } else { - formattedContent = formattedContent.replace(/\{\{#if argumentHint\}\}[\s\S]*?\{\{\/if\}\}/g, ''); - } -} -``` - -### Step 5: Validate Final Content - -```javascript -// Ensure no unresolved placeholders remain -const unresolvedPlaceholders = formattedContent.match(/\{\{[^}]+\}\}/g); -if (unresolvedPlaceholders) { - console.warn(`Warning: Unresolved placeholders found: 
${unresolvedPlaceholders.join(', ')}`); -} - -// Ensure frontmatter is valid -const frontmatterMatch = formattedContent.match(/^---\n([\s\S]*?)\n---/); -if (!frontmatterMatch) { - throw new Error('Generated content has invalid frontmatter structure'); -} -``` - -### Step 6: Generate Summary - -```javascript -const summary = { - name: params.skillName, - description: params.description.substring(0, 50) + (params.description.length > 50 ? '...' : ''), - location: params.location, - group: params.group, - hasArgumentHint: !!params.argumentHint -}; -``` - -## Output - -```javascript -{ - status: 'formatted', - content: formattedContent, - targetPath: targetPath, - summary: summary -} -``` - -## Content Example - -### Input Template -```markdown ---- -name: {{name}} -description: {{description}} -{{#if group}}group: {{group}}{{/if}} -{{#if argumentHint}}argument-hint: {{argumentHint}}{{/if}} ---- - -# {{name}} Command -``` - -### Output (with all fields) -```markdown ---- -name: create -description: Create structured issue from GitHub URL or text description -group: issue -argument-hint: [-y|--yes] [--priority 1-5] ---- - -# create Command -``` - -### Output (minimal fields) -```markdown ---- -name: deploy -description: Deploy application to production environment ---- - -# deploy Command -``` - -## Next Phase - -Proceed to [Phase 5: File Generation](05-file-generation.md) with `content` and `targetPath`. diff --git a/.claude/skills/command-generator/phases/05-file-generation.md b/.claude/skills/command-generator/phases/05-file-generation.md deleted file mode 100644 index 77f11544..00000000 --- a/.claude/skills/command-generator/phases/05-file-generation.md +++ /dev/null @@ -1,185 +0,0 @@ -# Phase 5: File Generation - -Write the formatted content to the target command file. - -## Objective - -Generate the final command file by: -1. Checking for existing file (warn if present) -2. Writing formatted content to target path -3. 
Confirming successful generation - -## Input - -From Phase 4: -```javascript -{ - status: 'formatted', - content: string, - targetPath: string, - summary: { - name: string, - description: string, - location: string, - group: string | null, - hasArgumentHint: boolean - } -} -``` - -## Execution Steps - -### Step 1: Pre-Write Check - -```javascript -// Check if file already exists -const fileExists = Bash(`test -f "${targetPath}" && echo "EXISTS" || echo "NOT_FOUND"`); - -if (fileExists.includes('EXISTS')) { - console.warn(` -WARNING: Command file already exists at: ${targetPath} -The file will be overwritten with new content. - `); -} -``` - -### Step 2: Ensure Directory Exists - -```javascript -// Get directory from target path -const targetDir = path.dirname(targetPath); - -// Create directory if it doesn't exist -Bash(`mkdir -p "${targetDir}"`); -``` - -### Step 3: Write File - -```javascript -// Write the formatted content -Write(targetPath, content); -``` - -### Step 4: Verify Write - -```javascript -// Confirm file was created -const verifyExists = Bash(`test -f "${targetPath}" && echo "SUCCESS" || echo "FAILED"`); - -if (!verifyExists.includes('SUCCESS')) { - throw new Error(`Failed to create command file at ${targetPath}`); -} - -// Verify content was written -const writtenContent = Read(targetPath); -if (!writtenContent || writtenContent.length === 0) { - throw new Error(`Command file created but appears to be empty`); -} -``` - -### Step 5: Generate Success Report - -```javascript -const report = { - status: 'completed', - file: { - path: targetPath, - name: summary.name, - location: summary.location, - group: summary.group, - size: writtenContent.length, - created: new Date().toISOString() - }, - command: { - name: summary.name, - description: summary.description, - hasArgumentHint: summary.hasArgumentHint - }, - nextSteps: [ - `Edit ${targetPath} to add implementation details`, - 'Add usage examples and execution flow', - 'Test the command with Claude 
Code' - ] -}; -``` - -## Output - -### Success Output - -```javascript -{ - status: 'completed', - file: { - path: '.claude/commands/issue/create.md', - name: 'create', - location: 'project', - group: 'issue', - size: 1234, - created: '2026-02-27T12:00:00.000Z' - }, - command: { - name: 'create', - description: 'Create structured issue from GitHub URL...', - hasArgumentHint: true - }, - nextSteps: [ - 'Edit .claude/commands/issue/create.md to add implementation details', - 'Add usage examples and execution flow', - 'Test the command with Claude Code' - ] -} -``` - -### Console Output - -``` -Command generated successfully! - -File: .claude/commands/issue/create.md -Name: create -Description: Create structured issue from GitHub URL... -Location: project -Group: issue - -Next Steps: -1. Edit .claude/commands/issue/create.md to add implementation details -2. Add usage examples and execution flow -3. Test the command with Claude Code -``` - -## Error Handling - -| Error | Action | -|-------|--------| -| Directory creation failed | Throw error with directory path | -| File write failed | Throw error with target path | -| Empty file detected | Throw error and attempt cleanup | -| Permission denied | Throw error with permission hint | - -## Cleanup on Failure - -```javascript -// If any step fails, attempt to clean up partial artifacts -function cleanup(targetPath) { - try { - Bash(`rm -f "${targetPath}"`); - } catch (e) { - // Ignore cleanup errors - } -} -``` - -## Completion - -The command file has been successfully generated. The skill execution is complete. 
- -### Usage Example - -```bash -# Use the generated command -/issue:create https://github.com/owner/repo/issues/123 - -# Or with the group prefix -/issue:create "Login fails with special chars" -``` diff --git a/.claude/skills/command-generator/specs/command-design-spec.md b/.claude/skills/command-generator/specs/command-design-spec.md index 48875ca2..c7966599 100644 --- a/.claude/skills/command-generator/specs/command-design-spec.md +++ b/.claude/skills/command-generator/specs/command-design-spec.md @@ -1,160 +1,65 @@ # Command Design Specification -Guidelines and best practices for designing Claude Code command files. +Guidelines for Claude Code command files generated by command-generator. -## Command File Structure - -### YAML Frontmatter - -Every command file must start with YAML frontmatter containing: +## YAML Frontmatter ```yaml --- -name: command-name # Required: Command identifier (lowercase, hyphens) -description: Description # Required: Brief description of command purpose -argument-hint: "[args]" # Optional: Argument format hint -allowed-tools: Tool1, Tool2 # Optional: Restricted tool set -examples: # Optional: Usage examples - - /command:example1 - - /command:example2 --flag +name: command-name # Required: lowercase with hyphens +description: Description # Required: brief purpose +argument-hint: "[args]" # Optional: argument format hint +allowed-tools: Tool1, Tool2 # Optional: restricted tool set --- ``` -### Frontmatter Fields - -| Field | Required | Description | -|-------|----------|-------------| -| `name` | Yes | Command identifier, lowercase with hyphens | -| `description` | Yes | Brief description, appears in command listings | -| `argument-hint` | No | Usage hint for arguments (shown in help) | -| `allowed-tools` | No | Restrict available tools for this command | -| `examples` | No | Array of usage examples | - ## Naming Conventions -### Command Names +| Element | Convention | Examples | +|---------|-----------|----------| +| Command name | 
lowercase, hyphens, 2-3 words max | `deploy`, `create-issue` | +| Group name | singular noun | `issue`, `session`, `workflow` | +| Verbs for actions | imperative | `deploy`, `create`, `analyze` | -- Use lowercase letters only -- Separate words with hyphens (`create-issue`, not `createIssue`) -- Keep names short but descriptive (2-3 words max) -- Use verbs for actions (`deploy`, `create`, `analyze`) - -### Group Names - -- Groups organize related commands -- Use singular nouns (`issue`, `session`, `workflow`) -- Common groups: `issue`, `workflow`, `session`, `memory`, `cli` - -### Path Examples +## Path Structure ``` .claude/commands/deploy.md # Top-level command .claude/commands/issue/create.md # Grouped command -.claude/commands/workflow/init.md # Grouped command +~/.claude/commands/global-status.md # User-level command ``` -## Content Sections +## Content Structure (GSD Style) -### Required Sections +Generated commands should use XML semantic tags: -1. **Overview**: Brief description of command purpose -2. **Usage**: Command syntax and examples -3. **Execution Flow**: High-level process diagram +| Tag | Required | Purpose | +|-----|----------|---------| +| `` | Yes | What the command does, when invoked, what it produces | +| `` | Yes | Files to read before execution (@ notation) | +| `` | Yes | Container for execution steps | +| `` | Yes | Individual execution steps with snake_case names | +| `` | No | Error code table with severity and description | +| `` | Yes | Checkbox list of verifiable completion conditions | -### Recommended Sections +## Step Naming -4. **Implementation**: Code examples for each phase -5. **Error Handling**: Error cases and recovery -6. **Related Commands**: Links to related functionality +- Use snake_case: `parse_input`, `validate_config`, `write_output` +- Use action verbs: `discover`, `validate`, `spawn`, `collect`, `report` +- First step gets `priority="first"` attribute -## Best Practices - -### 1. 
Clear Purpose - -Each command should do one thing well: +## Error Messages ``` -Good: /issue:create - Create a new issue -Bad: /issue:manage - Create, update, delete issues (too broad) -``` +Good: Error: GitHub issue URL required + Usage: /issue:create -### 2. Consistent Structure - -Follow the same pattern across all commands in a group: - -```markdown -# All issue commands should have: -- Overview -- Usage with examples -- Phase-based implementation -- Error handling table -``` - -### 3. Progressive Detail - -Start simple, add detail in phases: - -``` -Phase 1: Quick overview -Phase 2: Implementation details -Phase 3: Edge cases and errors -``` - -### 4. Reusable Patterns - -Use consistent patterns for common operations: - -```javascript -// Input parsing pattern -const args = parseArguments($ARGUMENTS); -const flags = parseFlags($ARGUMENTS); - -// Validation pattern -if (!args.required) { - throw new Error('Required argument missing'); -} +Bad: Error: Invalid input ``` ## Scope Guidelines -### Project Commands (`.claude/commands/`) - -- Project-specific workflows -- Team conventions -- Integration with project tools - -### User Commands (`~/.claude/commands/`) - -- Personal productivity tools -- Cross-project utilities -- Global configuration - -## Error Messages - -### Good Error Messages - -``` -Error: GitHub issue URL required -Usage: /issue:create -Example: /issue:create https://github.com/owner/repo/issues/123 -``` - -### Bad Error Messages - -``` -Error: Invalid input -``` - -## Testing Commands - -After creating a command, test: - -1. **Basic invocation**: Does it run without arguments? -2. **Argument parsing**: Does it handle valid arguments? -3. **Error cases**: Does it show helpful errors for invalid input? -4. **Help text**: Is the usage clear? 
- -## Related Documentation - -- [SKILL-DESIGN-SPEC.md](../_shared/SKILL-DESIGN-SPEC.md) - Full skill design specification -- [../skill-generator/SKILL.md](../skill-generator/SKILL.md) - Meta-skill for creating skills +| Scope | Location | Use For | +|-------|----------|---------| +| Project | `.claude/commands/` | Team workflows, project integrations | +| User | `~/.claude/commands/` | Personal tools, cross-project utilities | diff --git a/.claude/skills/command-generator/templates/command-md.md b/.claude/skills/command-generator/templates/command-md.md index d3004430..1b543237 100644 --- a/.claude/skills/command-generator/templates/command-md.md +++ b/.claude/skills/command-generator/templates/command-md.md @@ -1,75 +1,112 @@ +# Command Template — Structural Reference + +This template defines the **structural pattern** for generated commands. The `draft_content` step uses this as a guide to generate concrete, domain-specific content — NOT as a literal copy target. + +## Required Structure + +```markdown +--- +name: {$SKILL_NAME} +description: {$DESCRIPTION} +argument-hint: {$ARGUMENT_HINT} # omit line if empty --- -name: {{name}} -description: {{description}} -{{#if argumentHint}}argument-hint: {{argumentHint}} -{{/if}}--- -# {{name}} Command + +{2-3 concrete sentences: what it does + when invoked + what it produces} + -## Overview + +{@ references to files this command needs before execution} + -[Describe the command purpose and what it does] + +- {tool} ({version}+) — {what it's used for} + -## Usage + + + +**Parse arguments and validate input.** + +Parse `$ARGUMENTS` for: +- {specific flags from $ARGUMENT_HINT} +- {positional args} + +{Decision routing table if multiple modes:} +| Condition | Action | +|-----------|--------| +| flag present | set variable | +| missing required | Error: "message" `# (see code: E001)` + `exit 1` | + + + + +**{Concrete action description.}** + +$STATE_VAR="default" ```bash -/{{#if group}}{{group}}:{{/if}}{{name}} [arguments] +# 
Use heredoc for multi-line output +cat < output-file +{structured content with $VARIABLES} +EOF + +# Every error path: message + code ref + exit +if [ ! -f "$REQUIRED_FILE" ]; then + echo "Error: Required file missing" # (see code: E003) + exit 1 +fi ``` -**Examples**: -```bash -# Example 1: Basic usage -/{{#if group}}{{group}}:{{/if}}{{name}} +| Condition | Action | +|-----------|--------| +| success | Continue to next step | +| failure | Error `# (see code: E0XX)`, exit 1 | + -# Example 2: With arguments -/{{#if group}}{{group}}:{{/if}}{{name}} --option value + +**Format and display results.** + +{Banner with status, file paths, next steps} + + + + + + +| Code | Severity | Description | Stage | +|------|----------|-------------|-------| +| E001 | error | {specific to parse_input validation} | parse_input | +| E002 | error | {specific to domain action failure} | {step_name} | +| W001 | warning | {specific recoverable condition} | {step_name} | + + + + + +- [ ] {Input validated} +- [ ] {Domain action 1 completed} +- [ ] {Domain action 2 completed} +- [ ] {Output produced / effect applied} + ``` -## Execution Flow +## Content Quality Rules -``` -Phase 1: Input Parsing - - Parse arguments and flags - - Validate input parameters +| Rule | Bad Example | Good Example | +|------|-------------|--------------| +| No bracket placeholders | `[Describe purpose]` | `Deploy to target environment with rollback on failure.` | +| Concrete step names | `execute` | `run_deployment`, `validate_config` | +| Specific error codes | `E001: Invalid input` | `E001: --env must be "prod" or "staging"` | +| Verifiable criteria | `Command works` | `Deployment log written to .deploy/latest.log` | +| Real shell commands | `# TODO: implement` | `kubectl apply -f $MANIFEST_PATH` | -Phase 2: Core Processing - - Execute main logic - - Handle edge cases +## Step Naming Conventions -Phase 3: Output Generation - - Format results - - Display to user -``` - -## Implementation - -### Phase 1: Input 
Parsing - -```javascript -// Parse command arguments -const args = parseArguments($ARGUMENTS); -``` - -### Phase 2: Core Processing - -```javascript -// TODO: Implement core logic -``` - -### Phase 3: Output Generation - -```javascript -// TODO: Format and display output -``` - -## Error Handling - -| Error | Action | -|-------|--------| -| Invalid input | Show usage and error message | -| Processing failure | Log error and suggest recovery | - -## Related Commands - -- [Related command 1] -- [Related command 2] +| Domain | Typical Steps | +|--------|--------------| +| Deploy/Release | `validate_config`, `run_deployment`, `verify_health`, `report` | +| CRUD operations | `parse_input`, `validate_entity`, `persist_changes`, `report` | +| Analysis/Review | `parse_input`, `gather_context`, `run_analysis`, `present_findings` | +| Sync/Migration | `parse_input`, `detect_changes`, `apply_sync`, `verify_state` | +| Build/Generate | `parse_input`, `resolve_dependencies`, `run_build`, `write_output` | diff --git a/ccw/src/commands/cli.ts b/ccw/src/commands/cli.ts index 8a257c6a..5eb42e46 100644 --- a/ccw/src/commands/cli.ts +++ b/ccw/src/commands/cli.ts @@ -29,7 +29,7 @@ import { projectExists, getStorageLocationInstructions } from '../tools/storage-manager.js'; -import { getHistoryStore, findProjectWithExecution } from '../tools/cli-history-store.js'; +import { getHistoryStore, findProjectWithExecution, getRegisteredExecutionHistory } from '../tools/cli-history-store.js'; import { createSpinner } from '../utils/ui.js'; import { loadClaudeCliSettings } from '../tools/claude-cli-tools.js'; @@ -421,11 +421,15 @@ async function outputAction(conversationId: string | undefined, options: OutputV if (!result) { const hint = options.project ? 
`in project: ${options.project}` - : 'in current directory or parent directories'; + : 'in registered CCW project history'; console.error(chalk.red(`Error: Execution not found: ${conversationId}`)); console.error(chalk.gray(` Searched ${hint}`)); + console.error(chalk.gray(' Tip: use the real CCW execution ID, not an outer task label.')); + console.error(chalk.gray(' Capture [CCW_EXEC_ID=...] from stderr, or start with --id .')); + console.error(chalk.gray(' Discover IDs via: ccw cli show or ccw cli history')); console.error(chalk.gray('Usage: ccw cli output [--project ]')); process.exit(1); + return; } if (options.raw) { @@ -1394,7 +1398,7 @@ async function showAction(options: { all?: boolean }): Promise { // 2. Get recent history from SQLite const historyLimit = options.all ? 100 : 20; - const history = await getExecutionHistoryAsync(process.cwd(), { limit: historyLimit, recursive: true }); + const history = getRegisteredExecutionHistory({ limit: historyLimit }); const historyById = new Map(history.executions.map(exec => [exec.id, exec])); // 3. 
Build unified list: active first, then history (de-duped) @@ -1595,7 +1599,7 @@ async function historyAction(options: HistoryOptions): Promise { console.log(chalk.bold.cyan('\n CLI Execution History\n')); // Use recursive: true to aggregate history from parent and child projects (matches Dashboard behavior) - const history = await getExecutionHistoryAsync(process.cwd(), { limit: parseInt(limit, 10), tool, status, recursive: true }); + const history = getRegisteredExecutionHistory({ limit: parseInt(limit, 10), tool, status }); if (history.executions.length === 0) { console.log(chalk.gray(' No executions found.\n')); @@ -1650,7 +1654,14 @@ async function detailAction(conversationId: string | undefined): Promise { process.exit(1); } - const conversation = getConversationDetail(process.cwd(), conversationId); + let conversation = getConversationDetail(process.cwd(), conversationId); + + if (!conversation) { + const found = findProjectWithExecution(conversationId, process.cwd()); + if (found) { + conversation = getConversationDetail(found.projectPath, conversationId); + } + } if (!conversation) { console.error(chalk.red(`Error: Conversation not found: ${conversationId}`)); diff --git a/ccw/src/core/memory-embedder-bridge.ts b/ccw/src/core/memory-embedder-bridge.ts index 75564286..0b733041 100644 --- a/ccw/src/core/memory-embedder-bridge.ts +++ b/ccw/src/core/memory-embedder-bridge.ts @@ -16,7 +16,7 @@ import { spawn } from 'child_process'; import { join, dirname } from 'path'; import { existsSync } from 'fs'; import { fileURLToPath } from 'url'; -import { getCodexLensPython } from '../utils/codexlens-path.js'; +import { getCodexLensHiddenPython } from '../utils/codexlens-path.js'; import { getCoreMemoryStore } from './core-memory-store.js'; import type { Stage1Output } from './core-memory-store.js'; import { StoragePaths } from '../config/storage-paths.js'; @@ -26,7 +26,7 @@ const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); // 
Venv paths (reuse CodexLens venv) -const VENV_PYTHON = getCodexLensPython(); +const VENV_PYTHON = getCodexLensHiddenPython(); // Script path const EMBEDDER_SCRIPT = join(__dirname, '..', '..', 'scripts', 'memory_embedder.py'); @@ -116,8 +116,11 @@ function runPython(args: string[], timeout: number = 300000): Promise { // Spawn Python process const child = spawn(VENV_PYTHON, [EMBEDDER_SCRIPT, ...args], { + shell: false, stdio: ['ignore', 'pipe', 'pipe'], timeout, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, }); let stdout = ''; diff --git a/ccw/src/core/routes/codexlens/semantic-handlers.ts b/ccw/src/core/routes/codexlens/semantic-handlers.ts index 27344bd7..76d56f1b 100644 --- a/ccw/src/core/routes/codexlens/semantic-handlers.ts +++ b/ccw/src/core/routes/codexlens/semantic-handlers.ts @@ -8,7 +8,7 @@ import { executeCodexLens, installSemantic, } from '../../../tools/codex-lens.js'; -import { getCodexLensPython } from '../../../utils/codexlens-path.js'; +import { getCodexLensHiddenPython } from '../../../utils/codexlens-path.js'; import { spawn } from 'child_process'; import type { GpuMode } from '../../../tools/codex-lens.js'; import { loadLiteLLMApiConfig, getAvailableModelsForType, getProvider, getAllProviders } from '../../../config/litellm-api-config-manager.js'; @@ -59,10 +59,13 @@ except Exception as e: sys.exit(1) `; - const pythonPath = getCodexLensPython(); + const pythonPath = getCodexLensHiddenPython(); const child = spawn(pythonPath, ['-c', pythonScript], { + shell: false, stdio: ['ignore', 'pipe', 'pipe'], timeout, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, }); let stdout = ''; diff --git a/ccw/src/core/routes/codexlens/watcher-handlers.ts b/ccw/src/core/routes/codexlens/watcher-handlers.ts index 606e58e0..ef8af3d5 100644 --- a/ccw/src/core/routes/codexlens/watcher-handlers.ts +++ b/ccw/src/core/routes/codexlens/watcher-handlers.ts @@ -126,8 +126,10 @@ export async function 
handleCodexLensWatcherRoutes(ctx: RouteContext): Promise { - const { timeout, shell = false } = options; + const { timeout } = options; + const pythonSpec = typeof pythonCmd === 'string' ? parsePythonCommandSpec(pythonCmd) : pythonCmd; const sanitizePythonError = (stderrText: string): string | undefined => { const trimmed = stderrText.trim(); @@ -119,11 +124,12 @@ function checkCcwLitellmImport( }; return new Promise((resolve) => { - const child = spawn(pythonCmd, ['-c', 'import ccw_litellm; print(ccw_litellm.__version__)'], { + const child = spawn(pythonSpec.command, [...pythonSpec.args, '-c', 'import ccw_litellm; print(ccw_litellm.__version__)'], { stdio: ['ignore', 'pipe', 'pipe'], timeout, windowsHide: true, - shell, + shell: false, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, }); let stdout = ''; @@ -142,20 +148,20 @@ function checkCcwLitellmImport( const error = sanitizePythonError(stderr); if (code === 0 && version) { - resolve({ python: pythonCmd, installed: true, version }); + resolve({ python: pythonSpec.display, installed: true, version }); return; } if (code === null) { - resolve({ python: pythonCmd, installed: false, error: `Timed out after ${timeout}ms` }); + resolve({ python: pythonSpec.display, installed: false, error: `Timed out after ${timeout}ms` }); return; } - resolve({ python: pythonCmd, installed: false, error: error || undefined }); + resolve({ python: pythonSpec.display, installed: false, error: error || undefined }); }); child.on('error', (err) => { - resolve({ python: pythonCmd, installed: false, error: err.message }); + resolve({ python: pythonSpec.display, installed: false, error: err.message }); }); }); } @@ -940,7 +946,7 @@ export async function handleLiteLLMApiRoutes(ctx: RouteContext): Promise { - const proc = spawn(pythonCmd, ['-m', 'pip', 'uninstall', '-y', 'ccw-litellm'], { shell: true, timeout: 120000 }); + const proc = spawn( + pythonCmd.command, + [...pythonCmd.args, '-m', 'pip', 'uninstall', '-y', 'ccw-litellm'], + { 
+ shell: false, + timeout: 120000, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, + }, + ); let output = ''; let error = ''; proc.stdout?.on('data', (data) => { output += data.toString(); }); diff --git a/ccw/src/core/unified-vector-index.ts b/ccw/src/core/unified-vector-index.ts index 1b9571a9..d35405c6 100644 --- a/ccw/src/core/unified-vector-index.ts +++ b/ccw/src/core/unified-vector-index.ts @@ -16,7 +16,7 @@ import { spawn } from 'child_process'; import { join, dirname } from 'path'; import { existsSync } from 'fs'; import { fileURLToPath } from 'url'; -import { getCodexLensPython } from '../utils/codexlens-path.js'; +import { getCodexLensHiddenPython } from '../utils/codexlens-path.js'; import { StoragePaths, ensureStorageDir } from '../config/storage-paths.js'; // Get directory of this module @@ -24,7 +24,7 @@ const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); // Venv python path (reuse CodexLens venv) -const VENV_PYTHON = getCodexLensPython(); +const VENV_PYTHON = getCodexLensHiddenPython(); // Script path const EMBEDDER_SCRIPT = join(__dirname, '..', '..', 'scripts', 'unified_memory_embedder.py'); @@ -170,8 +170,11 @@ function runPython(request: Record, timeout: number = 300000 } const child = spawn(VENV_PYTHON, [EMBEDDER_SCRIPT], { + shell: false, stdio: ['pipe', 'pipe', 'pipe'], timeout, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, }); let stdout = ''; diff --git a/ccw/src/tools/cli-history-store.ts b/ccw/src/tools/cli-history-store.ts index 10c8911e..c2d50af7 100644 --- a/ccw/src/tools/cli-history-store.ts +++ b/ccw/src/tools/cli-history-store.ts @@ -1532,6 +1532,197 @@ export function closeAllStores(): void { storeCache.clear(); } +function collectHistoryDatabasePaths(): string[] { + const projectsDir = join(getCCWHome(), 'projects'); + if (!existsSync(projectsDir)) { + return []; + } + + const historyDbPaths: string[] = []; + const visitedDirs = new Set(); 
+ const skipDirs = new Set(['cache', 'cli-history', 'config', 'memory']); + + function scanDirectory(dir: string): void { + const resolvedDir = resolve(dir); + if (visitedDirs.has(resolvedDir)) { + return; + } + visitedDirs.add(resolvedDir); + + const historyDb = join(resolvedDir, 'cli-history', 'history.db'); + if (existsSync(historyDb)) { + historyDbPaths.push(historyDb); + } + + try { + const entries = readdirSync(resolvedDir, { withFileTypes: true }); + for (const entry of entries) { + if (!entry.isDirectory() || skipDirs.has(entry.name)) { + continue; + } + scanDirectory(join(resolvedDir, entry.name)); + } + } catch { + // Ignore unreadable directories during best-effort global scans. + } + } + + scanDirectory(projectsDir); + return historyDbPaths; +} + +function getConversationLocationColumns(db: Database.Database): { + projectRootSelect: string; + relativePathSelect: string; +} { + const tableInfo = db.prepare(`PRAGMA table_info(conversations)`).all() as Array<{ name: string }>; + const hasProjectRoot = tableInfo.some(col => col.name === 'project_root'); + const hasRelativePath = tableInfo.some(col => col.name === 'relative_path'); + + return { + projectRootSelect: hasProjectRoot ? 'c.project_root AS project_root' : `'' AS project_root`, + relativePathSelect: hasRelativePath ? 'c.relative_path AS relative_path' : `'' AS relative_path` + }; +} + +function normalizeHistoryTimestamp(updatedAt: unknown, createdAt: unknown): number { + const parsedUpdatedAt = typeof updatedAt === 'string' ? Date.parse(updatedAt) : NaN; + if (!Number.isNaN(parsedUpdatedAt)) { + return parsedUpdatedAt; + } + + const parsedCreatedAt = typeof createdAt === 'string' ? Date.parse(createdAt) : NaN; + return Number.isNaN(parsedCreatedAt) ? 
0 : parsedCreatedAt; +} + +export function getRegisteredExecutionHistory(options: { + limit?: number; + offset?: number; + tool?: string | null; + status?: string | null; + category?: ExecutionCategory | null; +} = {}): { + total: number; + count: number; + executions: (HistoryIndexEntry & { sourceDir?: string })[]; +} { + const { + limit = 50, + offset = 0, + tool = null, + status = null, + category = null + } = options; + + const perStoreLimit = Math.max(limit + offset, limit, 1); + const allExecutions: (HistoryIndexEntry & { sourceDir?: string })[] = []; + let totalCount = 0; + + for (const historyDb of collectHistoryDatabasePaths()) { + let db: Database.Database | null = null; + try { + db = new Database(historyDb, { readonly: true }); + const { projectRootSelect, relativePathSelect } = getConversationLocationColumns(db); + + let whereClause = '1=1'; + const params: Record = { limit: perStoreLimit }; + + if (tool) { + whereClause += ' AND c.tool = @tool'; + params.tool = tool; + } + + if (status) { + whereClause += ' AND c.latest_status = @status'; + params.status = status; + } + + if (category) { + whereClause += ' AND c.category = @category'; + params.category = category; + } + + const countRow = db.prepare(` + SELECT COUNT(*) AS count + FROM conversations c + WHERE ${whereClause} + `).get(params) as { count?: number } | undefined; + totalCount += countRow?.count || 0; + + const rows = db.prepare(` + SELECT + c.id, + c.created_at AS timestamp, + c.updated_at, + c.tool, + c.latest_status AS status, + c.category, + c.total_duration_ms AS duration_ms, + c.turn_count, + c.prompt_preview, + ${projectRootSelect}, + ${relativePathSelect} + FROM conversations c + WHERE ${whereClause} + ORDER BY c.updated_at DESC + LIMIT @limit + `).all(params) as Array<{ + id: string; + timestamp: string; + updated_at?: string; + tool: string; + status: string; + category?: ExecutionCategory; + duration_ms: number; + turn_count?: number; + prompt_preview: unknown; + project_root?: 
string; + relative_path?: string; + }>; + + for (const row of rows) { + allExecutions.push({ + id: row.id, + timestamp: row.timestamp, + updated_at: row.updated_at, + tool: row.tool, + status: row.status, + category: row.category || 'user', + duration_ms: row.duration_ms, + turn_count: row.turn_count, + prompt_preview: typeof row.prompt_preview === 'string' + ? row.prompt_preview + : (row.prompt_preview ? JSON.stringify(row.prompt_preview) : ''), + sourceDir: row.project_root || row.relative_path || undefined + }); + } + } catch { + // Skip databases that are unavailable or incompatible. + } finally { + db?.close(); + } + } + + allExecutions.sort((a, b) => normalizeHistoryTimestamp(b.updated_at, b.timestamp) - normalizeHistoryTimestamp(a.updated_at, a.timestamp)); + + const dedupedExecutions: (HistoryIndexEntry & { sourceDir?: string })[] = []; + const seenIds = new Set(); + for (const execution of allExecutions) { + if (seenIds.has(execution.id)) { + continue; + } + seenIds.add(execution.id); + dedupedExecutions.push(execution); + } + + const pagedExecutions = dedupedExecutions.slice(offset, offset + limit); + return { + total: dedupedExecutions.length || totalCount, + count: pagedExecutions.length, + executions: pagedExecutions + }; +} + /** * Find project path that contains the given execution * Searches upward through parent directories and all registered projects @@ -1579,43 +1770,28 @@ export function findProjectWithExecution( // Strategy 2: Search in all registered projects (global search) // This covers cases where execution might be in a completely different project tree - const projectsDir = join(getCCWHome(), 'projects'); - if (existsSync(projectsDir)) { + for (const historyDb of collectHistoryDatabasePaths()) { + let db: Database.Database | null = null; try { - const entries = readdirSync(projectsDir, { withFileTypes: true }); + db = new Database(historyDb, { readonly: true }); + const { projectRootSelect } = getConversationLocationColumns(db); + const 
row = db.prepare(` + SELECT ${projectRootSelect} + FROM conversations c + WHERE c.id = ? + LIMIT 1 + `).get(conversationId) as { project_root?: string } | undefined; - for (const entry of entries) { - if (!entry.isDirectory()) continue; - - const projectId = entry.name; - const historyDb = join(projectsDir, projectId, 'cli-history', 'history.db'); - - if (!existsSync(historyDb)) continue; - - try { - // Open and query this database directly - const db = new Database(historyDb, { readonly: true }); - const turn = db.prepare(` - SELECT * FROM turns - WHERE conversation_id = ? - ORDER BY turn_number DESC - LIMIT 1 - `).get(conversationId); - - db.close(); - - if (turn) { - // Found in this project - return the projectId - // Note: projectPath is set to projectId since we don't have the original path stored - return { projectPath: projectId, projectId }; - } - } catch { - // Skip this database (might be corrupted or locked) - continue; - } + if (row?.project_root) { + return { + projectPath: row.project_root, + projectId: getProjectId(row.project_root) + }; } } catch { - // Failed to read projects directory + // Skip this database (might be corrupted or locked) + } finally { + db?.close(); } } diff --git a/ccw/src/tools/codex-lens-lsp.ts b/ccw/src/tools/codex-lens-lsp.ts index 45d1127e..14f77ce4 100644 --- a/ccw/src/tools/codex-lens-lsp.ts +++ b/ccw/src/tools/codex-lens-lsp.ts @@ -13,10 +13,10 @@ import type { ToolSchema, ToolResult } from '../types/tool.js'; import { spawn } from 'child_process'; import { join } from 'path'; import { getProjectRoot } from '../utils/path-validator.js'; -import { getCodexLensPython } from '../utils/codexlens-path.js'; +import { getCodexLensHiddenPython } from '../utils/codexlens-path.js'; // CodexLens venv configuration -const CODEXLENS_VENV = getCodexLensPython(); +const CODEXLENS_VENV = getCodexLensHiddenPython(); // Define Zod schema for validation const ParamsSchema = z.object({ @@ -122,8 +122,11 @@ except Exception as e: `; const 
child = spawn(CODEXLENS_VENV, ['-c', pythonScript], {
+ shell: false,
stdio: ['ignore', 'pipe', 'pipe'],
timeout,
+ windowsHide: true,
+ env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
});
let stdout = '';
diff --git a/ccw/src/tools/codex-lens.ts b/ccw/src/tools/codex-lens.ts
index 59a8a6be..36878ce4 100644
--- a/ccw/src/tools/codex-lens.ts
+++ b/ccw/src/tools/codex-lens.ts
@@ -11,10 +11,15 @@ import { z } from 'zod';
import type { ToolSchema, ToolResult } from '../types/tool.js';
-import { spawn, execSync, exec } from 'child_process';
-import { existsSync, mkdirSync, rmSync } from 'fs';
-import { join } from 'path';
-import { getSystemPython, parsePythonVersion, isPythonVersionCompatible } from '../utils/python-utils.js';
+import { spawn, spawnSync, execSync, exec, type SpawnOptions, type SpawnSyncOptionsWithStringEncoding } from 'child_process';
+import { existsSync, mkdirSync, rmSync, statSync } from 'fs';
+import { join, resolve } from 'path';
+import {
+ getSystemPythonCommand,
+ parsePythonVersion,
+ isPythonVersionCompatible,
+ type PythonCommandSpec,
+} from '../utils/python-utils.js';
import { EXEC_TIMEOUTS } from '../utils/exec-constants.js';
import {
UvManager,
@@ -26,6 +31,7 @@ import {
getCodexLensDataDir,
getCodexLensVenvDir,
getCodexLensPython,
+ getCodexLensHiddenPython,
getCodexLensPip,
} from '../utils/codexlens-path.js';
import {
@@ -58,6 +64,10 @@ interface SemanticStatusCache {
let semanticStatusCache: SemanticStatusCache | null = null;
const SEMANTIC_STATUS_TTL = 5 * 60 * 1000; // 5 minutes TTL
+type HiddenCodexLensSpawnSyncOptions = Omit<SpawnSyncOptionsWithStringEncoding, 'encoding'> & {
+ encoding?: BufferEncoding;
+};
+
// Track running indexing process for cancellation
let currentIndexingProcess: ReturnType<typeof spawn> | null = null;
let currentIndexingAborted = false;
@@ -69,13 +79,34 @@ const VENV_CHECK_TIMEOUT = process.platform === 'win32' ? 15000 : 10000;
/**
* Pre-flight check: verify Python 3.9+ is available before attempting bootstrap.
* Returns an error message if Python is not suitable, or null if OK. */ +function probePythonVersion( + pythonCommand: PythonCommandSpec, + runner: typeof spawnSync = spawnSync, +): string { + const result = runner( + pythonCommand.command, + [...pythonCommand.args, '--version'], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PYTHON_VERSION, + }), + ); + + if (result.error) { + throw result.error; + } + + const versionOutput = `${result.stdout ?? ''}${result.stderr ?? ''}`.trim(); + if (result.status !== 0) { + throw new Error(versionOutput || `Python version probe exited with code ${String(result.status)}`); + } + + return versionOutput; +} + function preFlightCheck(): string | null { try { - const pythonCmd = getSystemPython(); - const version = execSync(`${pythonCmd} --version 2>&1`, { - encoding: 'utf8', - timeout: EXEC_TIMEOUTS.PYTHON_VERSION, - }).trim(); + const pythonCommand = getSystemPythonCommand(); + const version = probePythonVersion(pythonCommand); const parsed = parsePythonVersion(version); if (!parsed) { return `Cannot parse Python version from: "${version}". 
Ensure Python 3.9+ is installed.`; @@ -244,7 +275,7 @@ async function checkVenvStatus(force = false): Promise { return result; } - const pythonPath = getCodexLensPython(); + const pythonPath = getCodexLensHiddenPython(); // Check python executable exists if (!existsSync(pythonPath)) { @@ -259,18 +290,21 @@ async function checkVenvStatus(force = false): Promise { console.log('[PERF][CodexLens] checkVenvStatus spawning Python...'); return new Promise((resolve) => { - const child = spawn(pythonPath, ['-c', 'import sys; import codexlens; import watchdog; print(f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"); print(codexlens.__version__)'], { - stdio: ['ignore', 'pipe', 'pipe'], - timeout: VENV_CHECK_TIMEOUT, - }); + const child = spawn( + pythonPath, + ['-c', 'import sys; import codexlens; import watchdog; print(f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"); print(codexlens.__version__)'], + buildCodexLensSpawnOptions(venvPath, VENV_CHECK_TIMEOUT, { + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); let stdout = ''; let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -380,18 +414,21 @@ try: except Exception as e: print(json.dumps({"available": False, "error": str(e)})) `; - const child = spawn(getCodexLensPython(), ['-c', checkCode], { - stdio: ['ignore', 'pipe', 'pipe'], - timeout: 15000, - }); + const child = spawn( + getCodexLensHiddenPython(), + ['-c', checkCode], + buildCodexLensSpawnOptions(getCodexLensVenvDir(), 15000, { + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); let stdout = ''; let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ 
-441,13 +478,16 @@ async function ensureLiteLLMEmbedderReady(): Promise { // Check if ccw_litellm can be imported const importStatus = await new Promise<{ ok: boolean; error?: string }>((resolve) => { - const child = spawn(getCodexLensPython(), ['-c', 'import ccw_litellm; print("OK")'], { - stdio: ['ignore', 'pipe', 'pipe'], - timeout: 15000, - }); + const child = spawn( + getCodexLensHiddenPython(), + ['-c', 'import ccw_litellm; print("OK")'], + buildCodexLensSpawnOptions(getCodexLensVenvDir(), 15000, { + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); let stderr = ''; - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -522,10 +562,19 @@ async function ensureLiteLLMEmbedderReady(): Promise { const venvPython = getCodexLensPython(); console.warn(`[CodexLens] pip not found at: ${pipPath}. Attempting to bootstrap pip with ensurepip...`); try { - execSync(`\"${venvPython}\" -m ensurepip --upgrade`, { - stdio: 'inherit', - timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + const ensurePipResult = spawnSync( + venvPython, + ['-m', 'ensurepip', '--upgrade'], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, + }), + ); + if (ensurePipResult.error) { + throw ensurePipResult.error; + } + if (ensurePipResult.status !== 0) { + throw new Error(`ensurepip exited with code ${String(ensurePipResult.status)}`); + } } catch (err) { console.warn(`[CodexLens] ensurepip failed: ${(err as Error).message}`); } @@ -549,13 +598,36 @@ async function ensureLiteLLMEmbedderReady(): Promise { try { if (localPath) { - const pipFlag = editable ? '-e' : ''; - const pipInstallSpec = editable ? `"${localPath}"` : `"${localPath}"`; + const pipArgs = editable ? 
['install', '-e', localPath] : ['install', localPath]; console.log(`[CodexLens] Installing ccw-litellm from local path with pip: ${localPath} (editable: ${editable})`); - execSync(`"${pipPath}" install ${pipFlag} ${pipInstallSpec}`.replace(/ +/g, ' '), { stdio: 'inherit', timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL }); + const installResult = spawnSync( + pipPath, + pipArgs, + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, + }), + ); + if (installResult.error) { + throw installResult.error; + } + if (installResult.status !== 0) { + throw new Error(`pip install exited with code ${String(installResult.status)}`); + } } else { console.log('[CodexLens] Installing ccw-litellm from PyPI with pip...'); - execSync(`"${pipPath}" install ccw-litellm`, { stdio: 'inherit', timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL }); + const installResult = spawnSync( + pipPath, + ['install', 'ccw-litellm'], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, + }), + ); + if (installResult.error) { + throw installResult.error; + } + if (installResult.status !== 0) { + throw new Error(`pip install exited with code ${String(installResult.status)}`); + } } return { @@ -609,7 +681,7 @@ interface PythonEnvInfo { * DirectML requires: 64-bit Python, version 3.8-3.12 */ async function checkPythonEnvForDirectML(): Promise { - const pythonPath = getCodexLensPython(); + const pythonPath = getCodexLensHiddenPython(); if (!existsSync(pythonPath)) { return { version: '', majorMinor: '', architecture: 0, compatible: false, error: 'Python not found in venv' }; @@ -619,8 +691,19 @@ async function checkPythonEnvForDirectML(): Promise { // Get Python version and architecture in one call // Use % formatting instead of f-string to avoid Windows shell escaping issues with curly braces const checkScript = `import sys, struct; print('%d.%d.%d|%d' % (sys.version_info.major, sys.version_info.minor, sys.version_info.micro, struct.calcsize('P') * 8))`; - const result = 
execSync(`"${pythonPath}" -c "${checkScript}"`, { encoding: 'utf-8', timeout: 10000 }).trim(); - const [version, archStr] = result.split('|'); + const result = spawnSync( + pythonPath, + ['-c', checkScript], + buildCodexLensSpawnSyncOptions({ timeout: 10000 }), + ); + if (result.error) { + throw result.error; + } + const output = `${result.stdout ?? ''}${result.stderr ?? ''}`.trim(); + if (result.status !== 0) { + throw new Error(output || `Python probe exited with code ${String(result.status)}`); + } + const [version, archStr] = output.split('|'); const architecture = parseInt(archStr, 10); const [major, minor] = version.split('.').map(Number); const majorMinor = `${major}.${minor}`; @@ -898,15 +981,18 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise { + installOnnx.stdout?.on('data', (data) => { onnxStdout += data.toString(); const line = data.toString().trim(); if (line.includes('Downloading') || line.includes('Installing')) { @@ -914,7 +1000,7 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise { + installOnnx.stderr?.on('data', (data) => { onnxStderr += data.toString(); }); @@ -927,15 +1013,18 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise { + child.stdout?.on('data', (data) => { stdout += data.toString(); const line = data.toString().trim(); if (line.includes('Downloading') || line.includes('Installing') || line.includes('Collecting')) { @@ -943,7 +1032,7 @@ async function installSemantic(gpuMode: GpuMode = 'cpu'): Promise { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -1028,8 +1117,20 @@ async function bootstrapVenv(): Promise { if (!existsSync(venvDir)) { try { console.log('[CodexLens] Creating virtual environment...'); - const pythonCmd = getSystemPython(); - execSync(`${pythonCmd} -m venv "${venvDir}"`, { stdio: 'inherit', timeout: EXEC_TIMEOUTS.PROCESS_SPAWN }); + const pythonCmd = getSystemPythonCommand(); + const createResult = spawnSync( + pythonCmd.command, + 
[...pythonCmd.args, '-m', 'venv', venvDir], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PROCESS_SPAWN, + }), + ); + if (createResult.error) { + throw createResult.error; + } + if (createResult.status !== 0) { + throw new Error(`venv creation exited with code ${String(createResult.status)}`); + } } catch (err) { return { success: false, @@ -1049,10 +1150,19 @@ async function bootstrapVenv(): Promise { const venvPython = getCodexLensPython(); console.warn(`[CodexLens] pip not found at: ${pipPath}. Attempting to bootstrap pip with ensurepip...`); try { - execSync(`\"${venvPython}\" -m ensurepip --upgrade`, { - stdio: 'inherit', - timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + const ensurePipResult = spawnSync( + venvPython, + ['-m', 'ensurepip', '--upgrade'], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, + }), + ); + if (ensurePipResult.error) { + throw ensurePipResult.error; + } + if (ensurePipResult.status !== 0) { + throw new Error(`ensurepip exited with code ${String(ensurePipResult.status)}`); + } } catch (err) { console.warn(`[CodexLens] ensurepip failed: ${(err as Error).message}`); } @@ -1063,8 +1173,20 @@ async function bootstrapVenv(): Promise { console.warn('[CodexLens] pip still missing after ensurepip; recreating venv with system Python...'); try { rmSync(venvDir, { recursive: true, force: true }); - const pythonCmd = getSystemPython(); - execSync(`${pythonCmd} -m venv \"${venvDir}\"`, { stdio: 'inherit', timeout: EXEC_TIMEOUTS.PROCESS_SPAWN }); + const pythonCmd = getSystemPythonCommand(); + const recreateResult = spawnSync( + pythonCmd.command, + [...pythonCmd.args, '-m', 'venv', venvDir], + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PROCESS_SPAWN, + }), + ); + if (recreateResult.error) { + throw recreateResult.error; + } + if (recreateResult.status !== 0) { + throw new Error(`venv recreation exited with code ${String(recreateResult.status)}`); + } } catch (err) { return { success: false, @@ 
-1090,9 +1212,21 @@ async function bootstrapVenv(): Promise { } const editable = isDevEnvironment() && !discovery.insideNodeModules; - const pipFlag = editable ? ' -e' : ''; + const pipArgs = editable ? ['install', '-e', discovery.path] : ['install', discovery.path]; console.log(`[CodexLens] Installing from local path: ${discovery.path} (editable: ${editable})`); - execSync(`"${pipPath}" install${pipFlag} "${discovery.path}"`, { stdio: 'inherit', timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL }); + const installResult = spawnSync( + pipPath, + pipArgs, + buildCodexLensSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, + }), + ); + if (installResult.error) { + throw installResult.error; + } + if (installResult.status !== 0) { + throw new Error(`pip install exited with code ${String(installResult.status)}`); + } // Clear cache after successful installation clearVenvStatusCache(); @@ -1237,6 +1371,12 @@ function shouldRetryWithoutLanguageFilters(args: string[], error?: string): bool return args.includes('--language') && Boolean(error && /Got unexpected extra arguments?\b/i.test(error)); } +function shouldRetryWithLegacySearchArgs(args: string[], error?: string): boolean { + return args[0] === 'search' + && (args.includes('--limit') || args.includes('--mode') || args.includes('--offset')) + && Boolean(error && /Got unexpected extra arguments?\b/i.test(error)); +} + function stripFlag(args: string[], flag: string): string[] { return args.filter((arg) => arg !== flag); } @@ -1253,6 +1393,29 @@ function stripOptionWithValues(args: string[], option: string): string[] { return nextArgs; } +function stripSearchCompatibilityOptions(args: string[]): string[] { + return stripOptionWithValues( + stripOptionWithValues( + stripOptionWithValues(args, '--offset'), + '--mode', + ), + '--limit', + ); +} + +function appendWarning(existing: string | undefined, next: string | undefined): string | undefined { + if (!next) { + return existing; + } + if (!existing) { + return next; + } + 
if (existing.includes(next)) {
+ return existing;
+ }
+ return `${existing} ${next}`;
+}
+
function shouldRetryWithAstGrepPreference(args: string[], error?: string): boolean {
return !args.includes('--use-astgrep')
&& !args.includes('--no-use-astgrep')
@@ -1334,6 +1497,142 @@ function tryExtractJsonPayload(raw: string): unknown | null {
}
}
+function parseLegacySearchPaths(output: string | undefined, cwd: string): string[] {
+ const lines = stripAnsiCodes(output || '')
+ .split(/\r?\n/)
+ .map((line) => line.trim())
+ .filter(Boolean);
+
+ const filePaths: string[] = [];
+ for (const line of lines) {
+ if (line.includes('RuntimeWarning:') || line.startsWith('warn(') || line.startsWith('Warning:')) {
+ continue;
+ }
+
+ const candidate = /^[a-zA-Z]:[\\/]|^\//.test(line)
+ ? line
+ : resolve(cwd, line);
+
+ try {
+ if (statSync(candidate).isFile()) {
+ filePaths.push(candidate);
+ }
+ } catch {
+ continue;
+ }
+ }
+
+ return [...new Set(filePaths)];
+}
+
+function buildLegacySearchPayload(query: string, filePaths: string[], limit: number): Record<string, unknown> {
+ const results = filePaths.slice(0, limit).map((path, index) => ({
+ path,
+ score: Math.max(0.1, 1 - index * 0.05),
+ excerpt: '',
+ content: '',
+ source: 'legacy_text_output',
+ symbol: null,
+ }));
+
+ return {
+ success: true,
+ result: {
+ query,
+ count: filePaths.length,
+ results,
+ },
+ };
+}
+
+function buildLegacySearchFilesPayload(query: string, filePaths: string[], limit: number): Record<string, unknown> {
+ return {
+ success: true,
+ result: {
+ query,
+ count: filePaths.length,
+ files: filePaths.slice(0, limit),
+ },
+ };
+}
+
+function buildEmptySearchPayload(query: string, filesOnly: boolean): Record<string, unknown> {
+ return filesOnly
+ ?
{ + success: true, + result: { + query, + count: 0, + files: [], + }, + } + : { + success: true, + result: { + query, + count: 0, + results: [], + }, + }; +} + +function normalizeSearchCommandResult( + result: ExecuteResult, + options: { query: string; cwd: string; limit: number; filesOnly: boolean }, +): ExecuteResult { + if (!result.success) { + return result; + } + + const { query, cwd, limit, filesOnly } = options; + const rawOutput = typeof result.output === 'string' ? result.output : ''; + const parsedPayload = rawOutput ? tryExtractJsonPayload(rawOutput) : null; + if (parsedPayload !== null) { + if (filesOnly) { + result.files = parsedPayload; + } else { + result.results = parsedPayload; + } + delete result.output; + return result; + } + + const legacyPaths = parseLegacySearchPaths(rawOutput, cwd); + if (legacyPaths.length > 0) { + const warning = filesOnly + ? 'CodexLens CLI returned legacy plain-text file output; synthesized JSON-compatible search_files results.' + : 'CodexLens CLI returned legacy plain-text search output; synthesized JSON-compatible search results.'; + + if (filesOnly) { + result.files = buildLegacySearchFilesPayload(query, legacyPaths, limit); + } else { + result.results = buildLegacySearchPayload(query, legacyPaths, limit); + } + delete result.output; + result.warning = appendWarning(result.warning, warning); + result.message = appendWarning(result.message, warning); + return result; + } + + const warning = rawOutput.trim() + ? (filesOnly + ? 'CodexLens CLI returned non-JSON search_files output; synthesized an empty JSON-compatible fallback payload.' + : 'CodexLens CLI returned non-JSON search output; synthesized an empty JSON-compatible fallback payload.') + : (filesOnly + ? 'CodexLens CLI returned empty stdout in JSON mode for search_files; synthesized an empty JSON-compatible fallback payload.' 
+ : 'CodexLens CLI returned empty stdout in JSON mode for search; synthesized an empty JSON-compatible fallback payload.'); + + if (filesOnly) { + result.files = buildEmptySearchPayload(query, true); + } else { + result.results = buildEmptySearchPayload(query, false); + } + delete result.output; + result.warning = appendWarning(result.warning, warning); + result.message = appendWarning(result.message, warning); + return result; +} + function extractStructuredError(payload: unknown): string | null { if (!payload || typeof payload !== 'object' || Array.isArray(payload)) { return null; @@ -1394,6 +1693,11 @@ async function executeCodexLens(args: string[], options: ExecuteOptions = {}): P transform: (currentArgs: string[]) => stripOptionWithValues(currentArgs, '--language'), warning: 'CodexLens CLI rejected --language filters; retried without language scoping.', }, + { + shouldRetry: shouldRetryWithLegacySearchArgs, + transform: stripSearchCompatibilityOptions, + warning: 'CodexLens CLI rejected search --limit/--mode compatibility flags; retried with minimal legacy search args.', + }, { shouldRetry: shouldRetryWithAstGrepPreference, transform: (currentArgs: string[]) => [...currentArgs, '--use-astgrep'], @@ -1441,6 +1745,32 @@ async function executeCodexLens(args: string[], options: ExecuteOptions = {}): P }; } +function buildCodexLensSpawnOptions(cwd: string, timeout: number, overrides: SpawnOptions = {}): SpawnOptions { + const { env, ...rest } = overrides; + return { + cwd, + shell: false, + timeout, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env }, + ...rest, + }; +} + +function buildCodexLensSpawnSyncOptions( + overrides: HiddenCodexLensSpawnSyncOptions = {}, +): SpawnSyncOptionsWithStringEncoding { + const { env, encoding, ...rest } = overrides; + return { + shell: false, + windowsHide: true, + stdio: ['ignore', 'pipe', 'pipe'], + env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env }, + ...rest, + encoding: encoding ?? 
'utf8',
+ };
+}
+
async function executeCodexLensOnce(args: string[], options: ExecuteOptions = {}): Promise<ExecuteResult> {
const { timeout = 300000, cwd = process.cwd(), onProgress } = options; // Default 5 min
@@ -1456,13 +1786,7 @@ async function executeCodexLensOnce(args: string[], options: ExecuteOptions = {}
// spawn's cwd option handles drive changes correctly on Windows
const spawnArgs = ['-m', 'codexlens', ...args];
- const child = spawn(getCodexLensPython(), spawnArgs, {
- cwd,
- shell: false, // CRITICAL: Prevent command injection
- timeout,
- // Ensure proper encoding on Windows
- env: { ...process.env, PYTHONIOENCODING: 'utf-8' },
- });
+ const child = spawn(getCodexLensHiddenPython(), spawnArgs, buildCodexLensSpawnOptions(cwd, timeout));
// Track indexing process for cancellation (only for init commands)
const isIndexingCommand = args.includes('init');
@@ -1566,13 +1890,22 @@ async function executeCodexLensOnce(args: string[], options: ExecuteOptions = {}
}
}
+ const trimmedStdout = stdout.trim();
if (code === 0) {
- safeResolve({ success: true, output: stdout.trim() });
+ const warning = args.includes('--json') && trimmedStdout.length === 0
+ ? `CodexLens CLI exited successfully but produced empty stdout in JSON mode for ${args[0] ??
'command'}.` + : undefined; + safeResolve({ + success: true, + output: trimmedStdout || undefined, + warning, + message: warning, + }); } else { safeResolve({ success: false, error: extractCodexLensFailure(stdout, stderr, code), - output: stdout.trim() || undefined, + output: trimmedStdout || undefined, }); } }); @@ -1627,18 +1960,12 @@ async function searchCode(params: Params): Promise { args.push('--enrich'); } - const result = await executeCodexLens(args, { cwd: path }); - - if (result.success && result.output) { - try { - result.results = JSON.parse(result.output); - delete result.output; - } catch { - // Keep raw output if JSON parse fails - } - } - - return result; + return normalizeSearchCommandResult(await executeCodexLens(args, { cwd: path }), { + query, + cwd: path, + limit, + filesOnly: false, + }); } /** @@ -1672,18 +1999,12 @@ async function searchFiles(params: Params): Promise { args.push('--enrich'); } - const result = await executeCodexLens(args, { cwd: path }); - - if (result.success && result.output) { - try { - result.files = JSON.parse(result.output); - delete result.output; - } catch { - // Keep raw output if JSON parse fails - } - } - - return result; + return normalizeSearchCommandResult(await executeCodexLens(args, { cwd: path }), { + query, + cwd: path, + limit, + filesOnly: true, + }); } /** @@ -2185,11 +2506,18 @@ export { // Export Python path for direct spawn usage (e.g., watcher) export function getVenvPythonPath(): string { - return getCodexLensPython(); + return getCodexLensHiddenPython(); } export type { GpuMode, PythonEnvInfo }; +export const __testables = { + normalizeSearchCommandResult, + parseLegacySearchPaths, + buildCodexLensSpawnOptions, + probePythonVersion, +}; + // Backward-compatible export for tests export const codexLensTool = { name: schema.name, diff --git a/ccw/src/tools/litellm-client.ts b/ccw/src/tools/litellm-client.ts index 924b3c6b..e2063efa 100644 --- a/ccw/src/tools/litellm-client.ts +++ 
b/ccw/src/tools/litellm-client.ts @@ -12,7 +12,7 @@ import { spawn } from 'child_process'; import { existsSync } from 'fs'; import { join } from 'path'; -import { getCodexLensPython, getCodexLensVenvDir } from '../utils/codexlens-path.js'; +import { getCodexLensPython, getCodexLensHiddenPython, getCodexLensVenvDir } from '../utils/codexlens-path.js'; export interface LiteLLMConfig { pythonPath?: string; // Default: CodexLens venv Python @@ -24,7 +24,7 @@ export interface LiteLLMConfig { const IS_WINDOWS = process.platform === 'win32'; const CODEXLENS_VENV = getCodexLensVenvDir(); const VENV_BIN_DIR = IS_WINDOWS ? 'Scripts' : 'bin'; -const PYTHON_EXECUTABLE = IS_WINDOWS ? 'python.exe' : 'python'; +const PYTHON_EXECUTABLE = IS_WINDOWS ? 'pythonw.exe' : 'python'; /** * Get the Python path from CodexLens venv @@ -36,6 +36,10 @@ export function getCodexLensVenvPython(): string { if (existsSync(venvPython)) { return venvPython; } + const hiddenPython = getCodexLensHiddenPython(); + if (existsSync(hiddenPython)) { + return hiddenPython; + } // Fallback to system Python if venv not available return 'python'; } @@ -46,10 +50,14 @@ export function getCodexLensVenvPython(): string { * @returns Path to Python executable */ export function getCodexLensPythonPath(): string { - const codexLensPython = getCodexLensPython(); + const codexLensPython = getCodexLensHiddenPython(); if (existsSync(codexLensPython)) { return codexLensPython; } + const fallbackPython = getCodexLensPython(); + if (existsSync(fallbackPython)) { + return fallbackPython; + } // Fallback to system Python if venv not available return 'python'; } @@ -100,8 +108,10 @@ export class LiteLLMClient { return new Promise((resolve, reject) => { const proc = spawn(this.pythonPath, ['-m', 'ccw_litellm.cli', ...args], { + shell: false, + windowsHide: true, stdio: ['pipe', 'pipe', 'pipe'], - env: { ...process.env } + env: { ...process.env, PYTHONIOENCODING: 'utf-8' } }); let stdout = ''; diff --git 
a/ccw/src/tools/smart-search.ts b/ccw/src/tools/smart-search.ts index 18b2e26c..7e0c487e 100644 --- a/ccw/src/tools/smart-search.ts +++ b/ccw/src/tools/smart-search.ts @@ -20,7 +20,7 @@ import { z } from 'zod'; import type { ToolSchema, ToolResult } from '../types/tool.js'; -import { spawn, execSync } from 'child_process'; +import { spawn, spawnSync, type SpawnOptions } from 'child_process'; import { existsSync, readFileSync, statSync } from 'fs'; import { dirname, join, resolve } from 'path'; import { @@ -346,8 +346,12 @@ interface SearchMetadata { api_max_workers?: number; endpoint_count?: number; use_gpu?: boolean; + reranker_enabled?: boolean; + reranker_backend?: string; + reranker_model?: string; cascade_strategy?: string; staged_stage2_mode?: string; + static_graph_enabled?: boolean; preset?: string; } @@ -474,8 +478,52 @@ const CODEX_LENS_FTS_COMPATIBILITY_PATTERNS = [ ]; let codexLensFtsBackendBroken = false; +const autoInitJobs = new Map(); const autoEmbedJobs = new Map(); +type SmartSearchRuntimeOverrides = { + checkSemanticStatus?: typeof checkSemanticStatus; + getVenvPythonPath?: typeof getVenvPythonPath; + spawnProcess?: typeof spawn; + now?: () => number; +}; + +const runtimeOverrides: SmartSearchRuntimeOverrides = {}; + +function getSemanticStatusRuntime(): typeof checkSemanticStatus { + return runtimeOverrides.checkSemanticStatus ?? checkSemanticStatus; +} + +function getVenvPythonPathRuntime(): typeof getVenvPythonPath { + return runtimeOverrides.getVenvPythonPath ?? getVenvPythonPath; +} + +function getSpawnRuntime(): typeof spawn { + return runtimeOverrides.spawnProcess ?? spawn; +} + +function getNowRuntime(): number { + return (runtimeOverrides.now ?? 
Date.now)();
+}
+
+function buildSmartSearchSpawnOptions(cwd: string, overrides: SpawnOptions = {}): SpawnOptions {
+ const { env, ...rest } = overrides;
+ return {
+ cwd,
+ shell: false,
+ windowsHide: true,
+ env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env },
+ ...rest,
+ };
+}
+
+function shouldDetachBackgroundSmartSearchProcess(): boolean {
+ // On Windows, detached Python children can still create a transient console
+ // window even when windowsHide is set. Background warmup only needs to outlive
+ // the current request, not the MCP server process.
+ return process.platform !== 'win32';
+}
+
/**
* Truncate content to specified length with ellipsis
* @param content - The content to truncate
@@ -523,6 +571,58 @@ interface RipgrepQueryModeResolution {
warning?: string;
}
+const GENERATED_QUERY_RE = /(? 0) {
+ return false;
+ }
+
+ return options.compatibilityTriggeredThisQuery || options.skipExactDueToCompatibility;
+}
+
function summarizeBackendError(error: string | undefined): string {
const cleanError = stripAnsi(error || '').trim();
if (!cleanError) {
@@ -765,6 +877,61 @@ function hasCentralizedVectorArtifacts(indexRoot: unknown): boolean {
].every((artifactPath) => existsSync(artifactPath));
}
+function asObjectRecord(value: unknown): Record<string, unknown> | undefined {
+ if (!value || typeof value !== 'object' || Array.isArray(value)) {
+ return undefined;
+ }
+ return value as Record<string, unknown>;
+}
+
+function asFiniteNumber(value: unknown): number | undefined {
+ if (typeof value !== 'number' || !Number.isFinite(value)) {
+ return undefined;
+ }
+ return value;
+}
+
+function asBoolean(value: unknown): boolean | undefined {
+ return typeof value === 'boolean' ? value : undefined;
+}
+
+function extractEmbeddingsStatusSummary(embeddingsData: unknown): {
+ coveragePercent: number;
+ totalChunks: number;
+ hasEmbeddings: boolean;
+} {
+ const embeddings = asObjectRecord(embeddingsData) ?? {};
+ const root = asObjectRecord(embeddings.root) ??
embeddings;
+ const centralized = asObjectRecord(embeddings.centralized);
+
+ const totalIndexes = asFiniteNumber(root.total_indexes)
+ ?? asFiniteNumber(embeddings.total_indexes)
+ ?? 0;
+ const indexesWithEmbeddings = asFiniteNumber(root.indexes_with_embeddings)
+ ?? asFiniteNumber(embeddings.indexes_with_embeddings)
+ ?? 0;
+ const totalChunks = asFiniteNumber(root.total_chunks)
+ ?? asFiniteNumber(embeddings.total_chunks)
+ ?? 0;
+ const coveragePercent = asFiniteNumber(root.coverage_percent)
+ ?? asFiniteNumber(embeddings.coverage_percent)
+ ?? (totalIndexes > 0 ? (indexesWithEmbeddings / totalIndexes) * 100 : 0);
+ const hasEmbeddings = asBoolean(root.has_embeddings)
+ ?? asBoolean(centralized?.usable)
+ ?? (totalChunks > 0 || indexesWithEmbeddings > 0 || coveragePercent > 0);
+
+ return {
+ coveragePercent,
+ totalChunks,
+ hasEmbeddings,
+ };
+}
+
+function selectEmbeddingsStatusPayload(statusData: unknown): Record<string, unknown> {
+ const status = asObjectRecord(statusData) ?? {};
+ return asObjectRecord(status.embeddings_status) ?? asObjectRecord(status.embeddings) ??
{}; +} + function collectBackendError( errors: string[], backendName: string, @@ -825,8 +992,77 @@ function formatSmartSearchCommand(action: string, pathValue: string, extraParams return `smart_search(${args.join(', ')})`; } +function parseOptionalBooleanEnv(raw: string | undefined): boolean | undefined { + const normalized = raw?.trim().toLowerCase(); + if (!normalized) { + return undefined; + } + + if (['1', 'true', 'on', 'yes'].includes(normalized)) { + return true; + } + + if (['0', 'false', 'off', 'no'].includes(normalized)) { + return false; + } + + return undefined; +} + function isAutoEmbedMissingEnabled(config: CodexLensConfig | null | undefined): boolean { - return config?.embedding_auto_embed_missing !== false; + const envOverride = parseOptionalBooleanEnv(process.env.CODEXLENS_AUTO_EMBED_MISSING); + if (envOverride !== undefined) { + return envOverride; + } + + if (process.platform === 'win32') { + return false; + } + + if (typeof config?.embedding_auto_embed_missing === 'boolean') { + return config.embedding_auto_embed_missing; + } + + return true; +} + +function isAutoInitMissingEnabled(): boolean { + const envOverride = parseOptionalBooleanEnv(process.env.CODEXLENS_AUTO_INIT_MISSING); + if (envOverride !== undefined) { + return envOverride; + } + + return process.platform !== 'win32'; +} + +function getAutoEmbedMissingDisabledReason(config: CodexLensConfig | null | undefined): string { + const envOverride = parseOptionalBooleanEnv(process.env.CODEXLENS_AUTO_EMBED_MISSING); + if (envOverride === false) { + return 'Automatic embedding warmup is disabled by CODEXLENS_AUTO_EMBED_MISSING=false.'; + } + + if (config?.embedding_auto_embed_missing === false) { + return 'Automatic embedding warmup is disabled by embedding.auto_embed_missing=false.'; + } + + if (process.platform === 'win32') { + return 'Automatic embedding warmup is disabled by default on Windows even if CodexLens config resolves auto_embed_missing=true. 
Set CODEXLENS_AUTO_EMBED_MISSING=true to opt in.'; + } + + return 'Automatic embedding warmup is disabled.'; +} + +function getAutoInitMissingDisabledReason(): string { + const envOverride = parseOptionalBooleanEnv(process.env.CODEXLENS_AUTO_INIT_MISSING); + if (envOverride === false) { + return 'Automatic static index warmup is disabled by CODEXLENS_AUTO_INIT_MISSING=false.'; + } + + if (process.platform === 'win32') { + return 'Automatic static index warmup is disabled by default on Windows. Set CODEXLENS_AUTO_INIT_MISSING=true to opt in.'; + } + + return 'Automatic static index warmup is disabled.'; } function buildIndexSuggestions(indexStatus: IndexStatus, scope: SearchScope): SearchSuggestion[] | undefined { @@ -930,29 +1166,24 @@ async function checkIndexStatus(path: string = '.'): Promise { const status = parsed.result || parsed; // Get embeddings coverage from comprehensive status - const embeddingsData = status.embeddings || {}; - const totalIndexes = Number(embeddingsData.total_indexes || 0); - const indexesWithEmbeddings = Number(embeddingsData.indexes_with_embeddings || 0); - const totalChunks = Number(embeddingsData.total_chunks || 0); - const hasCentralizedVectors = hasCentralizedVectorArtifacts(status.index_root); - let embeddingsCoverage = typeof embeddingsData.coverage_percent === 'number' - ? embeddingsData.coverage_percent - : (totalIndexes > 0 ? (indexesWithEmbeddings / totalIndexes) * 100 : 0); - if (hasCentralizedVectors) { - embeddingsCoverage = Math.max(embeddingsCoverage, 100); - } + const embeddingsData = selectEmbeddingsStatusPayload(status); + const legacyEmbeddingsData = asObjectRecord(status.embeddings) ?? 
{}; + const embeddingsSummary = extractEmbeddingsStatusSummary(embeddingsData); + const totalIndexes = Number(legacyEmbeddingsData.total_indexes || asObjectRecord(embeddingsData)?.total_indexes || 0); + const embeddingsCoverage = embeddingsSummary.coveragePercent; + const totalChunks = embeddingsSummary.totalChunks; const indexed = Boolean(status.projects_count > 0 || status.total_files > 0 || status.index_root || totalIndexes > 0 || totalChunks > 0); - const has_embeddings = indexesWithEmbeddings > 0 || embeddingsCoverage > 0 || totalChunks > 0 || hasCentralizedVectors; + const has_embeddings = embeddingsSummary.hasEmbeddings; // Extract model info if available - const modelInfoData = embeddingsData.model_info; + const modelInfoData = asObjectRecord(embeddingsData.model_info); const modelInfo: ModelInfo | undefined = modelInfoData ? { - model_profile: modelInfoData.model_profile, - model_name: modelInfoData.model_name, - embedding_dim: modelInfoData.embedding_dim, - backend: modelInfoData.backend, - created_at: modelInfoData.created_at, - updated_at: modelInfoData.updated_at, + model_profile: typeof modelInfoData.model_profile === 'string' ? modelInfoData.model_profile : undefined, + model_name: typeof modelInfoData.model_name === 'string' ? modelInfoData.model_name : undefined, + embedding_dim: typeof modelInfoData.embedding_dim === 'number' ? modelInfoData.embedding_dim : undefined, + backend: typeof modelInfoData.backend === 'string' ? modelInfoData.backend : undefined, + created_at: typeof modelInfoData.created_at === 'string' ? modelInfoData.created_at : undefined, + updated_at: typeof modelInfoData.updated_at === 'string' ? 
modelInfoData.updated_at : undefined, } : undefined; let warning: string | undefined; @@ -1039,6 +1270,39 @@ function looksLikeCodeQuery(query: string): boolean { return false; } +function queryTargetsGeneratedFiles(query: string): boolean { + return GENERATED_QUERY_RE.test(query.trim()); +} + +function prefersLexicalPriorityQuery(query: string): boolean { + const trimmed = query.trim(); + if (!trimmed) return false; + if (ENV_STYLE_QUERY_RE.test(trimmed)) return true; + + const tokens = new Set((trimmed.match(TOPIC_TOKEN_RE) ?? []).map((token) => token.toLowerCase())); + if (tokens.size === 0) return false; + if (tokens.has('factory') || tokens.has('factories')) return true; + if ((tokens.has('environment') || tokens.has('env')) && (tokens.has('variable') || tokens.has('variables'))) { + return true; + } + if ( + tokens.has('backend') && + ['embedding', 'embeddings', 'reranker', 'rerankers', 'onnx', 'api', 'litellm', 'fastembed', 'local', 'legacy'] + .some((token) => tokens.has(token)) + ) { + return true; + } + + let surfaceHit = false; + let focusHit = false; + for (const token of tokens) { + if (LEXICAL_PRIORITY_SURFACE_TOKENS.has(token)) surfaceHit = true; + if (LEXICAL_PRIORITY_FOCUS_TOKENS.has(token)) focusHit = true; + if (surfaceHit && focusHit) return true; + } + return false; +} + /** * Classify query intent and recommend search mode * Simple mapping: hybrid (NL + index + embeddings) | exact (index or insufficient embeddings) | ripgrep (no index) @@ -1051,6 +1315,8 @@ function classifyIntent(query: string, hasIndex: boolean = false, hasSufficientE const isNaturalLanguage = detectNaturalLanguage(query); const isCodeQuery = looksLikeCodeQuery(query); const isRegexPattern = detectRegex(query); + const targetsGeneratedFiles = queryTargetsGeneratedFiles(query); + const prefersLexicalPriority = prefersLexicalPriorityQuery(query); let mode: string; let confidence: number; @@ -1058,9 +1324,9 @@ function classifyIntent(query: string, hasIndex: boolean = false, 
hasSufficientE if (!hasIndex) { mode = 'ripgrep'; confidence = 1.0; - } else if (isCodeQuery || isRegexPattern) { + } else if (targetsGeneratedFiles || prefersLexicalPriority || isCodeQuery || isRegexPattern) { mode = 'exact'; - confidence = 0.95; + confidence = targetsGeneratedFiles ? 0.97 : prefersLexicalPriority ? 0.93 : 0.95; } else if (isNaturalLanguage && hasSufficientEmbeddings) { mode = 'hybrid'; confidence = 0.9; @@ -1075,6 +1341,8 @@ function classifyIntent(query: string, hasIndex: boolean = false, hasSufficientE if (detectNaturalLanguage(query)) detectedPatterns.push('natural language'); if (detectFilePath(query)) detectedPatterns.push('file path'); if (detectRelationship(query)) detectedPatterns.push('relationship'); + if (targetsGeneratedFiles) detectedPatterns.push('generated artifact'); + if (prefersLexicalPriority) detectedPatterns.push('lexical priority'); if (isCodeQuery) detectedPatterns.push('code identifier'); const reasoning = `Query classified as ${mode} (confidence: ${confidence.toFixed(2)}, detected: ${detectedPatterns.join(', ')}, index: ${hasIndex ? 'available' : 'not available'}, embeddings: ${hasSufficientEmbeddings ? 'sufficient' : 'insufficient'})`; @@ -1087,12 +1355,21 @@ function classifyIntent(query: string, hasIndex: boolean = false, hasSufficientE * @param toolName - Tool executable name * @returns True if available */ -function checkToolAvailability(toolName: string): boolean { +function checkToolAvailability( + toolName: string, + lookupRuntime: typeof spawnSync = spawnSync, +): boolean { try { const isWindows = process.platform === 'win32'; const command = isWindows ? 
'where' : 'which'; - execSync(`${command} ${toolName}`, { stdio: 'ignore', timeout: EXEC_TIMEOUTS.SYSTEM_INFO }); - return true; + const result = lookupRuntime(command, [toolName], { + shell: false, + windowsHide: true, + stdio: 'ignore', + timeout: EXEC_TIMEOUTS.SYSTEM_INFO, + env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, + }); + return !result.error && result.status === 0; } catch { return false; } @@ -1330,6 +1607,23 @@ function normalizeEmbeddingBackend(backend?: string): string | undefined { return normalized; } +function buildIndexInitArgs(projectPath: string, options: { force?: boolean; languages?: string[]; noEmbeddings?: boolean } = {}): string[] { + const { force = false, languages, noEmbeddings = true } = options; + const args = ['index', 'init', projectPath]; + + if (noEmbeddings) { + args.push('--no-embeddings'); + } + if (force) { + args.push('--force'); + } + if (languages && languages.length > 0) { + args.push(...languages.flatMap((language) => ['--language', language])); + } + + return args; +} + function resolveEmbeddingSelection( requestedBackend: string | undefined, requestedModel: string | undefined, @@ -1502,17 +1796,17 @@ function spawnBackgroundEmbeddingsViaPython(params: { }): { success: boolean; error?: string } { const { projectPath, backend, model } = params; try { - const child = spawn(getVenvPythonPath(), ['-c', buildEmbeddingPythonCode(params)], { - cwd: projectPath, - shell: false, - detached: true, - stdio: 'ignore', - windowsHide: true, - env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, - }); + const child = getSpawnRuntime()( + getVenvPythonPathRuntime()(), + ['-c', buildEmbeddingPythonCode(params)], + buildSmartSearchSpawnOptions(projectPath, { + detached: shouldDetachBackgroundSmartSearchProcess(), + stdio: 'ignore', + }), + ); autoEmbedJobs.set(projectPath, { - startedAt: Date.now(), + startedAt: getNowRuntime(), backend, model, }); @@ -1532,6 +1826,84 @@ function spawnBackgroundEmbeddingsViaPython(params: { } } 
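The hunk above factors the `index init` argv construction into a single `buildIndexInitArgs` helper, so the foreground `executeInitAction` path and the new background `spawnBackgroundIndexInit` warmup build identical command lines. A minimal standalone sketch of that builder (names and defaults copied from the diff; the surrounding `codexlens` CLI is assumed, not shown):

```typescript
// Sketch of the buildIndexInitArgs helper introduced in the diff:
// collapses an options object into a flat argv for `codexlens index init`,
// with --no-embeddings on by default (an FTS-only index builds faster).
function buildIndexInitArgs(
  projectPath: string,
  options: { force?: boolean; languages?: string[]; noEmbeddings?: boolean } = {},
): string[] {
  const { force = false, languages, noEmbeddings = true } = options;
  const args = ['index', 'init', projectPath];

  if (noEmbeddings) {
    args.push('--no-embeddings');
  }
  if (force) {
    args.push('--force'); // force full rebuild
  }
  if (languages && languages.length > 0) {
    // Each requested language becomes a repeated --language flag.
    args.push(...languages.flatMap((language) => ['--language', language]));
  }

  return args;
}

console.log(buildIndexInitArgs('/repo', { force: true, languages: ['ts', 'py'] }).join(' '));
// → index init /repo --no-embeddings --force --language ts --language py
```

Centralizing the argv this way is what lets the later `executeInitAction` hunk shrink to one call, and keeps the detached background warmup from drifting out of sync with the interactive flags.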
+function spawnBackgroundIndexInit(params: { + projectPath: string; + languages?: string[]; +}): { success: boolean; error?: string } { + const { projectPath, languages } = params; + try { + const pythonPath = getVenvPythonPathRuntime()(); + if (!existsSync(pythonPath)) { + return { + success: false, + error: 'CodexLens Python environment is not ready yet.', + }; + } + + const child = getSpawnRuntime()( + pythonPath, + ['-m', 'codexlens', ...buildIndexInitArgs(projectPath, { languages })], + buildSmartSearchSpawnOptions(projectPath, { + detached: shouldDetachBackgroundSmartSearchProcess(), + stdio: 'ignore', + }), + ); + + autoInitJobs.set(projectPath, { + startedAt: getNowRuntime(), + languages, + }); + + const cleanup = () => { + autoInitJobs.delete(projectPath); + }; + child.on('error', cleanup); + child.on('close', cleanup); + child.unref(); + return { success: true }; + } catch (error) { + return { + success: false, + error: error instanceof Error ? error.message : String(error), + }; + } +} + +async function maybeStartBackgroundAutoInit( + scope: SearchScope, + indexStatus: IndexStatus, +): Promise<{ note?: string; warning?: string }> { + if (indexStatus.indexed) { + return {}; + } + + if (!isAutoInitMissingEnabled()) { + return { + note: getAutoInitMissingDisabledReason(), + }; + } + + if (autoInitJobs.has(scope.workingDirectory)) { + return { + note: 'Background static index build is already running for this path.', + }; + } + + const spawned = spawnBackgroundIndexInit({ + projectPath: scope.workingDirectory, + }); + + if (!spawned.success) { + return { + warning: `Automatic static index warmup could not start: ${spawned.error}`, + }; + } + + return { + note: 'Background static index build started for this path. 
Re-run search shortly for indexed FTS results.', + }; +} + async function maybeStartBackgroundAutoEmbed( scope: SearchScope, indexStatus: IndexStatus, @@ -1542,7 +1914,7 @@ async function maybeStartBackgroundAutoEmbed( if (!isAutoEmbedMissingEnabled(indexStatus.config)) { return { - note: 'Automatic embedding warmup is disabled by CODEXLENS_AUTO_EMBED_MISSING=false.', + note: getAutoEmbedMissingDisabledReason(indexStatus.config), }; } @@ -1554,7 +1926,7 @@ async function maybeStartBackgroundAutoEmbed( const backend = normalizeEmbeddingBackend(indexStatus.config?.embedding_backend) ?? 'fastembed'; const model = indexStatus.config?.embedding_model?.trim() || undefined; - const semanticStatus = await checkSemanticStatus(); + const semanticStatus = await getSemanticStatusRuntime()(); if (!semanticStatus.available) { return { warning: 'Automatic embedding warmup skipped because semantic dependencies are not ready.', @@ -1604,18 +1976,19 @@ async function executeEmbeddingsViaPython(params: { const pythonCode = buildEmbeddingPythonCode(params); return await new Promise((resolve) => { - const child = spawn(getVenvPythonPath(), ['-c', pythonCode], { - cwd: projectPath, - shell: false, - timeout: 1800000, - env: { ...process.env, PYTHONIOENCODING: 'utf-8' }, - }); + const child = getSpawnRuntime()( + getVenvPythonPathRuntime()(), + ['-c', pythonCode], + buildSmartSearchSpawnOptions(projectPath, { + timeout: 1800000, + }), + ); let stdout = ''; let stderr = ''; const progressMessages: string[] = []; - child.stdout.on('data', (data: Buffer) => { + child.stdout?.on('data', (data: Buffer) => { const chunk = data.toString(); stdout += chunk; for (const line of chunk.split(/\r?\n/)) { @@ -1625,7 +1998,7 @@ async function executeEmbeddingsViaPython(params: { } }); - child.stderr.on('data', (data: Buffer) => { + child.stderr?.on('data', (data: Buffer) => { stderr += data.toString(); }); @@ -1683,13 +2056,7 @@ async function executeInitAction(params: Params, force: boolean = false): 
Promis // Build args with --no-embeddings for FTS-only index (faster) // Use 'index init' subcommand (new CLI structure) - const args = ['index', 'init', scope.workingDirectory, '--no-embeddings']; - if (force) { - args.push('--force'); // Force full rebuild - } - if (languages && languages.length > 0) { - args.push(...languages.flatMap((language) => ['--language', language])); - } + const args = buildIndexInitArgs(scope.workingDirectory, { force, languages }); // Track progress updates const progressUpdates: ProgressInfo[] = []; @@ -1805,8 +2172,12 @@ async function executeEmbedAction(params: Params): Promise { api_max_workers: normalizedBackend === 'litellm' ? effectiveApiMaxWorkers : undefined, endpoint_count: endpoints.length, use_gpu: true, + reranker_enabled: currentStatus.config?.reranker_enabled, + reranker_backend: currentStatus.config?.reranker_backend, + reranker_model: currentStatus.config?.reranker_model, cascade_strategy: currentStatus.config?.cascade_strategy, staged_stage2_mode: currentStatus.config?.staged_stage2_mode, + static_graph_enabled: currentStatus.config?.static_graph_enabled, note: [embeddingSelection.note, progressMessage].filter(Boolean).join(' | ') || undefined, preset: embeddingSelection.preset, }, @@ -1856,6 +2227,9 @@ async function executeStatusAction(params: Params): Promise { if (cfg.staged_stage2_mode) { statusParts.push(`Stage2: ${cfg.staged_stage2_mode}`); } + if (typeof cfg.static_graph_enabled === 'boolean') { + statusParts.push(`Static Graph: ${cfg.static_graph_enabled ? 
'on' : 'off'}`); + } // Reranker info if (cfg.reranker_enabled) { @@ -1874,6 +2248,12 @@ async function executeStatusAction(params: Params): Promise { action: 'status', path: scope.workingDirectory, warning: indexStatus.warning, + reranker_enabled: indexStatus.config?.reranker_enabled, + reranker_backend: indexStatus.config?.reranker_backend, + reranker_model: indexStatus.config?.reranker_model, + cascade_strategy: indexStatus.config?.cascade_strategy, + staged_stage2_mode: indexStatus.config?.staged_stage2_mode, + static_graph_enabled: indexStatus.config?.static_graph_enabled, suggestions: buildIndexSuggestions(indexStatus, scope), }, }; @@ -2026,6 +2406,7 @@ async function executeFuzzyMode(params: Params): Promise { const ftsWasBroken = codexLensFtsBackendBroken; const ripgrepQueryMode = resolveRipgrepQueryMode(query, regex, tokenize); const fuzzyWarnings: string[] = []; + const skipExactDueToCompatibility = ftsWasBroken && !ripgrepQueryMode.literalFallback; let skipExactReason: string | undefined; if (ripgrepQueryMode.literalFallback) { @@ -2043,10 +2424,7 @@ async function executeFuzzyMode(params: Params): Promise { ]); timer.mark('parallel_search'); - if (!skipExactReason && !ftsWasBroken && codexLensFtsBackendBroken) { - fuzzyWarnings.push('CodexLens FTS backend is incompatible with the current CLI runtime. Falling back to ripgrep results.'); - } - if (skipExactReason) { + if (skipExactReason && !skipExactDueToCompatibility) { fuzzyWarnings.push(skipExactReason); } if (ripgrepResult.status === 'fulfilled' && ripgrepResult.value.metadata?.warning) { @@ -2070,6 +2448,16 @@ async function executeFuzzyMode(params: Params): Promise { resultsMap.set('ripgrep', ripgrepResult.value.results as any[]); } + const ripgrepResultCount = (resultsMap.get('ripgrep') ?? 
[]).length; + const compatibilityTriggeredThisQuery = !skipExactReason && !ftsWasBroken && codexLensFtsBackendBroken; + if (shouldSurfaceCodexLensFtsCompatibilityWarning({ + compatibilityTriggeredThisQuery, + skipExactDueToCompatibility, + ripgrepResultCount, + })) { + fuzzyWarnings.push('CodexLens FTS backend is incompatible with the current CLI runtime. Falling back to ripgrep results.'); + } + // If both failed, return error if (resultsMap.size === 0) { const errors: string[] = []; @@ -2286,20 +2674,23 @@ async function executeRipgrepMode(params: Params): Promise { }); return new Promise((resolve) => { - const child = spawn(command, args, { - cwd: scope.workingDirectory || getProjectRoot(), - stdio: ['ignore', 'pipe', 'pipe'], - }); + const child = getSpawnRuntime()( + command, + args, + buildSmartSearchSpawnOptions(scope.workingDirectory || getProjectRoot(), { + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); let stdout = ''; let stderr = ''; let resultLimitReached = false; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -3484,19 +3875,22 @@ async function executeFindFilesAction(params: Params): Promise { } } - const child = spawn('rg', args, { - cwd: scope.workingDirectory || getProjectRoot(), - stdio: ['ignore', 'pipe', 'pipe'], - }); + const child = getSpawnRuntime()( + 'rg', + args, + buildSmartSearchSpawnOptions(scope.workingDirectory || getProjectRoot(), { + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); let stdout = ''; let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -3800,6 +4194,12 @@ function enrichMetadataWithIndexStatus( nextMetadata.index_status = indexStatus.indexed ? (indexStatus.has_embeddings ? 
'indexed' : 'partial') : 'not_indexed'; + nextMetadata.reranker_enabled = indexStatus.config?.reranker_enabled; + nextMetadata.reranker_backend = indexStatus.config?.reranker_backend; + nextMetadata.reranker_model = indexStatus.config?.reranker_model; + nextMetadata.cascade_strategy = indexStatus.config?.cascade_strategy; + nextMetadata.staged_stage2_mode = indexStatus.config?.staged_stage2_mode; + nextMetadata.static_graph_enabled = indexStatus.config?.static_graph_enabled; nextMetadata.warning = mergeWarnings(nextMetadata.warning, indexStatus.warning); nextMetadata.suggestions = mergeSuggestions(nextMetadata.suggestions, buildIndexSuggestions(indexStatus, scope)); return nextMetadata; @@ -3890,7 +4290,7 @@ export async function handler(params: Record): Promise): Promise): Promise 0) { advisoryLines.push('', 'Suggestions:'); @@ -3972,13 +4373,40 @@ export async function handler(params: Record): Promise) { + Object.assign(runtimeOverrides, overrides); + }, + __resetRuntimeOverrides() { + for (const key of Object.keys(runtimeOverrides) as Array) { + delete runtimeOverrides[key]; + } + }, + __resetBackgroundJobs() { + autoInitJobs.clear(); + autoEmbedJobs.clear(); + }, }; export async function executeInitWithProgress( diff --git a/ccw/src/utils/codexlens-path.ts b/ccw/src/utils/codexlens-path.ts index aadffba8..88ae5cbc 100644 --- a/ccw/src/utils/codexlens-path.ts +++ b/ccw/src/utils/codexlens-path.ts @@ -9,6 +9,7 @@ * 2. Default: ~/.codexlens */ +import { existsSync } from 'fs'; import { join } from 'path'; import { homedir } from 'os'; @@ -47,6 +48,26 @@ export function getCodexLensPython(): string { : join(venvDir, 'bin', 'python'); } +/** + * Get the preferred Python executable for hidden/windowless CodexLens subprocesses. + * On Windows this prefers pythonw.exe when available to avoid transient console windows. 
+ * + * @returns Path to the preferred hidden-subprocess Python executable + */ +export function getCodexLensHiddenPython(): string { + if (process.platform !== 'win32') { + return getCodexLensPython(); + } + + const venvDir = getCodexLensVenvDir(); + const pythonwPath = join(venvDir, 'Scripts', 'pythonw.exe'); + if (existsSync(pythonwPath)) { + return pythonwPath; + } + + return getCodexLensPython(); +} + /** * Get the pip executable path in the CodexLens venv. * diff --git a/ccw/src/utils/python-utils.ts b/ccw/src/utils/python-utils.ts index b5ac4975..9bf989d7 100644 --- a/ccw/src/utils/python-utils.ts +++ b/ccw/src/utils/python-utils.ts @@ -3,9 +3,19 @@ * Shared module for consistent Python discovery across the application */ -import { execSync } from 'child_process'; +import { spawnSync, type SpawnSyncOptionsWithStringEncoding } from 'child_process'; import { EXEC_TIMEOUTS } from './exec-constants.js'; +export interface PythonCommandSpec { + command: string; + args: string[]; + display: string; +} + +type HiddenPythonProbeOptions = Omit & { + encoding?: BufferEncoding; +}; + function isExecTimeoutError(error: unknown): boolean { const err = error as { code?: unknown; errno?: unknown; message?: unknown } | null; const code = err?.code ?? 
err?.errno; @@ -14,6 +24,98 @@ function isExecTimeoutError(error: unknown): boolean { return message.includes('ETIMEDOUT'); } +function quoteCommandPart(value: string): string { + if (!/[\s"]/.test(value)) { + return value; + } + return `"${value.replaceAll('"', '\\"')}"`; +} + +function formatPythonCommandDisplay(command: string, args: string[]): string { + return [quoteCommandPart(command), ...args.map(quoteCommandPart)].join(' '); +} + +function buildPythonCommandSpec(command: string, args: string[] = []): PythonCommandSpec { + return { + command, + args: [...args], + display: formatPythonCommandDisplay(command, args), + }; +} + +function tokenizeCommandSpec(raw: string): string[] { + const tokens: string[] = []; + const tokenPattern = /"((?:\\"|[^"])*)"|(\S+)/g; + + for (const match of raw.matchAll(tokenPattern)) { + const quoted = match[1]; + const plain = match[2]; + if (quoted !== undefined) { + tokens.push(quoted.replaceAll('\\"', '"')); + } else if (plain !== undefined) { + tokens.push(plain); + } + } + + return tokens; +} + +export function parsePythonCommandSpec(raw: string): PythonCommandSpec { + const trimmed = raw.trim(); + if (!trimmed) { + throw new Error('Python command cannot be empty'); + } + + // Unquoted executable paths on Windows commonly contain spaces. + if (!trimmed.includes('"') && /[\\/]/.test(trimmed)) { + return buildPythonCommandSpec(trimmed); + } + + const tokens = tokenizeCommandSpec(trimmed); + if (tokens.length === 0) { + return buildPythonCommandSpec(trimmed); + } + + return buildPythonCommandSpec(tokens[0], tokens.slice(1)); +} + +function buildPythonProbeOptions( + overrides: HiddenPythonProbeOptions = {}, +): SpawnSyncOptionsWithStringEncoding { + const { env, encoding, ...rest } = overrides; + return { + shell: false, + windowsHide: true, + timeout: EXEC_TIMEOUTS.PYTHON_VERSION, + stdio: ['ignore', 'pipe', 'pipe'], + env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env }, + ...rest, + encoding: encoding ?? 
'utf8', + }; +} + +export function probePythonCommandVersion( + pythonCommand: PythonCommandSpec, + runner: typeof spawnSync = spawnSync, +): string { + const result = runner( + pythonCommand.command, + [...pythonCommand.args, '--version'], + buildPythonProbeOptions(), + ); + + if (result.error) { + throw result.error; + } + + const versionOutput = `${result.stdout ?? ''}${result.stderr ?? ''}`.trim(); + if (result.status !== 0) { + throw new Error(versionOutput || `Python version probe exited with code ${String(result.status)}`); + } + + return versionOutput; +} + /** * Parse Python version string to major.minor numbers * @param versionStr - Version string like "Python 3.11.5" @@ -42,66 +144,72 @@ export function isPythonVersionCompatible(major: number, minor: number): boolean * Detect available Python 3 executable * Supports CCW_PYTHON environment variable for custom Python path * On Windows, uses py launcher to find compatible versions - * @returns Python executable command + * @returns Python executable command spec */ -export function getSystemPython(): string { - // Check for user-specified Python via environment variable - const customPython = process.env.CCW_PYTHON; +export function getSystemPythonCommand(runner: typeof spawnSync = spawnSync): PythonCommandSpec { + const customPython = process.env.CCW_PYTHON?.trim(); if (customPython) { + const customSpec = parsePythonCommandSpec(customPython); try { - const version = execSync(`"${customPython}" --version 2>&1`, { encoding: 'utf8', timeout: EXEC_TIMEOUTS.PYTHON_VERSION }); + const version = probePythonCommandVersion(customSpec, runner); if (version.includes('Python 3')) { const parsed = parsePythonVersion(version); if (parsed && !isPythonVersionCompatible(parsed.major, parsed.minor)) { - console.warn(`[Python] Warning: CCW_PYTHON points to Python ${parsed.major}.${parsed.minor}, which may not be compatible with onnxruntime (requires 3.9-3.12)`); + console.warn( + `[Python] Warning: CCW_PYTHON points to 
Python ${parsed.major}.${parsed.minor}, which may not be compatible with onnxruntime (requires 3.9-3.12)`, + ); } - return `"${customPython}"`; + return customSpec; } } catch (err: unknown) { if (isExecTimeoutError(err)) { - console.warn(`[Python] Warning: CCW_PYTHON version check timed out after ${EXEC_TIMEOUTS.PYTHON_VERSION}ms, falling back to system Python`); + console.warn( + `[Python] Warning: CCW_PYTHON version check timed out after ${EXEC_TIMEOUTS.PYTHON_VERSION}ms, falling back to system Python`, + ); } else { - console.warn(`[Python] Warning: CCW_PYTHON="${customPython}" is not a valid Python executable, falling back to system Python`); + console.warn( + `[Python] Warning: CCW_PYTHON="${customPython}" is not a valid Python executable, falling back to system Python`, + ); } } } - // On Windows, try py launcher with specific versions first (3.12, 3.11, 3.10, 3.9) if (process.platform === 'win32') { const compatibleVersions = ['3.12', '3.11', '3.10', '3.9']; for (const ver of compatibleVersions) { + const launcherSpec = buildPythonCommandSpec('py', [`-${ver}`]); try { - const version = execSync(`py -${ver} --version 2>&1`, { encoding: 'utf8', timeout: EXEC_TIMEOUTS.PYTHON_VERSION }); + const version = probePythonCommandVersion(launcherSpec, runner); if (version.includes(`Python ${ver}`)) { console.log(`[Python] Found compatible Python ${ver} via py launcher`); - return `py -${ver}`; + return launcherSpec; } } catch (err: unknown) { if (isExecTimeoutError(err)) { - console.warn(`[Python] Warning: py -${ver} version check timed out after ${EXEC_TIMEOUTS.PYTHON_VERSION}ms`); + console.warn( + `[Python] Warning: py -${ver} version check timed out after ${EXEC_TIMEOUTS.PYTHON_VERSION}ms`, + ); } - // Version not installed, try next } } } const commands = process.platform === 'win32' ? 
['python', 'py', 'python3'] : ['python3', 'python']; - let fallbackCmd: string | null = null; + let fallbackCmd: PythonCommandSpec | null = null; let fallbackVersion: { major: number; minor: number } | null = null; for (const cmd of commands) { + const pythonSpec = buildPythonCommandSpec(cmd); try { - const version = execSync(`${cmd} --version 2>&1`, { encoding: 'utf8', timeout: EXEC_TIMEOUTS.PYTHON_VERSION }); + const version = probePythonCommandVersion(pythonSpec, runner); if (version.includes('Python 3')) { const parsed = parsePythonVersion(version); if (parsed) { - // Prefer compatible version (3.9-3.12) if (isPythonVersionCompatible(parsed.major, parsed.minor)) { - return cmd; + return pythonSpec; } - // Keep track of first Python 3 found as fallback if (!fallbackCmd) { - fallbackCmd = cmd; + fallbackCmd = pythonSpec; fallbackVersion = parsed; } } @@ -110,13 +218,14 @@ export function getSystemPython(): string { if (isExecTimeoutError(err)) { console.warn(`[Python] Warning: ${cmd} --version timed out after ${EXEC_TIMEOUTS.PYTHON_VERSION}ms`); } - // Try next command } } - // If no compatible version found, use fallback with warning if (fallbackCmd && fallbackVersion) { - console.warn(`[Python] Warning: Only Python ${fallbackVersion.major}.${fallbackVersion.minor} found, which may not be compatible with onnxruntime (requires 3.9-3.12).`); + console.warn( + `[Python] Warning: Only Python ${fallbackVersion.major}.${fallbackVersion.minor} found, which may not be compatible with onnxruntime (requires 3.9-3.12).`, + ); + console.warn('[Python] Semantic search may fail with ImportError for onnxruntime.'); console.warn('[Python] To use a specific Python version, set CCW_PYTHON environment variable:'); console.warn(' Windows: set CCW_PYTHON=C:\\path\\to\\python.exe'); console.warn(' Unix: export CCW_PYTHON=/path/to/python3.11'); @@ -124,7 +233,19 @@ export function getSystemPython(): string { return fallbackCmd; } - throw new Error('Python 3 not found. 
Please install Python 3.9-3.12 and ensure it is in PATH, or set CCW_PYTHON environment variable.'); + throw new Error( + 'Python 3 not found. Please install Python 3.9-3.12 and ensure it is in PATH, or set CCW_PYTHON environment variable.', + ); +} + +/** + * Detect available Python 3 executable + * Supports CCW_PYTHON environment variable for custom Python path + * On Windows, uses py launcher to find compatible versions + * @returns Python executable command + */ +export function getSystemPython(): string { + return getSystemPythonCommand().display; } /** @@ -135,6 +256,14 @@ export function getPipCommand(): { pythonCmd: string; pipArgs: string[] } { const pythonCmd = getSystemPython(); return { pythonCmd, - pipArgs: ['-m', 'pip'] + pipArgs: ['-m', 'pip'], }; } + +export const __testables = { + buildPythonCommandSpec, + buildPythonProbeOptions, + formatPythonCommandDisplay, + parsePythonCommandSpec, + probePythonCommandVersion, +}; diff --git a/ccw/src/utils/uv-manager.ts b/ccw/src/utils/uv-manager.ts index 37ff5121..4d9cb9ee 100644 --- a/ccw/src/utils/uv-manager.ts +++ b/ccw/src/utils/uv-manager.ts @@ -9,7 +9,7 @@ * - Support for local project installs with extras */ -import { execSync, spawn } from 'child_process'; +import { spawn, spawnSync, type SpawnOptions, type SpawnSyncOptionsWithStringEncoding } from 'child_process'; import { existsSync, mkdirSync } from 'fs'; import { join, dirname } from 'path'; import { homedir, platform, arch } from 'os'; @@ -52,6 +52,74 @@ const UV_BINARY_NAME = IS_WINDOWS ? 'uv.exe' : 'uv'; const VENV_BIN_DIR = IS_WINDOWS ? 'Scripts' : 'bin'; const PYTHON_EXECUTABLE = IS_WINDOWS ? 
'python.exe' : 'python'; +type HiddenUvSpawnSyncOptions = Omit & { + encoding?: BufferEncoding; +}; + +function buildUvSpawnOptions(overrides: SpawnOptions = {}): SpawnOptions { + const { env, ...rest } = overrides; + return { + shell: false, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env }, + ...rest, + }; +} + +function buildUvSpawnSyncOptions( + overrides: HiddenUvSpawnSyncOptions = {}, +): SpawnSyncOptionsWithStringEncoding { + const { env, encoding, ...rest } = overrides; + return { + shell: false, + windowsHide: true, + env: { ...process.env, PYTHONIOENCODING: 'utf-8', ...env }, + ...rest, + encoding: encoding ?? 'utf-8', + }; +} + +function findExecutableOnPath(executable: string, runner: typeof spawnSync = spawnSync): string | null { + const lookupCommand = IS_WINDOWS ? 'where' : 'which'; + const result = runner( + lookupCommand, + [executable], + buildUvSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.SYSTEM_INFO, + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); + + if (result.error || result.status !== 0) { + return null; + } + + const output = `${result.stdout ?? ''}`.trim(); + if (!output) { + return null; + } + + return output.split(/\r?\n/)[0] || null; +} + +function hasWindowsPythonLauncherVersion(version: string, runner: typeof spawnSync = spawnSync): boolean { + const result = runner( + 'py', + [`-${version}`, '--version'], + buildUvSpawnSyncOptions({ + timeout: EXEC_TIMEOUTS.PYTHON_VERSION, + stdio: ['ignore', 'pipe', 'pipe'], + }), + ); + + if (result.error || result.status !== 0) { + return false; + } + + const output = `${result.stdout ?? ''}${result.stderr ?? ''}`; + return output.includes(`Python ${version}`); +} + /** * Get the path to the UV binary * Search order: @@ -105,15 +173,9 @@ export function getUvBinaryPath(): string { } // 4. Try system PATH using 'which' or 'where' - try { - const cmd = IS_WINDOWS ? 
'where uv' : 'which uv'; - const result = execSync(cmd, { encoding: 'utf-8', timeout: EXEC_TIMEOUTS.SYSTEM_INFO, stdio: ['pipe', 'pipe', 'pipe'] }); - const foundPath = result.trim().split('\n')[0]; - if (foundPath && existsSync(foundPath)) { - return foundPath; - } - } catch { - // UV not found in PATH + const foundPath = findExecutableOnPath('uv'); + if (foundPath && existsSync(foundPath)) { + return foundPath; } // Return default path (may not exist) @@ -135,10 +197,10 @@ export async function isUvAvailable(): Promise<boolean> { } return new Promise((resolve) => { - const child = spawn(uvPath, ['--version'], { + const child = spawn(uvPath, ['--version'], buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PYTHON_VERSION, - }); + })); child.on('close', (code) => { resolve(code === 0); @@ -162,14 +224,14 @@ } return new Promise((resolve) => { - const child = spawn(uvPath, ['--version'], { + const child = spawn(uvPath, ['--version'], buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PYTHON_VERSION, - }); + })); let stdout = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); @@ -207,19 +269,29 @@ export async function ensureUvInstalled(): Promise<boolean> { if (IS_WINDOWS) { // Windows: Use PowerShell to run the install script const installCmd = 'irm https://astral.sh/uv/install.ps1 | iex'; - child = spawn('powershell', ['-ExecutionPolicy', 'ByPass', '-Command', installCmd], { - stdio: 'inherit', + child = spawn('powershell', ['-ExecutionPolicy', 'ByPass', '-Command', installCmd], buildUvSpawnOptions({ + stdio: ['pipe', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + })); } else { // Unix: Use curl and sh const installCmd = 'curl -LsSf https://astral.sh/uv/install.sh | sh'; - child = spawn('sh', ['-c', installCmd], { - stdio: 'inherit', + child = spawn('sh', ['-c', installCmd], buildUvSpawnOptions({ +
stdio: ['pipe', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + })); } + child.stdout?.on('data', (data) => { + const line = data.toString().trim(); + if (line) console.log(`[UV] ${line}`); + }); + + child.stderr?.on('data', (data) => { + const line = data.toString().trim(); + if (line) console.log(`[UV] ${line}`); + }); + child.on('close', (code) => { if (code === 0) { console.log('[UV] UV installed successfully'); @@ -315,21 +387,21 @@ export class UvManager { console.log(`[UV] Python version: ${this.pythonVersion}`); } - const child = spawn(uvPath, args, { + const child = spawn(uvPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PROCESS_SPAWN, - }); + })); let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { const line = data.toString().trim(); if (line) { console.log(`[UV] ${line}`); } }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); const line = data.toString().trim(); if (line) { @@ -390,22 +462,22 @@ export class UvManager { console.log(`[UV] Installing from project: ${installSpec} (editable: ${editable})`); - const child = spawn(uvPath, args, { + const child = spawn(uvPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, cwd: projectPath, - }); + })); let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { const line = data.toString().trim(); if (line) { console.log(`[UV] ${line}`); } }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); const line = data.toString().trim(); if (line && !line.startsWith('Resolved') && !line.startsWith('Prepared') && !line.startsWith('Installed')) { @@ -460,21 +532,21 @@ export class UvManager { console.log(`[UV] Installing packages: ${packages.join(', ')}`); - const child = spawn(uvPath, args, { + const child = spawn(uvPath, 
args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + })); let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { const line = data.toString().trim(); if (line) { console.log(`[UV] ${line}`); } }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -524,21 +596,21 @@ export class UvManager { console.log(`[UV] Uninstalling packages: ${packages.join(', ')}`); - const child = spawn(uvPath, args, { + const child = spawn(uvPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + })); let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { const line = data.toString().trim(); if (line) { console.log(`[UV] ${line}`); } }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -585,21 +657,21 @@ export class UvManager { console.log(`[UV] Syncing dependencies from: ${requirementsPath}`); - const child = spawn(uvPath, args, { + const child = spawn(uvPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PACKAGE_INSTALL, - }); + })); let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { const line = data.toString().trim(); if (line) { console.log(`[UV] ${line}`); } }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -640,14 +712,14 @@ export class UvManager { return new Promise((resolve) => { const args = ['pip', 'list', '--format', 'json', '--python', this.getVenvPython()]; - const child = spawn(uvPath, args, { + const child = spawn(uvPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: EXEC_TIMEOUTS.PROCESS_SPAWN, - }); + })); let stdout = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', 
(data) => { stdout += data.toString(); }); @@ -704,20 +776,20 @@ export class UvManager { } return new Promise((resolve) => { - const child = spawn(pythonPath, args, { + const child = spawn(pythonPath, args, buildUvSpawnOptions({ stdio: ['ignore', 'pipe', 'pipe'], timeout: options.timeout ?? EXEC_TIMEOUTS.PROCESS_SPAWN, cwd: options.cwd, - }); + })); let stdout = ''; let stderr = ''; - child.stdout.on('data', (data) => { + child.stdout?.on('data', (data) => { stdout += data.toString(); }); - child.stderr.on('data', (data) => { + child.stderr?.on('data', (data) => { stderr += data.toString(); }); @@ -779,17 +851,8 @@ export function getPreferredCodexLensPythonSpec(): string { // depend on onnxruntime 1.15.x wheels, which are not consistently available for cp312. const preferredVersions = ['3.11', '3.10', '3.12']; for (const version of preferredVersions) { - try { - const output = execSync(`py -${version} --version`, { - encoding: 'utf-8', - timeout: EXEC_TIMEOUTS.PYTHON_VERSION, - stdio: ['pipe', 'pipe', 'pipe'], - }); - if (output.includes(`Python ${version}`)) { - return version; - } - } catch { - // Try next installed version + if (hasWindowsPythonLauncherVersion(version)) { + return version; } } @@ -830,3 +893,10 @@ export async function bootstrapUvVenv( const manager = new UvManager({ venvPath, pythonVersion }); return manager.createVenv(); } + +export const __testables = { + buildUvSpawnOptions, + buildUvSpawnSyncOptions, + findExecutableOnPath, + hasWindowsPythonLauncherVersion, +}; diff --git a/ccw/tests/cli-history-cross-project.test.js b/ccw/tests/cli-history-cross-project.test.js new file mode 100644 index 00000000..b86fbd30 --- /dev/null +++ b/ccw/tests/cli-history-cross-project.test.js @@ -0,0 +1,118 @@ +/** + * Cross-project regression coverage for `ccw cli history` and `ccw cli detail`. 
+ */ + +import { after, afterEach, before, describe, it, mock } from 'node:test'; +import assert from 'node:assert/strict'; +import { mkdtempSync, rmSync } from 'node:fs'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; + +const TEST_CCW_HOME = mkdtempSync(join(tmpdir(), 'ccw-cli-history-cross-home-')); +process.env.CCW_DATA_DIR = TEST_CCW_HOME; + +const cliCommandPath = new URL('../dist/commands/cli.js', import.meta.url).href; +const cliExecutorPath = new URL('../dist/tools/cli-executor.js', import.meta.url).href; +const historyStorePath = new URL('../dist/tools/cli-history-store.js', import.meta.url).href; + +function createConversation({ id, prompt, updatedAt }) { + return { + id, + created_at: updatedAt, + updated_at: updatedAt, + tool: 'gemini', + model: 'default', + mode: 'analysis', + category: 'user', + total_duration_ms: 456, + turn_count: 1, + latest_status: 'success', + turns: [ + { + turn: 1, + timestamp: updatedAt, + prompt, + duration_ms: 456, + status: 'success', + exit_code: 0, + output: { + stdout: 'CROSS PROJECT OK', + stderr: '', + truncated: false, + cached: false, + }, + }, + ], + }; +} + +describe('ccw cli history/detail cross-project', async () => { + let cliModule; + let cliExecutorModule; + let historyStoreModule; + + before(async () => { + cliModule = await import(cliCommandPath); + cliExecutorModule = await import(cliExecutorPath); + historyStoreModule = await import(historyStorePath); + }); + + afterEach(() => { + mock.restoreAll(); + try { + historyStoreModule?.closeAllStores?.(); + } catch { + // ignore + } + }); + + after(() => { + try { + historyStoreModule?.closeAllStores?.(); + } catch { + // ignore + } + rmSync(TEST_CCW_HOME, { recursive: true, force: true }); + }); + + it('finds history and detail for executions stored in another registered project', async () => { + const projectRoot = mkdtempSync(join(tmpdir(), 'ccw-cli-cross-project-history-')); + const unrelatedCwd = mkdtempSync(join(tmpdir(), 
'ccw-cli-cross-project-cwd-')); + const previousCwd = process.cwd(); + + try { + const store = new historyStoreModule.CliHistoryStore(projectRoot); + store.saveConversation(createConversation({ + id: 'CONV-CROSS-PROJECT-1', + prompt: 'Cross project prompt', + updatedAt: new Date('2025-02-01T00:00:01.000Z').toISOString(), + })); + store.close(); + + const logs = []; + mock.method(console, 'log', (...args) => { + logs.push(args.map(String).join(' ')); + }); + mock.method(console, 'error', (...args) => { + logs.push(args.map(String).join(' ')); + }); + + process.chdir(unrelatedCwd); + + await cliModule.cliCommand('history', [], { limit: '20' }); + assert.ok(logs.some((line) => line.includes('CONV-CROSS-PROJECT-1'))); + + await cliExecutorModule.getExecutionHistoryAsync(projectRoot, { limit: 1 }); + + logs.length = 0; + await cliModule.cliCommand('detail', ['CONV-CROSS-PROJECT-1'], {}); + assert.ok(logs.some((line) => line.includes('Conversation Detail'))); + assert.ok(logs.some((line) => line.includes('CONV-CROSS-PROJECT-1'))); + assert.ok(logs.some((line) => line.includes('Cross project prompt'))); + } finally { + process.chdir(previousCwd); + rmSync(projectRoot, { recursive: true, force: true }); + rmSync(unrelatedCwd, { recursive: true, force: true }); + } + }); +}); diff --git a/ccw/tests/cli-output-command-final.test.js b/ccw/tests/cli-output-command-final.test.js index 7bc6f9c8..b4c70456 100644 --- a/ccw/tests/cli-output-command-final.test.js +++ b/ccw/tests/cli-output-command-final.test.js @@ -123,6 +123,39 @@ describe('ccw cli output --final', async () => { } }); + it('loads cached output from another registered project without --project', async () => { + const projectRoot = createTestProjectRoot(); + const unrelatedCwd = createTestProjectRoot(); + const previousCwd = process.cwd(); + const store = new historyStoreModule.CliHistoryStore(projectRoot); + + try { + store.saveConversation(createConversation({ + id: 'EXEC-CROSS-PROJECT-OUTPUT', + stdoutFull: 'cross 
project raw output', + parsedOutput: 'cross project parsed output', + finalOutput: 'cross project final output', + })); + + process.chdir(unrelatedCwd); + + const logs = []; + mock.method(console, 'log', (...args) => { + logs.push(args.map(String).join(' ')); + }); + mock.method(console, 'error', () => {}); + + await cliModule.cliCommand('output', ['EXEC-CROSS-PROJECT-OUTPUT'], {}); + + assert.equal(logs.at(-1), 'cross project final output'); + } finally { + process.chdir(previousCwd); + store.close(); + rmSync(projectRoot, { recursive: true, force: true }); + rmSync(unrelatedCwd, { recursive: true, force: true }); + } + }); + it('fails fast for explicit --final when no final agent result can be recovered', async () => { const projectRoot = createTestProjectRoot(); const store = new historyStoreModule.CliHistoryStore(projectRoot); @@ -159,4 +192,34 @@ describe('ccw cli output --final', async () => { rmSync(projectRoot, { recursive: true, force: true }); } }); + + it('prints CCW execution ID guidance when output cannot find the requested execution', async () => { + const projectRoot = createTestProjectRoot(); + const previousCwd = process.cwd(); + + try { + process.chdir(projectRoot); + + const errors = []; + const exitCodes = []; + + mock.method(console, 'log', () => {}); + mock.method(console, 'error', (...args) => { + errors.push(args.map(String).join(' ')); + }); + mock.method(process, 'exit', (code) => { + exitCodes.push(code); + }); + + await cliModule.cliCommand('output', ['rebuttal-structure-analysis'], {}); + + assert.deepEqual(exitCodes, [1]); + assert.ok(errors.some((line) => line.includes('real CCW execution ID'))); + assert.ok(errors.some((line) => line.includes('CCW_EXEC_ID'))); + assert.ok(errors.some((line) => line.includes('ccw cli show or ccw cli history'))); + } finally { + process.chdir(previousCwd); + rmSync(projectRoot, { recursive: true, force: true }); + } + }); }); diff --git a/ccw/tests/cli-show-running-time.test.js 
b/ccw/tests/cli-show-running-time.test.js index fec65e4a..158b0bde 100644 --- a/ccw/tests/cli-show-running-time.test.js +++ b/ccw/tests/cli-show-running-time.test.js @@ -163,6 +163,42 @@ describe('ccw cli show running time formatting', async () => { assert.match(rendered, /1h\.\.\./); }); + it('lists executions from other registered projects in show output', async () => { + const projectRoot = mkdtempSync(join(tmpdir(), 'ccw-cli-show-cross-project-')); + const unrelatedCwd = mkdtempSync(join(tmpdir(), 'ccw-cli-show-cross-cwd-')); + const previousCwd = process.cwd(); + + try { + process.chdir(unrelatedCwd); + const store = new historyStoreModule.CliHistoryStore(projectRoot); + store.saveConversation(createConversationRecord({ + id: 'EXEC-CROSS-PROJECT-SHOW', + prompt: 'cross project show prompt', + updatedAt: new Date('2025-02-02T00:00:00.000Z').toISOString(), + durationMs: 1800, + })); + store.close(); + + stubActiveExecutionsResponse([]); + + const logs = []; + mock.method(console, 'log', (...args) => { + logs.push(args.map(String).join(' ')); + }); + mock.method(console, 'error', () => {}); + + await cliModule.cliCommand('show', [], {}); + + const rendered = logs.join('\n'); + assert.match(rendered, /EXEC-CROSS-PROJECT-SHOW/); + assert.match(rendered, /cross project show prompt/); + } finally { + process.chdir(previousCwd); + rmSync(projectRoot, { recursive: true, force: true }); + rmSync(unrelatedCwd, { recursive: true, force: true }); + } + }); + it('suppresses stale running rows when saved history is newer than the active start time', async () => { const projectRoot = mkdtempSync(join(tmpdir(), 'ccw-cli-show-stale-project-')); const previousCwd = process.cwd(); diff --git a/ccw/tests/codex-lens-cli-compat.test.js b/ccw/tests/codex-lens-cli-compat.test.js index 96a0cedc..51573a1e 100644 --- a/ccw/tests/codex-lens-cli-compat.test.js +++ b/ccw/tests/codex-lens-cli-compat.test.js @@ -13,6 +13,38 @@ after(() => { }); describe('CodexLens CLI compatibility retries', 
() => { + it('builds hidden Python spawn options for CLI invocations', async () => { + const moduleUrl = new URL(`../dist/tools/codex-lens.js?spawn-opts=${Date.now()}`, import.meta.url).href; + const { __testables } = await import(moduleUrl); + + const options = __testables.buildCodexLensSpawnOptions(tmpdir(), 12345); + + assert.equal(options.cwd, tmpdir()); + assert.equal(options.shell, false); + assert.equal(options.timeout, 12345); + assert.equal(options.windowsHide, true); + assert.equal(options.env.PYTHONIOENCODING, 'utf-8'); + }); + + it('probes Python version without a shell-backed console window', async () => { + const moduleUrl = new URL(`../dist/tools/codex-lens.js?python-probe=${Date.now()}`, import.meta.url).href; + const { __testables } = await import(moduleUrl); + const probeCalls = []; + + const version = __testables.probePythonVersion({ command: 'python', args: [], display: 'python' }, (command, args, options) => { + probeCalls.push({ command, args, options }); + return { status: 0, stdout: '', stderr: 'Python 3.11.9\n' }; + }); + + assert.equal(version, 'Python 3.11.9'); + assert.equal(probeCalls.length, 1); + assert.equal(probeCalls[0].command, 'python'); + assert.deepEqual(probeCalls[0].args, ['--version']); + assert.equal(probeCalls[0].options.shell, false); + assert.equal(probeCalls[0].options.windowsHide, true); + assert.equal(probeCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + it('initializes a tiny index even when CLI emits compatibility conflicts first', async () => { const moduleUrl = new URL(`../dist/tools/codex-lens.js?compat=${Date.now()}`, import.meta.url).href; const { checkVenvStatus, executeCodexLens } = await import(moduleUrl); @@ -32,4 +64,76 @@ describe('CodexLens CLI compatibility retries', () => { assert.equal(result.success, true, result.error ?? 'Expected init to succeed'); assert.ok((result.output ?? '').length > 0 || (result.warning ?? 
'').length > 0, 'Expected init output or compatibility warning'); }); + + it('synthesizes a machine-readable fallback when JSON search output is empty', async () => { + const moduleUrl = new URL(`../dist/tools/codex-lens.js?compat-empty=${Date.now()}`, import.meta.url).href; + const { __testables } = await import(moduleUrl); + + const normalized = __testables.normalizeSearchCommandResult( + { success: true }, + { query: 'missing symbol', cwd: tmpdir(), limit: 5, filesOnly: false }, + ); + + assert.equal(normalized.success, true); + assert.match(normalized.warning ?? '', /empty stdout/i); + assert.deepEqual(normalized.results, { + success: true, + result: { + query: 'missing symbol', + count: 0, + results: [], + }, + }); + }); + + it('returns structured semantic search results for a local embedded workspace', async () => { + const codexLensUrl = new URL(`../dist/tools/codex-lens.js?compat-search=${Date.now()}`, import.meta.url).href; + const smartSearchUrl = new URL(`../dist/tools/smart-search.js?compat-search=${Date.now()}`, import.meta.url).href; + const codexLensModule = await import(codexLensUrl); + const smartSearchModule = await import(smartSearchUrl); + + const ready = await codexLensModule.checkVenvStatus(true); + if (!ready.ready) { + console.log('Skipping: CodexLens not ready'); + return; + } + + const semantic = await codexLensModule.checkSemanticStatus(); + if (!semantic.available) { + console.log('Skipping: semantic dependencies not ready'); + return; + } + + const projectDir = mkdtempSync(join(tmpdir(), 'codexlens-search-')); + tempDirs.push(projectDir); + writeFileSync( + join(projectDir, 'sample.ts'), + 'export function greet(name) { return `hello ${name}`; }\nexport const sum = (a, b) => a + b;\n', + ); + + const init = await smartSearchModule.handler({ action: 'init', path: projectDir }); + assert.equal(init.success, true, init.error ?? 
'Expected smart-search init to succeed'); + + const embed = await smartSearchModule.handler({ + action: 'embed', + path: projectDir, + embeddingBackend: 'local', + force: true, + }); + assert.equal(embed.success, true, embed.error ?? 'Expected smart-search embed to succeed'); + + const result = await codexLensModule.codexLensTool.execute({ + action: 'search', + path: projectDir, + query: 'greet function', + mode: 'semantic', + format: 'json', + }); + + assert.equal(result.success, true, result.error ?? 'Expected semantic search compatibility fallback to succeed'); + const payload = result.results?.result ?? result.results; + assert.ok(Array.isArray(payload?.results), 'Expected structured search results payload'); + assert.ok(payload.results.length > 0, 'Expected at least one structured semantic search result'); + assert.doesNotMatch(result.error ?? '', /unexpected extra arguments/i); + }); }); diff --git a/ccw/tests/codexlens-path.test.js b/ccw/tests/codexlens-path.test.js new file mode 100644 index 00000000..34f6163c --- /dev/null +++ b/ccw/tests/codexlens-path.test.js @@ -0,0 +1,66 @@ +import { after, afterEach, describe, it } from 'node:test'; +import assert from 'node:assert/strict'; +import { mkdtempSync, rmSync } from 'node:fs'; +import { createRequire, syncBuiltinESMExports } from 'node:module'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; + +const require = createRequire(import.meta.url); +// eslint-disable-next-line @typescript-eslint/no-var-requires +const fs = require('node:fs'); + +const originalExistsSync = fs.existsSync; +const originalCodexLensDataDir = process.env.CODEXLENS_DATA_DIR; +const tempDirs = []; + +afterEach(() => { + fs.existsSync = originalExistsSync; + syncBuiltinESMExports(); + + if (originalCodexLensDataDir === undefined) { + delete process.env.CODEXLENS_DATA_DIR; + } else { + process.env.CODEXLENS_DATA_DIR = originalCodexLensDataDir; + } +}); + +after(() => { + while (tempDirs.length > 0) { + 
rmSync(tempDirs.pop(), { recursive: true, force: true }); + } +}); + +describe('codexlens-path hidden python selection', () => { + it('prefers pythonw.exe for hidden Windows subprocesses when available', async () => { + if (process.platform !== 'win32') { + return; + } + + const dataDir = mkdtempSync(join(tmpdir(), 'ccw-codexlens-hidden-python-')); + tempDirs.push(dataDir); + process.env.CODEXLENS_DATA_DIR = dataDir; + + const expectedPythonw = join(dataDir, 'venv', 'Scripts', 'pythonw.exe'); + fs.existsSync = (path) => String(path) === expectedPythonw; + syncBuiltinESMExports(); + + const moduleUrl = new URL(`../dist/utils/codexlens-path.js?t=${Date.now()}`, import.meta.url); + const mod = await import(moduleUrl.href); + + assert.equal(mod.getCodexLensHiddenPython(), expectedPythonw); + }); + + it('falls back to python.exe when pythonw.exe is unavailable', async () => { + const dataDir = mkdtempSync(join(tmpdir(), 'ccw-codexlens-hidden-fallback-')); + tempDirs.push(dataDir); + process.env.CODEXLENS_DATA_DIR = dataDir; + + fs.existsSync = () => false; + syncBuiltinESMExports(); + + const moduleUrl = new URL(`../dist/utils/codexlens-path.js?t=${Date.now()}`, import.meta.url); + const mod = await import(moduleUrl.href); + + assert.equal(mod.getCodexLensHiddenPython(), mod.getCodexLensPython()); + }); +}); diff --git a/ccw/tests/embedding-batch.test.ts b/ccw/tests/embedding-batch.test.ts index 59e0768c..ded8a82e 100644 --- a/ccw/tests/embedding-batch.test.ts +++ b/ccw/tests/embedding-batch.test.ts @@ -105,7 +105,10 @@ describe('memory-embedder-bridge', () => { assert.equal(spawnCalls.length, 1); assert.equal(spawnCalls[0].args.at(-2), 'status'); assert.equal(spawnCalls[0].args.at(-1), 'C:\\tmp\\db.sqlite'); + assert.equal(spawnCalls[0].options.shell, false); assert.equal(spawnCalls[0].options.timeout, 30000); + assert.equal(spawnCalls[0].options.windowsHide, true); + assert.equal(spawnCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); }); it('generateEmbeddings builds 
args for sourceId, batchSize, and force', async () => { @@ -138,7 +141,10 @@ describe('memory-embedder-bridge', () => { assert.equal(args[batchSizeIndex + 1], '4'); assert.ok(args.includes('--force')); + assert.equal(spawnCalls[0].options.shell, false); assert.equal(spawnCalls[0].options.timeout, 300000); + assert.equal(spawnCalls[0].options.windowsHide, true); + assert.equal(spawnCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); spawnCalls.length = 0; spawnPlan.push({ diff --git a/ccw/tests/litellm-client.test.ts b/ccw/tests/litellm-client.test.ts index 54d757a1..5400ebd3 100644 --- a/ccw/tests/litellm-client.test.ts +++ b/ccw/tests/litellm-client.test.ts @@ -103,7 +103,7 @@ describe('LiteLLM client bridge', () => { assert.equal(available, true); assert.equal(spawnCalls.length, 1); - assert.equal(spawnCalls[0].command, 'python'); + assert.equal(spawnCalls[0].command, mod.getCodexLensVenvPython()); assert.deepEqual(spawnCalls[0].args, ['-m', 'ccw_litellm.cli', 'version']); }); @@ -117,6 +117,19 @@ describe('LiteLLM client bridge', () => { assert.equal(spawnCalls[0].command, 'python3'); }); + it('spawns LiteLLM Python with hidden window options', async () => { + spawnPlan.push({ type: 'close', code: 0, stdout: '1.2.3\n' }); + + const client = new mod.LiteLLMClient({ timeout: 10 }); + const available = await client.isAvailable(); + + assert.equal(available, true); + assert.equal(spawnCalls.length, 1); + assert.equal(spawnCalls[0].options.shell, false); + assert.equal(spawnCalls[0].options.windowsHide, true); + assert.equal(spawnCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + it('isAvailable returns false on spawn error', async () => { spawnPlan.push({ type: 'error', error: new Error('ENOENT') }); @@ -154,7 +167,7 @@ describe('LiteLLM client bridge', () => { assert.deepEqual(cfg, { ok: true }); assert.equal(spawnCalls.length, 1); - assert.deepEqual(spawnCalls[0].args, ['-m', 'ccw_litellm.cli', 'config', '--json']); + assert.deepEqual(spawnCalls[0].args, ['-m', 
'ccw_litellm.cli', 'config']); }); it('getConfig throws on malformed JSON', async () => { diff --git a/ccw/tests/smart-search-intent.test.js b/ccw/tests/smart-search-intent.test.js index 7ae159dc..0c51f6bf 100644 --- a/ccw/tests/smart-search-intent.test.js +++ b/ccw/tests/smart-search-intent.test.js @@ -76,6 +76,26 @@ describe('Smart Search - Query Intent + RRF Weights', async () => { }); }); + describe('classifyIntent lexical routing', () => { + it('routes config/backend queries to exact when index and embeddings are available', () => { + if (!smartSearchModule) return; + const classification = smartSearchModule.__testables.classifyIntent( + 'embedding backend fastembed local litellm api config', + true, + true, + ); + assert.strictEqual(classification.mode, 'exact'); + assert.match(classification.reasoning, /lexical priority/i); + }); + + it('routes generated artifact queries to exact when index and embeddings are available', () => { + if (!smartSearchModule) return; + const classification = smartSearchModule.__testables.classifyIntent('dist bundle output', true, true); + assert.strictEqual(classification.mode, 'exact'); + assert.match(classification.reasoning, /generated artifact/i); + }); + }); + describe('adjustWeightsByIntent', () => { it('maps keyword intent to exact-heavy weights', () => { if (!smartSearchModule) return; @@ -119,4 +139,3 @@ describe('Smart Search - Query Intent + RRF Weights', async () => { }); }); }); - diff --git a/ccw/tests/smart-search-mcp-usage.test.js b/ccw/tests/smart-search-mcp-usage.test.js index 2097d5e7..889af31d 100644 --- a/ccw/tests/smart-search-mcp-usage.test.js +++ b/ccw/tests/smart-search-mcp-usage.test.js @@ -1,16 +1,19 @@ -import { afterEach, before, describe, it } from 'node:test'; +import { after, afterEach, before, describe, it } from 'node:test'; import assert from 'node:assert/strict'; import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs'; import { tmpdir } from 'node:os'; import { join } from 
'node:path'; const smartSearchPath = new URL('../dist/tools/smart-search.js', import.meta.url).href; +const originalAutoInitMissing = process.env.CODEXLENS_AUTO_INIT_MISSING; +const originalAutoEmbedMissing = process.env.CODEXLENS_AUTO_EMBED_MISSING; describe('Smart Search MCP usage defaults and path handling', async () => { let smartSearchModule; const tempDirs = []; before(async () => { + process.env.CODEXLENS_AUTO_INIT_MISSING = 'false'; try { smartSearchModule = await import(smartSearchPath); } catch (err) { @@ -18,10 +21,30 @@ describe('Smart Search MCP usage defaults and path handling', async () => { } }); + after(() => { + if (originalAutoInitMissing === undefined) { + delete process.env.CODEXLENS_AUTO_INIT_MISSING; + } else { + process.env.CODEXLENS_AUTO_INIT_MISSING = originalAutoInitMissing; + } + + if (originalAutoEmbedMissing === undefined) { + delete process.env.CODEXLENS_AUTO_EMBED_MISSING; + return; + } + process.env.CODEXLENS_AUTO_EMBED_MISSING = originalAutoEmbedMissing; + }); + afterEach(() => { while (tempDirs.length > 0) { rmSync(tempDirs.pop(), { recursive: true, force: true }); } + if (smartSearchModule?.__testables) { + smartSearchModule.__testables.__resetRuntimeOverrides(); + smartSearchModule.__testables.__resetBackgroundJobs(); + } + process.env.CODEXLENS_AUTO_INIT_MISSING = 'false'; + delete process.env.CODEXLENS_AUTO_EMBED_MISSING; }); function createWorkspace() { @@ -30,6 +53,15 @@ describe('Smart Search MCP usage defaults and path handling', async () => { return dir; } + function createDetachedChild() { + return { + on() { + return this; + }, + unref() {}, + }; + } + it('keeps schema defaults aligned with runtime docs', () => { if (!smartSearchModule) return; @@ -50,14 +82,202 @@ describe('Smart Search MCP usage defaults and path handling', async () => { assert.equal(props.output_mode.default, 'ace'); }); - it('defaults auto embedding warmup to enabled unless explicitly disabled', () => { + it('defaults auto embedding warmup off on 
Windows unless explicitly enabled', () => { if (!smartSearchModule) return; const { __testables } = smartSearchModule; - assert.equal(__testables.isAutoEmbedMissingEnabled(undefined), true); - assert.equal(__testables.isAutoEmbedMissingEnabled({}), true); - assert.equal(__testables.isAutoEmbedMissingEnabled({ embedding_auto_embed_missing: true }), true); + delete process.env.CODEXLENS_AUTO_EMBED_MISSING; + assert.equal(__testables.isAutoEmbedMissingEnabled(undefined), process.platform !== 'win32'); + assert.equal(__testables.isAutoEmbedMissingEnabled({}), process.platform !== 'win32'); + assert.equal( + __testables.isAutoEmbedMissingEnabled({ embedding_auto_embed_missing: true }), + process.platform === 'win32' ? false : true, + ); assert.equal(__testables.isAutoEmbedMissingEnabled({ embedding_auto_embed_missing: false }), false); + process.env.CODEXLENS_AUTO_EMBED_MISSING = 'true'; + assert.equal(__testables.isAutoEmbedMissingEnabled({ embedding_auto_embed_missing: false }), true); + process.env.CODEXLENS_AUTO_EMBED_MISSING = 'off'; + assert.equal(__testables.isAutoEmbedMissingEnabled({ embedding_auto_embed_missing: true }), false); + }); + + it('defaults auto index warmup off on Windows unless explicitly enabled', () => { + if (!smartSearchModule) return; + + const { __testables } = smartSearchModule; + delete process.env.CODEXLENS_AUTO_INIT_MISSING; + assert.equal(__testables.isAutoInitMissingEnabled(), process.platform !== 'win32'); + process.env.CODEXLENS_AUTO_INIT_MISSING = 'off'; + assert.equal(__testables.isAutoInitMissingEnabled(), false); + process.env.CODEXLENS_AUTO_INIT_MISSING = '1'; + assert.equal(__testables.isAutoInitMissingEnabled(), true); + }); + + it('explains when Windows disables background warmup by default', () => { + if (!smartSearchModule) return; + + const { __testables } = smartSearchModule; + delete process.env.CODEXLENS_AUTO_INIT_MISSING; + delete process.env.CODEXLENS_AUTO_EMBED_MISSING; + + const initReason = 
__testables.getAutoInitMissingDisabledReason(); + const embedReason = __testables.getAutoEmbedMissingDisabledReason({}); + + if (process.platform === 'win32') { + assert.match(initReason, /disabled by default on Windows/i); + assert.match(embedReason, /disabled by default on Windows/i); + assert.match(embedReason, /auto_embed_missing=true/i); + } else { + assert.match(initReason, /disabled/i); + assert.match(embedReason, /disabled/i); + } + }); + + it('builds hidden subprocess options for Smart Search child processes', () => { + if (!smartSearchModule) return; + + const options = smartSearchModule.__testables.buildSmartSearchSpawnOptions(tmpdir(), { + detached: true, + stdio: 'ignore', + timeout: 12345, + }); + + assert.equal(options.cwd, tmpdir()); + assert.equal(options.shell, false); + assert.equal(options.windowsHide, true); + assert.equal(options.detached, true); + assert.equal(options.timeout, 12345); + assert.equal(options.env.PYTHONIOENCODING, 'utf-8'); + }); + + it('avoids detached background warmup children on Windows consoles', () => { + if (!smartSearchModule) return; + + assert.equal( + smartSearchModule.__testables.shouldDetachBackgroundSmartSearchProcess(), + process.platform !== 'win32', + ); + }); + + it('checks tool availability without shell-based lookup popups', () => { + if (!smartSearchModule) return; + + const lookupCalls = []; + const available = smartSearchModule.__testables.checkToolAvailability( + 'rg', + (command, args, options) => { + lookupCalls.push({ command, args, options }); + return { status: 0, stdout: '', stderr: '' }; + }, + ); + + assert.equal(available, true); + assert.equal(lookupCalls.length, 1); + assert.equal(lookupCalls[0].command, process.platform === 'win32' ? 
'where' : 'which'); + assert.deepEqual(lookupCalls[0].args, ['rg']); + assert.equal(lookupCalls[0].options.shell, false); + assert.equal(lookupCalls[0].options.windowsHide, true); + assert.equal(lookupCalls[0].options.stdio, 'ignore'); + assert.equal(lookupCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + + it('starts background static index build once for unindexed paths', async () => { + if (!smartSearchModule) return; + + const { __testables } = smartSearchModule; + const dir = createWorkspace(); + const fakePython = join(dir, 'python.exe'); + writeFileSync(fakePython, ''); + process.env.CODEXLENS_AUTO_INIT_MISSING = 'true'; + + const spawnCalls = []; + __testables.__setRuntimeOverrides({ + getVenvPythonPath: () => fakePython, + now: () => 1234567890, + spawnProcess: (command, args, options) => { + spawnCalls.push({ command, args, options }); + return createDetachedChild(); + }, + }); + + const scope = { workingDirectory: dir, searchPaths: ['.'] }; + const indexStatus = { indexed: false, has_embeddings: false }; + + const first = await __testables.maybeStartBackgroundAutoInit(scope, indexStatus); + const second = await __testables.maybeStartBackgroundAutoInit(scope, indexStatus); + + assert.match(first.note, /started/i); + assert.match(second.note, /already running/i); + assert.equal(spawnCalls.length, 1); + assert.equal(spawnCalls[0].command, fakePython); + assert.deepEqual(spawnCalls[0].args, ['-m', 'codexlens', 'index', 'init', dir, '--no-embeddings']); + assert.equal(spawnCalls[0].options.cwd, dir); + assert.equal( + spawnCalls[0].options.detached, + smartSearchModule.__testables.shouldDetachBackgroundSmartSearchProcess(), + ); + assert.equal(spawnCalls[0].options.windowsHide, true); + }); + + it('starts background embedding build without detached Windows consoles', async () => { + if (!smartSearchModule) return; + + const { __testables } = smartSearchModule; + const dir = createWorkspace(); + const fakePython = join(dir, 'python.exe'); + 
writeFileSync(fakePython, ''); + process.env.CODEXLENS_AUTO_EMBED_MISSING = 'true'; + + const spawnCalls = []; + __testables.__setRuntimeOverrides({ + getVenvPythonPath: () => fakePython, + checkSemanticStatus: async () => ({ available: true, litellmAvailable: true }), + now: () => 1234567890, + spawnProcess: (command, args, options) => { + spawnCalls.push({ command, args, options }); + return createDetachedChild(); + }, + }); + + const status = await __testables.maybeStartBackgroundAutoEmbed( + { workingDirectory: dir, searchPaths: ['.'] }, + { + indexed: true, + has_embeddings: false, + config: { embedding_backend: 'fastembed' }, + }, + ); + + assert.match(status.note, /started/i); + assert.equal(spawnCalls.length, 1); + assert.equal(spawnCalls[0].command, fakePython); + assert.deepEqual(spawnCalls[0].args.slice(0, 1), ['-c']); + assert.equal(spawnCalls[0].options.cwd, dir); + assert.equal( + spawnCalls[0].options.detached, + smartSearchModule.__testables.shouldDetachBackgroundSmartSearchProcess(), + ); + assert.equal(spawnCalls[0].options.windowsHide, true); + assert.equal(spawnCalls[0].options.stdio, 'ignore'); + }); + + it('surfaces warnings when background static index warmup cannot start', async () => { + if (!smartSearchModule) return; + + const { __testables } = smartSearchModule; + const dir = createWorkspace(); + process.env.CODEXLENS_AUTO_INIT_MISSING = 'true'; + + __testables.__setRuntimeOverrides({ + getVenvPythonPath: () => join(dir, 'missing-python.exe'), + }); + + const status = await __testables.maybeStartBackgroundAutoInit( + { workingDirectory: dir, searchPaths: ['.'] }, + { indexed: false, has_embeddings: false }, + ); + + assert.match(status.warning, /Automatic static index warmup could not start/i); + assert.match(status.warning, /not ready yet/i); }); it('honors explicit small limit values', async () => { @@ -246,15 +466,98 @@ describe('Smart Search MCP usage defaults and path handling', async () => { 
assert.match(String(matches[0].file).replace(/\\/g, '/'), /target\.ts$/); }); - it('detects centralized vector artifacts as full embedding coverage evidence', () => { + it('uses root-scoped embedding status instead of subtree artifacts', () => { if (!smartSearchModule) return; - const dir = createWorkspace(); - writeFileSync(join(dir, '_vectors.hnsw'), 'hnsw'); - writeFileSync(join(dir, '_vectors_meta.db'), 'meta'); - writeFileSync(join(dir, '_binary_vectors.mmap'), 'mmap'); + const summary = smartSearchModule.__testables.extractEmbeddingsStatusSummary({ + total_indexes: 3, + indexes_with_embeddings: 2, + total_chunks: 24, + coverage_percent: 66.7, + root: { + total_files: 4, + files_with_embeddings: 0, + total_chunks: 0, + coverage_percent: 0, + has_embeddings: false, + }, + subtree: { + total_indexes: 3, + indexes_with_embeddings: 2, + total_files: 12, + files_with_embeddings: 8, + total_chunks: 24, + coverage_percent: 66.7, + }, + centralized: { + dense_index_exists: true, + binary_index_exists: true, + meta_db_exists: true, + usable: false, + }, + }); - assert.equal(smartSearchModule.__testables.hasCentralizedVectorArtifacts(dir), true); + assert.equal(summary.coveragePercent, 0); + assert.equal(summary.totalChunks, 0); + assert.equal(summary.hasEmbeddings, false); + }); + + it('accepts validated root centralized readiness from CLI status payloads', () => { + if (!smartSearchModule) return; + + const summary = smartSearchModule.__testables.extractEmbeddingsStatusSummary({ + total_indexes: 2, + indexes_with_embeddings: 1, + total_chunks: 10, + coverage_percent: 25, + root: { + total_files: 2, + files_with_embeddings: 1, + total_chunks: 3, + coverage_percent: 50, + has_embeddings: true, + }, + centralized: { + usable: true, + dense_ready: true, + chunk_metadata_rows: 3, + }, + }); + + assert.equal(summary.coveragePercent, 50); + assert.equal(summary.totalChunks, 3); + assert.equal(summary.hasEmbeddings, true); + }); + + it('prefers embeddings_status over legacy 
embeddings summary payloads', () => { + if (!smartSearchModule) return; + + const payload = smartSearchModule.__testables.selectEmbeddingsStatusPayload({ + embeddings: { + total_indexes: 7, + indexes_with_embeddings: 4, + total_chunks: 99, + }, + embeddings_status: { + total_indexes: 7, + total_chunks: 3, + root: { + total_files: 2, + files_with_embeddings: 1, + total_chunks: 3, + coverage_percent: 50, + has_embeddings: true, + }, + centralized: { + usable: true, + dense_ready: true, + chunk_metadata_rows: 3, + }, + }, + }); + + assert.equal(payload.root.total_chunks, 3); + assert.equal(payload.centralized.usable, true); }); it('recognizes CodexLens CLI compatibility failures and invalid regex fallback', () => { @@ -281,6 +584,37 @@ describe('Smart Search MCP usage defaults and path handling', async () => { assert.match(resolution.warning, /literal ripgrep matching/i); }); + it('suppresses compatibility-only fuzzy warnings when ripgrep already produced hits', () => { + if (!smartSearchModule) return; + + assert.equal( + smartSearchModule.__testables.shouldSurfaceCodexLensFtsCompatibilityWarning({ + compatibilityTriggeredThisQuery: true, + skipExactDueToCompatibility: false, + ripgrepResultCount: 2, + }), + false, + ); + + assert.equal( + smartSearchModule.__testables.shouldSurfaceCodexLensFtsCompatibilityWarning({ + compatibilityTriggeredThisQuery: true, + skipExactDueToCompatibility: false, + ripgrepResultCount: 0, + }), + true, + ); + + assert.equal( + smartSearchModule.__testables.shouldSurfaceCodexLensFtsCompatibilityWarning({ + compatibilityTriggeredThisQuery: false, + skipExactDueToCompatibility: true, + ripgrepResultCount: 0, + }), + true, + ); + }); + it('builds actionable index suggestions for unhealthy index states', () => { if (!smartSearchModule) return; @@ -318,4 +652,52 @@ describe('Smart Search MCP usage defaults and path handling', async () => { assert.match(toolResult.error, /Both search backends failed:/); assert.match(toolResult.error, 
/(FTS|Ripgrep)/); }); + + it('returns structured semantic results after local init and embed without JSON parse warnings', async () => { + if (!smartSearchModule) return; + + const codexLensModule = await import(new URL(`../dist/tools/codex-lens.js?smart-semantic=${Date.now()}`, import.meta.url).href); + const ready = await codexLensModule.checkVenvStatus(true); + if (!ready.ready) { + console.log('Skipping: CodexLens not ready'); + return; + } + + const semantic = await codexLensModule.checkSemanticStatus(); + if (!semantic.available) { + console.log('Skipping: semantic dependencies not ready'); + return; + } + + const dir = createWorkspace(); + writeFileSync( + join(dir, 'sample.ts'), + 'export function parseCodexLensOutput() { return stripAnsiOutput(); }\nexport const sum = (a, b) => a + b;\n', + ); + + const init = await smartSearchModule.handler({ action: 'init', path: dir }); + assert.equal(init.success, true, init.error ?? 'Expected init to succeed'); + + const embed = await smartSearchModule.handler({ + action: 'embed', + path: dir, + embeddingBackend: 'local', + force: true, + }); + assert.equal(embed.success, true, embed.error ?? 'Expected local embed to succeed'); + + const search = await smartSearchModule.handler({ + action: 'search', + mode: 'semantic', + path: dir, + query: 'parse CodexLens output strip ANSI', + limit: 5, + }); + + assert.equal(search.success, true, search.error ?? 'Expected semantic search to succeed'); + assert.equal(search.result.success, true); + assert.equal(search.result.results.format, 'ace'); + assert.ok(search.result.results.total >= 1, 'Expected at least one structured semantic match'); + assert.doesNotMatch(search.result.metadata?.warning ?? 
'', /Failed to parse JSON output/i); + }); }); diff --git a/ccw/tests/unified-vector-index.test.ts b/ccw/tests/unified-vector-index.test.ts new file mode 100644 index 00000000..f14acc02 --- /dev/null +++ b/ccw/tests/unified-vector-index.test.ts @@ -0,0 +1,97 @@ +import { after, beforeEach, describe, it } from 'node:test'; +import assert from 'node:assert/strict'; +import { EventEmitter } from 'node:events'; +import { createRequire } from 'node:module'; +import { mkdtempSync, rmSync } from 'node:fs'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; + +const require = createRequire(import.meta.url); +// eslint-disable-next-line @typescript-eslint/no-var-requires +const fs = require('node:fs') as typeof import('node:fs'); +// eslint-disable-next-line @typescript-eslint/no-var-requires +const childProcess = require('node:child_process') as typeof import('node:child_process'); + +class FakeChildProcess extends EventEmitter { + stdout = new EventEmitter(); + stderr = new EventEmitter(); + stdinChunks: string[] = []; + stdin = { + write: (chunk: string | Buffer) => { + this.stdinChunks.push(String(chunk)); + return true; + }, + end: () => undefined, + }; +} + +type SpawnCall = { + command: string; + args: string[]; + // eslint-disable-next-line @typescript-eslint/no-explicit-any + options: any; + child: FakeChildProcess; +}; + +const spawnCalls: SpawnCall[] = []; +const tempDirs: string[] = []; +let embedderAvailable = true; + +const originalExistsSync = fs.existsSync; +const originalSpawn = childProcess.spawn; + +fs.existsSync = ((..._args: unknown[]) => embedderAvailable) as typeof fs.existsSync; + +childProcess.spawn = ((command: string, args: string[] = [], options: unknown = {}) => { + const child = new FakeChildProcess(); + spawnCalls.push({ command: String(command), args: args.map(String), options, child }); + + queueMicrotask(() => { + child.stdout.emit('data', JSON.stringify({ + success: true, + total_chunks: 4, + hnsw_available: true, + 
hnsw_count: 4, + dimension: 384, + })); + child.emit('close', 0); + }); + + return child as unknown as ReturnType<typeof childProcess.spawn>; +}) as typeof childProcess.spawn; + +after(() => { + fs.existsSync = originalExistsSync; + childProcess.spawn = originalSpawn; + while (tempDirs.length > 0) { + rmSync(tempDirs.pop() as string, { recursive: true, force: true }); + } +}); + +describe('unified-vector-index', () => { + beforeEach(() => { + embedderAvailable = true; + spawnCalls.length = 0; + }); + + it('spawns CodexLens venv python with hidden window options', async () => { + const projectDir = mkdtempSync(join(tmpdir(), 'ccw-unified-vector-index-')); + tempDirs.push(projectDir); + + const moduleUrl = new URL('../dist/core/unified-vector-index.js', import.meta.url); + moduleUrl.searchParams.set('t', String(Date.now())); + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const mod: any = await import(moduleUrl.href); + + const index = new mod.UnifiedVectorIndex(projectDir); + const status = await index.getStatus(); + + assert.equal(status.success, true); + assert.equal(spawnCalls.length, 1); + assert.equal(spawnCalls[0].options.shell, false); + assert.equal(spawnCalls[0].options.windowsHide, true); + assert.equal(spawnCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + assert.deepEqual(spawnCalls[0].options.stdio, ['pipe', 'pipe', 'pipe']); + assert.match(spawnCalls[0].child.stdinChunks.join(''), /"operation":"status"/); + }); +}); diff --git a/ccw/tests/uv-manager-codexlens-python.test.js b/ccw/tests/uv-manager-codexlens-python.test.js index 648ca823..ad0eae30 100644 --- a/ccw/tests/uv-manager-codexlens-python.test.js +++ b/ccw/tests/uv-manager-codexlens-python.test.js @@ -3,13 +3,16 @@ import assert from 'node:assert/strict'; import { execSync } from 'node:child_process'; const uvManagerPath = new URL('../dist/utils/uv-manager.js', import.meta.url).href; +const pythonUtilsPath = new URL('../dist/utils/python-utils.js', import.meta.url).href; describe('CodexLens UV python
preference', async () => { let mod; + let pythonUtils; const originalPython = process.env.CCW_PYTHON; before(async () => { mod = await import(uvManagerPath); + pythonUtils = await import(pythonUtilsPath); }); afterEach(() => { @@ -25,6 +28,73 @@ describe('CodexLens UV python preference', async () => { assert.equal(mod.getPreferredCodexLensPythonSpec(), 'C:/Custom/Python/python.exe'); }); + it('parses py launcher commands into spawn-safe command specs', () => { + const spec = pythonUtils.parsePythonCommandSpec('py -3.11'); + + assert.equal(spec.command, 'py'); + assert.deepEqual(spec.args, ['-3.11']); + assert.equal(spec.display, 'py -3.11'); + }); + + it('treats unquoted Windows-style executable paths as a single command', () => { + const spec = pythonUtils.parsePythonCommandSpec('C:/Program Files/Python311/python.exe'); + + assert.equal(spec.command, 'C:/Program Files/Python311/python.exe'); + assert.deepEqual(spec.args, []); + assert.equal(spec.display, '"C:/Program Files/Python311/python.exe"'); + }); + + it('probes Python launcher versions without opening a shell window', () => { + const probeCalls = []; + const version = pythonUtils.probePythonCommandVersion( + { command: 'py', args: ['-3.11'], display: 'py -3.11' }, + (command, args, options) => { + probeCalls.push({ command, args, options }); + return { status: 0, stdout: '', stderr: 'Python 3.11.9\n' }; + }, + ); + + assert.equal(version, 'Python 3.11.9'); + assert.equal(probeCalls.length, 1); + assert.equal(probeCalls[0].command, 'py'); + assert.deepEqual(probeCalls[0].args, ['-3.11', '--version']); + assert.equal(probeCalls[0].options.shell, false); + assert.equal(probeCalls[0].options.windowsHide, true); + assert.equal(probeCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + + it('looks up uv on PATH without spawning a visible shell window', () => { + const lookupCalls = []; + const found = mod.__testables.findExecutableOnPath('uv', (command, args, options) => { + lookupCalls.push({ command, args, 
options }); + return { status: 0, stdout: 'C:/Tools/uv.exe\n', stderr: '' }; + }); + + assert.equal(found, 'C:/Tools/uv.exe'); + assert.equal(lookupCalls.length, 1); + assert.equal(lookupCalls[0].command, process.platform === 'win32' ? 'where' : 'which'); + assert.deepEqual(lookupCalls[0].args, ['uv']); + assert.equal(lookupCalls[0].options.shell, false); + assert.equal(lookupCalls[0].options.windowsHide, true); + assert.equal(lookupCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + + it('checks Windows launcher preferences with hidden subprocess options', () => { + const probeCalls = []; + const available = mod.__testables.hasWindowsPythonLauncherVersion('3.11', (command, args, options) => { + probeCalls.push({ command, args, options }); + return { status: 0, stdout: '', stderr: 'Python 3.11.9\n' }; + }); + + assert.equal(available, true); + assert.equal(probeCalls.length, 1); + assert.equal(probeCalls[0].command, 'py'); + assert.deepEqual(probeCalls[0].args, ['-3.11', '--version']); + assert.equal(probeCalls[0].options.shell, false); + assert.equal(probeCalls[0].options.windowsHide, true); + assert.equal(probeCalls[0].options.env.PYTHONIOENCODING, 'utf-8'); + }); + it('prefers Python 3.11 or 3.10 on Windows when available', () => { if (process.platform !== 'win32') return; delete process.env.CCW_PYTHON; diff --git a/codex-lens/README.md b/codex-lens/README.md index 66f754f1..823ab5fa 100644 --- a/codex-lens/README.md +++ b/codex-lens/README.md @@ -41,6 +41,56 @@ pip install codex-lens[semantic-directml] pip install codex-lens[full] ``` +### Local ONNX Reranker Bootstrap + +Use the pinned bootstrap flow when you want the local-only reranker backend in an +existing CodexLens virtual environment without asking pip to resolve the whole +project extras set at once. + +1. Start from the CodexLens repo root and create or activate the project venv. +2. Review the pinned install manifest in `scripts/requirements-reranker-local.txt`. +3. 
Render the deterministic setup plan: + +```bash +python scripts/bootstrap_reranker_local.py --dry-run +``` + +The bootstrap script always targets the selected venv Python, installs the local +ONNX reranker stack in a fixed order, and keeps the package set pinned to the +validated Python 3.13-compatible combination: + +- `numpy==2.4.0` +- `onnxruntime==1.23.2` +- `huggingface-hub==0.36.2` +- `transformers==4.53.3` +- `optimum[onnxruntime]==2.1.0` + +When you are ready to apply it to the CodexLens venv, use: + +```bash +python scripts/bootstrap_reranker_local.py --apply +``` + +To pre-download the default local reranker model (`Xenova/ms-marco-MiniLM-L-6-v2`) +into the repo-local Hugging Face cache, use: + +```bash +python scripts/bootstrap_reranker_local.py --apply --download-model +``` + +The dry-run plan also prints the equivalent explicit model download command. On +Windows PowerShell with the default repo venv, it looks like: + +```bash +.venv/Scripts/hf.exe download Xenova/ms-marco-MiniLM-L-6-v2 --local-dir .cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2 +``` + +After installation, probe the backend from the same venv: + +```bash +python scripts/bootstrap_reranker_local.py --apply --probe +``` + ## Requirements - Python >= 3.10 diff --git a/codex-lens/benchmarks/accuracy_queries_ccw_smart_search.jsonl b/codex-lens/benchmarks/accuracy_queries_ccw_smart_search.jsonl new file mode 100644 index 00000000..737f88b6 --- /dev/null +++ b/codex-lens/benchmarks/accuracy_queries_ccw_smart_search.jsonl @@ -0,0 +1,16 @@ +{"query":"executeHybridMode dense_rerank semantic smart_search","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-semantic-routing","notes":"CCW semantic mode delegates to CodexLens dense_rerank."} +{"query":"parse CodexLens JSON output strip ANSI smart_search","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-json-fallback","notes":"Covers JSON/plain-text fallback handling for CodexLens output."} 
+{"query":"smart_search init embed search action schema","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-action-schema","notes":"Find the Zod schema that defines init/embed/search actions."} +{"query":"auto init missing job dedupe smart_search","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-auto-init","notes":"Targets background init/embed warmup and dedupe state."} +{"query":"smart_search exact mode fallback to CodexLens fts","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-exact-fallback","notes":"Tracks the exact-mode fallback path into CodexLens FTS."} +{"query":"smart_search settings snapshot embedding backend reranker backend staged stage2 mode","relevant_paths":["ccw/src/tools/smart-search.ts"],"intent":"ccw-config-snapshot","notes":"Reads local config snapshot for embedding/reranker/staged pipeline settings."} +{"query":"embedding backend fastembed local litellm api config","relevant_paths":["codex-lens/src/codexlens/config.py"],"intent":"codexlens-embedding-config","notes":"Local-only benchmark should resolve to fastembed defaults."} +{"query":"reranker backend onnx api legacy configuration","relevant_paths":["codex-lens/src/codexlens/config.py","codex-lens/src/codexlens/env_config.py"],"intent":"codexlens-reranker-config","notes":"Covers both config dataclass fields and env overrides."} +{"query":"staged stage2 mode precomputed realtime static_global_graph","relevant_paths":["codex-lens/src/codexlens/config.py","codex-lens/src/codexlens/env_config.py"],"intent":"codexlens-stage2-config","notes":"Benchmark matrix should exercise the three supported stage2 modes."} +{"query":"enable staged rerank stage 4 config","relevant_paths":["codex-lens/src/codexlens/config.py"],"intent":"codexlens-stage4-rerank","notes":"Stage 4 rerank flag needs to stay enabled for local benchmarks."} +{"query":"cascade_search dense_rerank staged pipeline 
ChainSearchEngine","relevant_paths":["codex-lens/src/codexlens/search/chain_search.py"],"intent":"chain-search-cascade","notes":"Baseline query for the central retrieval engine."} +{"query":"realtime LSP expand stage2 search pipeline","relevant_paths":["codex-lens/src/codexlens/search/chain_search.py"],"intent":"chain-search-stage2-realtime","notes":"Targets realtime stage2 expansion logic."} +{"query":"static global graph stage2 expansion implementation","relevant_paths":["codex-lens/src/codexlens/search/chain_search.py"],"intent":"chain-search-stage2-static","notes":"Targets static_global_graph stage2 expansion logic."} +{"query":"cross encoder rerank stage 4 implementation","relevant_paths":["codex-lens/src/codexlens/search/chain_search.py"],"intent":"chain-search-rerank","notes":"Relevant for dense_rerank and staged rerank latency comparisons."} +{"query":"get_reranker factory onnx backend selection","relevant_paths":["codex-lens/src/codexlens/semantic/reranker/factory.py"],"intent":"reranker-factory","notes":"Keeps the benchmark aligned with local ONNX reranker selection."} +{"query":"EMBEDDING_BACKEND and RERANKER_BACKEND environment variables","relevant_paths":["codex-lens/src/codexlens/env_config.py"],"intent":"env-overrides","notes":"Covers CCW/CodexLens local-only environment overrides."} diff --git a/codex-lens/benchmarks/compare_accuracy_labeled.py b/codex-lens/benchmarks/compare_accuracy_labeled.py index 079815fd..7000a181 100644 --- a/codex-lens/benchmarks/compare_accuracy_labeled.py +++ b/codex-lens/benchmarks/compare_accuracy_labeled.py @@ -239,6 +239,7 @@ def main() -> None: config.staged_clustering_strategy = str(args.staged_cluster_strategy or "path").strip().lower() # Stability: on some Windows setups, DirectML/ONNX can crash under load. 
config.embedding_use_gpu = False + config.reranker_use_gpu = False registry = RegistryStore() registry.initialize() @@ -362,4 +363,3 @@ def main() -> None: if __name__ == "__main__": main() - diff --git a/codex-lens/benchmarks/compare_ccw_smart_search_stage2.py b/codex-lens/benchmarks/compare_ccw_smart_search_stage2.py new file mode 100644 index 00000000..b6776bfd --- /dev/null +++ b/codex-lens/benchmarks/compare_ccw_smart_search_stage2.py @@ -0,0 +1,980 @@ +#!/usr/bin/env python +"""Benchmark local-only staged stage2 modes for CCW smart_search queries. + +This benchmark reuses the existing CodexLens benchmark style, but focuses on +the real search intents that drive CCW `smart_search`. It evaluates: + +1. `dense_rerank` baseline +2. `staged` + `precomputed` +3. `staged` + `realtime` +4. `staged` + `static_global_graph` + +Metrics: + - Hit@K + - MRR@K + - Recall@K + - latency (avg/p50/p95) + +The runner is intentionally local-only. By default it uses: + - embedding backend: `fastembed` + - reranker backend: `onnx` + +Examples: + python benchmarks/compare_ccw_smart_search_stage2.py --dry-run + python benchmarks/compare_ccw_smart_search_stage2.py --self-check + python benchmarks/compare_ccw_smart_search_stage2.py --source .. 
--k 10 + python benchmarks/compare_ccw_smart_search_stage2.py --embedding-model code --reranker-model cross-encoder/ms-marco-MiniLM-L-6-v2 +""" + +from __future__ import annotations + +import argparse +from copy import deepcopy +import gc +import json +import os +import re +import statistics +import sys +import time +from dataclasses import asdict, dataclass +from pathlib import Path +from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple + +sys.path.insert(0, str(Path(__file__).parent.parent / "src")) + +from codexlens.config import Config +from codexlens.search.chain_search import ChainSearchEngine, SearchOptions +from codexlens.search.ranking import ( + QueryIntent, + detect_query_intent, + is_generated_artifact_path, + is_test_file, + query_prefers_lexical_search, + query_targets_generated_files, +) +from codexlens.storage.path_mapper import PathMapper +from codexlens.storage.registry import RegistryStore + + +DEFAULT_SOURCE = Path(__file__).resolve().parents[2] +DEFAULT_QUERIES_FILE = Path(__file__).parent / "accuracy_queries_ccw_smart_search.jsonl" +DEFAULT_OUTPUT = Path(__file__).parent / "results" / "ccw_smart_search_stage2.json" + +VALID_STAGE2_MODES = ("precomputed", "realtime", "static_global_graph") +VALID_LOCAL_EMBEDDING_BACKENDS = ("fastembed",) +VALID_LOCAL_RERANKER_BACKENDS = ("onnx", "fastembed", "legacy") +VALID_BASELINE_METHODS = ("auto", "fts", "hybrid") +DEFAULT_LOCAL_ONNX_RERANKER_MODEL = "Xenova/ms-marco-MiniLM-L-6-v2" + + +def _now_ms() -> float: + return time.perf_counter() * 1000.0 + + +def _normalize_path_key(path: str) -> str: + try: + candidate = Path(path) + if str(candidate) and (candidate.is_absolute() or re.match(r"^[A-Za-z]:", str(candidate))): + normalized = str(candidate.resolve()) + else: + normalized = str(candidate) + except Exception: + normalized = path + normalized = normalized.replace("/", "\\") + if os.name == "nt": + normalized = normalized.lower() + return normalized + + +def _dedup_topk(paths: 
Iterable[str], k: int) -> List[str]: + output: List[str] = [] + seen: set[str] = set() + for path in paths: + if path in seen: + continue + seen.add(path) + output.append(path) + if len(output) >= k: + break + return output + + +def _first_hit_rank(topk_paths: Sequence[str], relevant: set[str]) -> Optional[int]: + for index, path in enumerate(topk_paths, start=1): + if path in relevant: + return index + return None + + +def _mrr(ranks: Sequence[Optional[int]]) -> float: + values = [1.0 / rank for rank in ranks if rank and rank > 0] + return statistics.mean(values) if values else 0.0 + + +def _mean(values: Sequence[float]) -> float: + return statistics.mean(values) if values else 0.0 + + +def _percentile(values: Sequence[float], percentile: float) -> float: + if not values: + return 0.0 + ordered = sorted(values) + if len(ordered) == 1: + return ordered[0] + index = (len(ordered) - 1) * percentile + lower = int(index) + upper = min(lower + 1, len(ordered) - 1) + if lower == upper: + return ordered[lower] + fraction = index - lower + return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction + + +def _load_labeled_queries(path: Path, limit: Optional[int]) -> List[Dict[str, Any]]: + if not path.is_file(): + raise SystemExit(f"Queries file does not exist: {path}") + + output: List[Dict[str, Any]] = [] + for raw_line in path.read_text(encoding="utf-8", errors="ignore").splitlines(): + line = raw_line.strip() + if not line or line.startswith("#"): + continue + try: + item = json.loads(line) + except Exception as exc: + raise SystemExit(f"Invalid JSONL line in {path}: {raw_line!r} ({exc})") from exc + if not isinstance(item, dict) or "query" not in item or "relevant_paths" not in item: + raise SystemExit(f"Invalid query item (expected object with query/relevant_paths): {item!r}") + relevant_paths = item.get("relevant_paths") + if not isinstance(relevant_paths, list) or not relevant_paths: + raise SystemExit(f"Query item must include non-empty relevant_paths[]: 
{item!r}") + output.append(item) + if limit is not None and len(output) >= limit: + break + return output + + +def _resolve_expected_paths(source_root: Path, paths: Sequence[str]) -> Tuple[List[str], set[str], List[str]]: + resolved_display: List[str] = [] + resolved_keys: set[str] = set() + missing: List[str] = [] + + for raw_path in paths: + candidate = Path(raw_path) + if not candidate.is_absolute(): + candidate = (source_root / candidate).resolve() + if not candidate.exists(): + missing.append(str(candidate)) + resolved_display.append(str(candidate)) + resolved_keys.add(_normalize_path_key(str(candidate))) + return resolved_display, resolved_keys, missing + + +def _validate_local_only_backends(embedding_backend: str, reranker_backend: str) -> None: + if embedding_backend not in VALID_LOCAL_EMBEDDING_BACKENDS: + raise SystemExit( + "This runner is local-only. " + f"--embedding-backend must be one of {', '.join(VALID_LOCAL_EMBEDDING_BACKENDS)}; got {embedding_backend!r}" + ) + if reranker_backend not in VALID_LOCAL_RERANKER_BACKENDS: + raise SystemExit( + "This runner is local-only. 
" + f"--reranker-backend must be one of {', '.join(VALID_LOCAL_RERANKER_BACKENDS)}; got {reranker_backend!r}" + ) + + +def _validate_stage2_modes(stage2_modes: Sequence[str]) -> List[str]: + normalized = [str(mode).strip().lower() for mode in stage2_modes if str(mode).strip()] + if not normalized: + raise SystemExit("At least one --stage2-modes entry is required") + invalid = [mode for mode in normalized if mode not in VALID_STAGE2_MODES] + if invalid: + raise SystemExit( + f"Invalid --stage2-modes entry: {invalid[0]} " + f"(valid: {', '.join(VALID_STAGE2_MODES)})" + ) + deduped: List[str] = [] + seen: set[str] = set() + for mode in normalized: + if mode in seen: + continue + seen.add(mode) + deduped.append(mode) + return deduped + + +def _validate_baseline_methods(methods: Sequence[str]) -> List[str]: + normalized = [str(method).strip().lower() for method in methods if str(method).strip()] + invalid = [method for method in normalized if method not in VALID_BASELINE_METHODS] + if invalid: + raise SystemExit( + f"Invalid --baseline-methods entry: {invalid[0]} " + f"(valid: {', '.join(VALID_BASELINE_METHODS)})" + ) + deduped: List[str] = [] + seen: set[str] = set() + for method in normalized: + if method in seen: + continue + seen.add(method) + deduped.append(method) + return deduped + + +@dataclass +class StrategyRun: + strategy_key: str + strategy: str + stage2_mode: Optional[str] + effective_method: str + execution_method: str + latency_ms: float + topk_paths: List[str] + first_hit_rank: Optional[int] + hit_at_k: bool + recall_at_k: float + generated_artifact_count: int + test_file_count: int + error: Optional[str] = None + + +@dataclass +class QueryEvaluation: + query: str + intent: Optional[str] + notes: Optional[str] + relevant_paths: List[str] + runs: Dict[str, StrategyRun] + + +@dataclass +class PairwiseDelta: + mode_a: str + mode_b: str + hit_at_k_delta: float + mrr_at_k_delta: float + avg_recall_at_k_delta: float + avg_latency_ms_delta: float + + 
+@dataclass +class StrategySpec: + strategy_key: str + strategy: str + stage2_mode: Optional[str] + + +@dataclass +class StrategyRuntime: + strategy_spec: StrategySpec + config: Config + registry: RegistryStore + engine: ChainSearchEngine + + +def _strategy_specs( + stage2_modes: Sequence[str], + include_dense_baseline: bool, + *, + baseline_methods: Sequence[str], +) -> List[StrategySpec]: + specs: List[StrategySpec] = [] + for method in baseline_methods: + specs.append(StrategySpec(strategy_key=method, strategy=method, stage2_mode=None)) + if include_dense_baseline: + specs.append(StrategySpec(strategy_key="dense_rerank", strategy="dense_rerank", stage2_mode=None)) + for stage2_mode in stage2_modes: + specs.append( + StrategySpec( + strategy_key=f"staged:{stage2_mode}", + strategy="staged", + stage2_mode=stage2_mode, + ) + ) + return specs + + +def _build_strategy_runtime(base_config: Config, strategy_spec: StrategySpec) -> StrategyRuntime: + runtime_config = deepcopy(base_config) + registry = RegistryStore() + registry.initialize() + mapper = PathMapper() + engine = ChainSearchEngine(registry=registry, mapper=mapper, config=runtime_config) + return StrategyRuntime( + strategy_spec=strategy_spec, + config=runtime_config, + registry=registry, + engine=engine, + ) + + +def _select_effective_method(query: str, requested_method: str) -> str: + requested = str(requested_method).strip().lower() + if requested != "auto": + return requested + if query_targets_generated_files(query) or query_prefers_lexical_search(query): + return "fts" + intent = detect_query_intent(query) + if intent == QueryIntent.KEYWORD: + return "fts" + if intent == QueryIntent.SEMANTIC: + return "dense_rerank" + return "hybrid" + + +def _filter_dataset_by_query_match( + dataset: Sequence[Dict[str, Any]], + query_match: Optional[str], +) -> List[Dict[str, Any]]: + """Filter labeled queries by case-insensitive substring match.""" + needle = str(query_match or "").strip().casefold() + if not needle: + 
return list(dataset) + return [ + dict(item) + for item in dataset + if needle in str(item.get("query", "")).casefold() + ] + + +def _apply_query_limit( + dataset: Sequence[Dict[str, Any]], + query_limit: Optional[int], +) -> List[Dict[str, Any]]: + """Apply the optional query limit after any dataset-level filtering.""" + if query_limit is None: + return list(dataset) + return [dict(item) for item in list(dataset)[: max(0, int(query_limit))]] + + +def _write_json_payload(path: Path, payload: Dict[str, Any]) -> None: + """Persist a benchmark payload as UTF-8 JSON.""" + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8") + + +def _write_final_outputs( + *, + output_path: Path, + progress_output: Optional[Path], + payload: Dict[str, Any], +) -> None: + """Persist the final completed payload to both result and progress outputs.""" + _write_json_payload(output_path, payload) + if progress_output is not None: + _write_json_payload(progress_output, payload) + + +def _make_progress_payload( + *, + args: argparse.Namespace, + source_root: Path, + strategy_specs: Sequence[StrategySpec], + evaluations: Sequence[QueryEvaluation], + query_index: int, + total_queries: int, + run_index: int, + total_runs: int, + current_query: str, + current_strategy_key: str, +) -> Dict[str, Any]: + """Create a partial progress snapshot for long benchmark runs.""" + return { + "status": "running", + "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"), + "source": str(source_root), + "queries_file": str(args.queries_file), + "query_count": len(evaluations), + "planned_query_count": total_queries, + "k": int(args.k), + "coarse_k": int(args.coarse_k), + "strategy_keys": [spec.strategy_key for spec in strategy_specs], + "progress": { + "completed_queries": query_index, + "total_queries": total_queries, + "completed_runs": run_index, + "total_runs": total_runs, + "current_query": current_query, + "current_strategy_key": 
current_strategy_key,
+        },
+        "evaluations": [
+            {
+                "query": evaluation.query,
+                "intent": evaluation.intent,
+                "notes": evaluation.notes,
+                "relevant_paths": evaluation.relevant_paths,
+                "runs": {key: asdict(run) for key, run in evaluation.runs.items()},
+            }
+            for evaluation in evaluations
+        ],
+    }
+
+
+def _make_search_options(method: str, *, k: int) -> SearchOptions:
+    """Translate a benchmark method name into the corresponding SearchOptions flags."""
+    normalized = str(method).strip().lower()
+    if normalized == "fts":
+        return SearchOptions(
+            total_limit=k,
+            hybrid_mode=False,
+            enable_fuzzy=False,
+            enable_vector=False,
+            pure_vector=False,
+            enable_cascade=False,
+        )
+    if normalized == "hybrid":
+        return SearchOptions(
+            total_limit=k,
+            hybrid_mode=True,
+            enable_fuzzy=False,
+            enable_vector=True,
+            pure_vector=False,
+            enable_cascade=False,
+        )
+    if normalized in {"dense_rerank", "staged"}:
+        return SearchOptions(
+            total_limit=k,
+            hybrid_mode=True,
+            enable_fuzzy=False,
+            enable_vector=True,
+            pure_vector=False,
+            enable_cascade=True,
+        )
+    raise ValueError(f"Unsupported benchmark method: {method}")
+
+
+def _run_strategy(
+    engine: ChainSearchEngine,
+    config: Config,
+    *,
+    strategy_spec: StrategySpec,
+    query: str,
+    source_path: Path,
+    k: int,
+    coarse_k: int,
+    relevant: set[str],
+) -> StrategyRun:
+    """Run one strategy for one query, capture metrics, and restore mutated config."""
+    gc.collect()
+    effective_method = _select_effective_method(query, strategy_spec.strategy)
+    execution_method = "cascade" if effective_method in {"dense_rerank", "staged"} else effective_method
+    previous_cascade_strategy = getattr(config, "cascade_strategy", None)
+    previous_stage2_mode = getattr(config, "staged_stage2_mode", None)
+
+    start_ms = _now_ms()
+    try:
+        options = _make_search_options(
+            "staged" if strategy_spec.strategy == "staged" else effective_method,
+            k=k,
+        )
+        if strategy_spec.strategy == "staged":
+            config.cascade_strategy = "staged"
+            if strategy_spec.stage2_mode:
+                config.staged_stage2_mode = strategy_spec.stage2_mode
+            result = engine.cascade_search(
+                query=query,
+                source_path=source_path,
+                k=k,
+                
coarse_k=coarse_k, + options=options, + strategy="staged", + ) + elif effective_method == "dense_rerank": + config.cascade_strategy = "dense_rerank" + result = engine.cascade_search( + query=query, + source_path=source_path, + k=k, + coarse_k=coarse_k, + options=options, + strategy="dense_rerank", + ) + else: + result = engine.search( + query=query, + source_path=source_path, + options=options, + ) + latency_ms = _now_ms() - start_ms + paths_raw = [item.path for item in (result.results or []) if getattr(item, "path", None)] + topk = _dedup_topk((_normalize_path_key(path) for path in paths_raw), k=k) + rank = _first_hit_rank(topk, relevant) + recall = 0.0 + if relevant: + recall = len(set(topk) & relevant) / float(len(relevant)) + return StrategyRun( + strategy_key=strategy_spec.strategy_key, + strategy=strategy_spec.strategy, + stage2_mode=strategy_spec.stage2_mode, + effective_method=effective_method, + execution_method=execution_method, + latency_ms=latency_ms, + topk_paths=topk, + first_hit_rank=rank, + hit_at_k=rank is not None, + recall_at_k=recall, + generated_artifact_count=sum(1 for path in topk if is_generated_artifact_path(path)), + test_file_count=sum(1 for path in topk if is_test_file(path)), + error=None, + ) + except Exception as exc: + latency_ms = _now_ms() - start_ms + return StrategyRun( + strategy_key=strategy_spec.strategy_key, + strategy=strategy_spec.strategy, + stage2_mode=strategy_spec.stage2_mode, + effective_method=effective_method, + execution_method=execution_method, + latency_ms=latency_ms, + topk_paths=[], + first_hit_rank=None, + hit_at_k=False, + recall_at_k=0.0, + generated_artifact_count=0, + test_file_count=0, + error=f"{type(exc).__name__}: {exc}", + ) + finally: + config.cascade_strategy = previous_cascade_strategy + config.staged_stage2_mode = previous_stage2_mode + + +def _summarize_runs(runs: Sequence[StrategyRun]) -> Dict[str, Any]: + latencies = [run.latency_ms for run in runs if not run.error] + ranks = [run.first_hit_rank 
for run in runs] + effective_method_counts: Dict[str, int] = {} + for run in runs: + effective_method_counts[run.effective_method] = effective_method_counts.get(run.effective_method, 0) + 1 + return { + "query_count": len(runs), + "hit_at_k": _mean([1.0 if run.hit_at_k else 0.0 for run in runs]), + "mrr_at_k": _mrr(ranks), + "avg_recall_at_k": _mean([run.recall_at_k for run in runs]), + "avg_latency_ms": _mean(latencies), + "p50_latency_ms": _percentile(latencies, 0.50), + "p95_latency_ms": _percentile(latencies, 0.95), + "avg_generated_artifact_count": _mean([float(run.generated_artifact_count) for run in runs]), + "avg_test_file_count": _mean([float(run.test_file_count) for run in runs]), + "runs_with_generated_artifacts": sum(1 for run in runs if run.generated_artifact_count > 0), + "runs_with_test_files": sum(1 for run in runs if run.test_file_count > 0), + "effective_methods": effective_method_counts, + "errors": sum(1 for run in runs if run.error), + } + + +def _build_pairwise_deltas(stage2_summaries: Dict[str, Dict[str, Any]]) -> List[PairwiseDelta]: + modes = list(stage2_summaries.keys()) + deltas: List[PairwiseDelta] = [] + for left_index in range(len(modes)): + for right_index in range(left_index + 1, len(modes)): + left = modes[left_index] + right = modes[right_index] + left_summary = stage2_summaries[left] + right_summary = stage2_summaries[right] + deltas.append( + PairwiseDelta( + mode_a=left, + mode_b=right, + hit_at_k_delta=left_summary["hit_at_k"] - right_summary["hit_at_k"], + mrr_at_k_delta=left_summary["mrr_at_k"] - right_summary["mrr_at_k"], + avg_recall_at_k_delta=left_summary["avg_recall_at_k"] - right_summary["avg_recall_at_k"], + avg_latency_ms_delta=left_summary["avg_latency_ms"] - right_summary["avg_latency_ms"], + ) + ) + return deltas + + +def _make_plan_payload( + *, + args: argparse.Namespace, + source_root: Path, + dataset: Sequence[Dict[str, Any]], + baseline_methods: Sequence[str], + stage2_modes: Sequence[str], + strategy_specs: 
Sequence[StrategySpec], +) -> Dict[str, Any]: + return { + "mode": "dry-run" if args.dry_run else "self-check", + "local_only": True, + "source": str(source_root), + "queries_file": str(args.queries_file), + "query_count": len(dataset), + "query_match": args.query_match, + "k": int(args.k), + "coarse_k": int(args.coarse_k), + "baseline_methods": list(baseline_methods), + "stage2_modes": list(stage2_modes), + "strategy_keys": [spec.strategy_key for spec in strategy_specs], + "local_backends": { + "embedding_backend": args.embedding_backend, + "embedding_model": args.embedding_model, + "reranker_backend": args.reranker_backend, + "reranker_model": args.reranker_model, + "embedding_use_gpu": bool(args.embedding_use_gpu), + "reranker_use_gpu": bool(args.reranker_use_gpu), + }, + "output": str(args.output), + "progress_output": str(args.progress_output) if args.progress_output else None, + "dataset_preview": [ + { + "query": item.get("query"), + "intent": item.get("intent"), + "relevant_paths": item.get("relevant_paths"), + } + for item in list(dataset)[: min(3, len(dataset))] + ], + } + + +def build_parser() -> argparse.ArgumentParser: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--source", + type=Path, + default=DEFAULT_SOURCE, + help="Source root to benchmark. 
Defaults to the repository root so CCW and CodexLens paths resolve together.", + ) + parser.add_argument( + "--queries-file", + type=Path, + default=DEFAULT_QUERIES_FILE, + help="Labeled JSONL dataset of CCW smart_search queries", + ) + parser.add_argument("--query-limit", type=int, default=None, help="Optional query limit") + parser.add_argument( + "--query-match", + type=str, + default=None, + help="Optional case-insensitive substring filter for selecting specific benchmark queries.", + ) + parser.add_argument("--k", type=int, default=10, help="Top-k to evaluate") + parser.add_argument("--coarse-k", type=int, default=100, help="Stage-1 coarse_k") + parser.add_argument( + "--baseline-methods", + nargs="*", + default=list(VALID_BASELINE_METHODS), + help="Requested smart_search baselines to compare before staged modes (valid: auto, fts, hybrid).", + ) + parser.add_argument( + "--stage2-modes", + nargs="*", + default=list(VALID_STAGE2_MODES), + help="Stage-2 modes to compare", + ) + parser.add_argument("--warmup", type=int, default=0, help="Warmup iterations per strategy") + parser.add_argument( + "--embedding-backend", + default="fastembed", + help="Local embedding backend. This runner only accepts fastembed.", + ) + parser.add_argument( + "--embedding-model", + default="code", + help="Embedding model/profile for the local embedding backend", + ) + parser.add_argument( + "--embedding-use-gpu", + action="store_true", + help="Enable GPU acceleration for local embeddings. Off by default for stability.", + ) + parser.add_argument( + "--reranker-backend", + default="onnx", + help="Local reranker backend. Supported local values: onnx, fastembed, legacy.", + ) + parser.add_argument( + "--reranker-model", + default=DEFAULT_LOCAL_ONNX_RERANKER_MODEL, + help="Reranker model name for the local reranker backend", + ) + parser.add_argument( + "--reranker-use-gpu", + action="store_true", + help="Enable GPU acceleration for the local reranker. 
Off by default for stability.", + ) + parser.add_argument( + "--skip-dense-baseline", + action="store_true", + help="Only compare staged stage2 modes and skip the dense_rerank baseline.", + ) + parser.add_argument( + "--dry-run", + action="store_true", + help="Validate dataset/config and print the benchmark plan without running retrieval.", + ) + parser.add_argument( + "--self-check", + action="store_true", + help="Smoke-check the entrypoint by validating dataset, source paths, and stage matrix wiring.", + ) + parser.add_argument( + "--output", + type=Path, + default=DEFAULT_OUTPUT, + help="Output JSON path", + ) + parser.add_argument( + "--progress-output", + type=Path, + default=None, + help="Optional JSON path updated after each query with partial progress and completed runs.", + ) + return parser + + +def main() -> None: + parser = build_parser() + args = parser.parse_args() + + source_root = args.source.expanduser().resolve() + if not source_root.exists(): + raise SystemExit(f"Source path does not exist: {source_root}") + if int(args.k) <= 0: + raise SystemExit("--k must be > 0") + if int(args.coarse_k) <= 0: + raise SystemExit("--coarse-k must be > 0") + if int(args.coarse_k) < int(args.k): + raise SystemExit("--coarse-k must be >= --k") + if int(args.warmup) < 0: + raise SystemExit("--warmup must be >= 0") + + embedding_backend = str(args.embedding_backend).strip().lower() + reranker_backend = str(args.reranker_backend).strip().lower() + _validate_local_only_backends(embedding_backend, reranker_backend) + baseline_methods = _validate_baseline_methods(args.baseline_methods) + stage2_modes = _validate_stage2_modes(args.stage2_modes) + + dataset = _load_labeled_queries(args.queries_file, None) + dataset = _filter_dataset_by_query_match(dataset, args.query_match) + dataset = _apply_query_limit(dataset, args.query_limit) + if not dataset: + raise SystemExit("No queries to run") + + missing_paths: List[str] = [] + for item in dataset: + _, _, item_missing = 
_resolve_expected_paths(source_root, [str(path) for path in item["relevant_paths"]]) + missing_paths.extend(item_missing) + if missing_paths: + preview = ", ".join(missing_paths[:3]) + raise SystemExit( + "Dataset relevant_paths do not resolve under the selected source root. " + f"Examples: {preview}" + ) + + strategy_specs = _strategy_specs( + stage2_modes, + include_dense_baseline=not args.skip_dense_baseline, + baseline_methods=baseline_methods, + ) + + if args.dry_run or args.self_check: + payload = _make_plan_payload( + args=args, + source_root=source_root, + dataset=dataset, + baseline_methods=baseline_methods, + stage2_modes=stage2_modes, + strategy_specs=strategy_specs, + ) + if args.self_check: + payload["status"] = "ok" + payload["checks"] = { + "dataset_loaded": True, + "stage2_matrix_size": len(stage2_modes), + "local_only_validation": True, + "source_path_exists": True, + } + print(json.dumps(payload, ensure_ascii=False, indent=2)) + return + + config = Config.load() + config.cascade_strategy = "staged" + config.enable_staged_rerank = True + config.enable_cross_encoder_rerank = True + config.embedding_backend = embedding_backend + config.embedding_model = str(args.embedding_model).strip() + config.embedding_use_gpu = bool(args.embedding_use_gpu) + config.embedding_auto_embed_missing = False + config.reranker_backend = reranker_backend + config.reranker_model = str(args.reranker_model).strip() + config.reranker_use_gpu = bool(args.reranker_use_gpu) + + strategy_runtimes = { + spec.strategy_key: _build_strategy_runtime(config, spec) + for spec in strategy_specs + } + + evaluations: List[QueryEvaluation] = [] + total_queries = len(dataset) + total_runs = total_queries * len(strategy_specs) + completed_runs = 0 + + try: + if int(args.warmup) > 0: + warm_query = str(dataset[0]["query"]).strip() + warm_relevant_paths = [str(path) for path in dataset[0]["relevant_paths"]] + _, warm_relevant, _ = _resolve_expected_paths(source_root, warm_relevant_paths) + for 
spec in strategy_specs: + runtime = strategy_runtimes[spec.strategy_key] + for _ in range(int(args.warmup)): + _run_strategy( + runtime.engine, + runtime.config, + strategy_spec=spec, + query=warm_query, + source_path=source_root, + k=min(int(args.k), 5), + coarse_k=min(int(args.coarse_k), 50), + relevant=warm_relevant, + ) + + for index, item in enumerate(dataset, start=1): + query = str(item.get("query", "")).strip() + if not query: + continue + print(f"[query {index}/{total_queries}] {query}", flush=True) + relevant_paths, relevant, _ = _resolve_expected_paths( + source_root, + [str(path) for path in item["relevant_paths"]], + ) + runs: Dict[str, StrategyRun] = {} + for spec in strategy_specs: + if args.progress_output is not None: + _write_json_payload( + args.progress_output, + _make_progress_payload( + args=args, + source_root=source_root, + strategy_specs=strategy_specs, + evaluations=evaluations, + query_index=index - 1, + total_queries=total_queries, + run_index=completed_runs, + total_runs=total_runs, + current_query=query, + current_strategy_key=spec.strategy_key, + ), + ) + print( + f"[run {completed_runs + 1}/{total_runs}] " + f"strategy={spec.strategy_key} query={query}", + flush=True, + ) + runtime = strategy_runtimes[spec.strategy_key] + runs[spec.strategy_key] = _run_strategy( + runtime.engine, + runtime.config, + strategy_spec=spec, + query=query, + source_path=source_root, + k=int(args.k), + coarse_k=int(args.coarse_k), + relevant=relevant, + ) + completed_runs += 1 + run = runs[spec.strategy_key] + outcome = "error" if run.error else "ok" + print( + f"[done {completed_runs}/{total_runs}] " + f"strategy={spec.strategy_key} outcome={outcome} " + f"latency_ms={run.latency_ms:.2f} " + f"first_hit_rank={run.first_hit_rank}", + flush=True, + ) + evaluations.append( + QueryEvaluation( + query=query, + intent=str(item.get("intent")) if item.get("intent") is not None else None, + notes=str(item.get("notes")) if item.get("notes") is not None else None, + 
relevant_paths=relevant_paths, + runs=runs, + ) + ) + if args.progress_output is not None: + _write_json_payload( + args.progress_output, + _make_progress_payload( + args=args, + source_root=source_root, + strategy_specs=strategy_specs, + evaluations=evaluations, + query_index=index, + total_queries=total_queries, + run_index=completed_runs, + total_runs=total_runs, + current_query=query, + current_strategy_key="complete", + ), + ) + finally: + for runtime in strategy_runtimes.values(): + try: + runtime.engine.close() + except Exception: + pass + for runtime in strategy_runtimes.values(): + try: + runtime.registry.close() + except Exception: + pass + + strategy_summaries: Dict[str, Dict[str, Any]] = {} + for spec in strategy_specs: + spec_runs = [evaluation.runs[spec.strategy_key] for evaluation in evaluations if spec.strategy_key in evaluation.runs] + summary = _summarize_runs(spec_runs) + summary["strategy"] = spec.strategy + summary["stage2_mode"] = spec.stage2_mode + strategy_summaries[spec.strategy_key] = summary + + stage2_mode_matrix = { + mode: strategy_summaries[f"staged:{mode}"] + for mode in stage2_modes + if f"staged:{mode}" in strategy_summaries + } + pairwise_deltas = [asdict(item) for item in _build_pairwise_deltas(stage2_mode_matrix)] + + payload = { + "status": "completed", + "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"), + "source": str(source_root), + "queries_file": str(args.queries_file), + "query_count": len(evaluations), + "query_match": args.query_match, + "k": int(args.k), + "coarse_k": int(args.coarse_k), + "local_only": True, + "strategies": strategy_summaries, + "stage2_mode_matrix": stage2_mode_matrix, + "pairwise_stage2_deltas": pairwise_deltas, + "config": { + "embedding_backend": config.embedding_backend, + "embedding_model": config.embedding_model, + "embedding_use_gpu": bool(config.embedding_use_gpu), + "reranker_backend": config.reranker_backend, + "reranker_model": config.reranker_model, + "reranker_use_gpu": 
bool(config.reranker_use_gpu), + "enable_staged_rerank": bool(config.enable_staged_rerank), + "enable_cross_encoder_rerank": bool(config.enable_cross_encoder_rerank), + }, + "progress_output": str(args.progress_output) if args.progress_output else None, + "evaluations": [ + { + "query": evaluation.query, + "intent": evaluation.intent, + "notes": evaluation.notes, + "relevant_paths": evaluation.relevant_paths, + "runs": {key: asdict(run) for key, run in evaluation.runs.items()}, + } + for evaluation in evaluations + ], + } + + _write_final_outputs( + output_path=args.output, + progress_output=args.progress_output, + payload=payload, + ) + print(json.dumps(payload, ensure_ascii=False, indent=2)) + + +if __name__ == "__main__": + main() diff --git a/codex-lens/benchmarks/compare_staged_realtime_vs_dense_rerank.py b/codex-lens/benchmarks/compare_staged_realtime_vs_dense_rerank.py index 1b7ba079..fb6b26a1 100644 --- a/codex-lens/benchmarks/compare_staged_realtime_vs_dense_rerank.py +++ b/codex-lens/benchmarks/compare_staged_realtime_vs_dense_rerank.py @@ -280,8 +280,9 @@ def main() -> None: if args.staged_cluster_strategy: config.staged_clustering_strategy = str(args.staged_cluster_strategy) # Stability: on some Windows setups, fastembed + DirectML can crash under load. - # Dense_rerank uses the embedding backend that matches the index; force CPU here. + # Force local embeddings and reranking onto CPU for reproducible benchmark runs. 
config.embedding_use_gpu = False + config.reranker_use_gpu = False registry = RegistryStore() registry.initialize() mapper = PathMapper() diff --git a/codex-lens/benchmarks/results/ccw_smart_search_stage2.json b/codex-lens/benchmarks/results/ccw_smart_search_stage2.json new file mode 100644 index 00000000..418bac3e --- /dev/null +++ b/codex-lens/benchmarks/results/ccw_smart_search_stage2.json @@ -0,0 +1,1704 @@ +{ + "timestamp": "2026-03-12 15:52:13", + "source": "D:\\Claude_dms3", + "queries_file": "D:\\Claude_dms3\\codex-lens\\benchmarks\\accuracy_queries_ccw_smart_search.jsonl", + "query_count": 16, + "k": 10, + "coarse_k": 100, + "local_only": true, + "strategies": { + "dense_rerank": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 2493.8517937501892, + "p50_latency_ms": 2304.0422499999404, + "p95_latency_ms": 4031.03429999575, + "errors": 0, + "strategy": "dense_rerank", + "stage2_mode": null + }, + "staged:precomputed": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 2238.0576249985024, + "p50_latency_ms": 1962.1620500013232, + "p95_latency_ms": 3110.8512249961495, + "errors": 0, + "strategy": "staged", + "stage2_mode": "precomputed" + }, + "staged:realtime": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 10686.986462499015, + "p50_latency_ms": 7027.59129999578, + "p95_latency_ms": 28732.387600000948, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "staged:static_global_graph": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 2284.2186249988154, + "p50_latency_ms": 2174.274800002575, + "p95_latency_ms": 3254.683274999261, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "stage2_mode_matrix": { + "precomputed": { + "query_count": 16, + "hit_at_k": 0.0, + 
"mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 2238.0576249985024, + "p50_latency_ms": 1962.1620500013232, + "p95_latency_ms": 3110.8512249961495, + "errors": 0, + "strategy": "staged", + "stage2_mode": "precomputed" + }, + "realtime": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 10686.986462499015, + "p50_latency_ms": 7027.59129999578, + "p95_latency_ms": 28732.387600000948, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "static_global_graph": { + "query_count": 16, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 2284.2186249988154, + "p50_latency_ms": 2174.274800002575, + "p95_latency_ms": 3254.683274999261, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "pairwise_stage2_deltas": [ + { + "mode_a": "precomputed", + "mode_b": "realtime", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": -8448.928837500513 + }, + { + "mode_a": "precomputed", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": -46.161000000312924 + }, + { + "mode_a": "realtime", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 8402.7678375002 + } + ], + "config": { + "embedding_backend": "fastembed", + "embedding_model": "code", + "embedding_use_gpu": false, + "reranker_backend": "onnx", + "reranker_model": "D:/Claude_dms3/codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2", + "enable_staged_rerank": true, + "enable_cross_encoder_rerank": true + }, + "evaluations": [ + { + "query": "executeHybridMode dense_rerank semantic smart_search", + "intent": "ccw-semantic-routing", + "notes": "CCW semantic mode delegates to CodexLens dense_rerank.", + "relevant_paths": [ + 
"D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 5607.933899998665, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\list.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\view.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1853.0870999991894, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 
10468.899399995804, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1445.837599992752, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "parse CodexLens JSON output strip ANSI smart_search", + "intent": "ccw-json-fallback", + "notes": "Covers JSON/plain-text fallback handling for CodexLens output.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { 
+ "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1518.7583000063896, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\mcp-templates-db.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1467.957000002265, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 35793.74619999528, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + 
"d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2019.9724999964237, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "smart_search init embed search action schema", + "intent": "ccw-action-schema", + "notes": "Find the Zod schema that defines init/embed/search actions.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2091.47919999063, + "topk_paths": [ + 
"d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 2017.3953999876976, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 2941.078400015831, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1921.6328999996185, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "auto init missing job dedupe smart_search", + "intent": "ccw-auto-init", + "notes": "Targets background init/embed warmup and dedupe state.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1662.2750000059605, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + 
"d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\cache-manager.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\dashboard-launcher.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1746.6091000139713, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\cli-session-mux.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 6291.47570002079, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\cli-session-mux.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1718.0125000029802, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\team-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\cli-session-mux.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "smart_search exact mode fallback to CodexLens fts", + "intent": "ccw-exact-fallback", + "notes": "Tracks the exact-mode fallback path into CodexLens FTS.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1511.011400014162, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\provider-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\secret-redactor.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1897.7800999879837, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\codexlens-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 6647.179499998689, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\codexlens-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2328.577100008726, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\codexlens-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "smart_search settings snapshot embedding backend reranker backend staged stage2 mode", + "intent": "ccw-config-snapshot", + "notes": "Reads local config snapshot for embedding/reranker/staged pipeline settings.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2516.6053000092506, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\help-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 2778.8519999980927, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 4940.330799981952, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 3191.194299995899, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "embedding backend fastembed local litellm api config", + "intent": "codexlens-embedding-config", + "notes": "Local-only benchmark should resolve to fastembed defaults.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\config.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2773.382699996233, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 2465.842600002885, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 17898.587700009346, + "topk_paths": [ + "d:\\claude_dms3\\codex-lens\\.venv\\lib\\site-packages\\sympy\\plotting\\backends\\base_backend.py", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.js", + "d:\\claude_dms3\\ccw\\dist\\core\\pattern-detector.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + 
"d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 3331.694400012493, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\file-reader.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "reranker backend onnx api legacy configuration", + "intent": "codexlens-reranker-config", + "notes": "Covers both config dataclass fields and env overrides.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\config.py", + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\env_config.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 3433.85640001297, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-context-builder.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\issue.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 2722.7298999875784, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 6998.953399986029, + "topk_paths": [ + "d:\\claude_dms3\\codex-lens\\.venv\\lib\\site-packages\\sympy\\plotting\\backends\\base_backend.py", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\data-aggregator.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + 
"d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2707.838899999857, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\cli.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "staged stage2 mode precomputed realtime static_global_graph", + "intent": "codexlens-stage2-config", + "notes": "Benchmark matrix should exercise the three supported stage2 modes.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\config.py", + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\env_config.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2557.460299998522, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\python-utils.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 2611.47199998796, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\team.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\pending-question-service.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 9986.3125, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\team.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\pending-question-service.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2705.1958999931812, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\team.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\pending-question-service.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "enable staged rerank stage 4 config", + "intent": "codexlens-stage4-rerank", + "notes": "Stage 4 rerank flag needs to stay enabled for local benchmarks.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\config.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2839.552300006151, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\python-utils.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.d.ts", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\session-path-resolver.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\orchestrator-routes.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\data-aggregator.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 3044.0294999927282, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\ccw-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\session-path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 12196.75379998982, + "topk_paths": [ + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\semantic\\reranker\\fastembed_reranker.py", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\ccw-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\session-path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2919.969099998474, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\ccw-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\session-path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "cascade_search dense_rerank staged pipeline ChainSearchEngine", + "intent": "chain-search-cascade", + "notes": "Baseline query for the central retrieval engine.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\search\\chain_search.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 3082.173699989915, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 3012.5525999963284, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\memory.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 10854.694199994206, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\memory.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 3229.01289999485, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\rate-limiter.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\core-memory-store.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-validator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\memory.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "realtime LSP expand stage2 search pipeline", + "intent": "chain-search-stage2-realtime", + "notes": "Targets realtime stage2 expansion logic.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\search\\chain_search.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 3505.4010999947786, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\rules-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-queries.js", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-extraction-pipeline.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 3311.3164000064135, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 26378.601400002837, + "topk_paths": [ + "d:\\claude_dms3\\codex-lens\\.venv\\lib\\site-packages\\optimum\\onnxruntime\\pipelines.py", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 2472.5419999957085, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\outline-parser.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\api-key-tester.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\dashboard-generator.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\core-memory.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "static global graph stage2 expansion implementation", + "intent": "chain-search-stage2-static", + "notes": "Targets static_global_graph stage2 expansion logic.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\search\\chain_search.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1676.1588000059128, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\system-routes.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + 
"d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\team.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1614.9786999970675, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-extraction-pipeline.js", + "d:\\claude_dms3\\ccw\\dist\\core\\pattern-detector.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 2153.07349999249, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-extraction-pipeline.js", + "d:\\claude_dms3\\ccw\\dist\\core\\pattern-detector.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js" + ], + 
"first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1658.4901999980211, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\serve.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\health-check-service.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-extraction-pipeline.js", + "d:\\claude_dms3\\ccw\\dist\\core\\pattern-detector.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "cross encoder rerank stage 4 implementation", + "intent": "chain-search-rerank", + "notes": "Relevant for dense_rerank and staged rerank latency comparisons.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\search\\chain_search.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1556.9279999881983, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\claude-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\commands-routes.js", + 
"d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1772.8751000016928, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\cache-manager.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 7056.229200005531, + "topk_paths": [ + "d:\\claude_dms3\\codex-lens\\.venv\\lib\\site-packages\\fastembed\\rerank\\cross_encoder\\onnx_text_cross_encoder.py", + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\cache-manager.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + 
"error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1721.4015000015497, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\commands\\install.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\package-discovery.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\config-backup.js", + "d:\\claude_dms3\\ccw\\dist\\core\\server.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\cache-manager.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "get_reranker factory onnx backend selection", + "intent": "reranker-factory", + "notes": "Keeps the benchmark aligned with local ONNX reranker selection.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\semantic\\reranker\\factory.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 2038.9054999947548, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\uninstall.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\data-aggregator.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + 
"d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1906.9287000149488, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 4809.299199998379, + "topk_paths": [ + "d:\\claude_dms3\\.workflow\\.bench\\ccw-smart-search-mini-20260312\\codex-lens\\src\\codexlens\\semantic\\reranker\\factory.py", + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + 
"staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 1549.4464999884367, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\discovery-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\uv-manager.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\loop.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\core\\websocket.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "EMBEDDING_BACKEND and RERANKER_BACKEND environment variables", + "intent": "env-overrides", + "notes": "Covers CCW/CodexLens local-only environment overrides.", + "relevant_paths": [ + "D:\\Claude_dms3\\codex-lens\\src\\codexlens\\env_config.py" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 1529.7467999905348, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts", + "d:\\claude_dms3\\ccw\\dist\\commands\\upgrade.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\react-frontend.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\docs-frontend.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\codexlens-path.d.ts", + "d:\\claude_dms3\\ccw\\dist\\utils\\python-utils.d.ts" + ], + "first_hit_rank": null, + 
"hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 1584.515799999237, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-job-scheduler.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 5576.568499997258, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-job-scheduler.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": 
"static_global_graph", + "latency_ms": 1626.6797000020742, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\dist\\assets\\index-b4psv8bd.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\path-resolver.d.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\files-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\graph-routes.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\remote-notification-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\services\\flow-executor.js", + "d:\\claude_dms3\\ccw\\dist\\commands\\workflow.js", + "d:\\claude_dms3\\ccw\\dist\\core\\unified-memory-service.js", + "d:\\claude_dms3\\ccw\\dist\\core\\memory-job-scheduler.js", + "d:\\claude_dms3\\ccw\\dist\\utils\\shell-escape.d.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + } + ] +} \ No newline at end of file diff --git a/codex-lens/benchmarks/results/ccw_smart_search_stage2_sample4_20260314.json b/codex-lens/benchmarks/results/ccw_smart_search_stage2_sample4_20260314.json new file mode 100644 index 00000000..cb40f339 --- /dev/null +++ b/codex-lens/benchmarks/results/ccw_smart_search_stage2_sample4_20260314.json @@ -0,0 +1,526 @@ +{ + "timestamp": "2026-03-14 23:16:55", + "source": "D:\\Claude_dms3", + "queries_file": "D:\\Claude_dms3\\codex-lens\\benchmarks\\accuracy_queries_ccw_smart_search.jsonl", + "query_count": 4, + "k": 10, + "coarse_k": 100, + "local_only": true, + "strategies": { + "dense_rerank": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 20171.940174996853, + "p50_latency_ms": 14222.247749984264, + "p95_latency_ms": 35222.31535999476, + "errors": 0, + "strategy": "dense_rerank", + "stage2_mode": null + }, + "staged:precomputed": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13679.793299987912, + "p50_latency_ms": 12918.63379997015, + "p95_latency_ms": 16434.964765003322, + "errors": 0, + "strategy": 
"staged", + "stage2_mode": "precomputed" + }, + "staged:realtime": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13885.101849973202, + "p50_latency_ms": 13826.323699980974, + "p95_latency_ms": 14867.712269958853, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "staged:static_global_graph": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13336.124025002122, + "p50_latency_ms": 13415.476950019598, + "p95_latency_ms": 13514.329230004549, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "stage2_mode_matrix": { + "precomputed": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13679.793299987912, + "p50_latency_ms": 12918.63379997015, + "p95_latency_ms": 16434.964765003322, + "errors": 0, + "strategy": "staged", + "stage2_mode": "precomputed" + }, + "realtime": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13885.101849973202, + "p50_latency_ms": 13826.323699980974, + "p95_latency_ms": 14867.712269958853, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "static_global_graph": { + "query_count": 4, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 13336.124025002122, + "p50_latency_ms": 13415.476950019598, + "p95_latency_ms": 13514.329230004549, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "pairwise_stage2_deltas": [ + { + "mode_a": "precomputed", + "mode_b": "realtime", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": -205.30854998528957 + }, + { + "mode_a": "precomputed", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 
343.66927498579025 + }, + { + "mode_a": "realtime", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 548.9778249710798 + } + ], + "config": { + "embedding_backend": "fastembed", + "embedding_model": "code", + "embedding_use_gpu": false, + "reranker_backend": "onnx", + "reranker_model": "cross-encoder/ms-marco-MiniLM-L-6-v2", + "enable_staged_rerank": true, + "enable_cross_encoder_rerank": true + }, + "evaluations": [ + { + "query": "executeHybridMode dense_rerank semantic smart_search", + "intent": "ccw-semantic-routing", + "notes": "CCW semantic mode delegates to CodexLens dense_rerank.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 38829.27079999447, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\core\\routes\\issue-routes.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\session-manager.ts", + "d:\\claude_dms3\\ccw\\src\\types\\queue-types.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\nativesessionpanel.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts", + "d:\\claude_dms3\\ccw\\src\\core\\memory-extraction-pipeline.ts", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\skills-page.spec.ts", + "d:\\claude_dms3\\ccw\\dist\\tools\\discover-design-files.js", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\api-settings\\clisettingsmodal.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 16915.833400011063, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + 
"d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 13961.2567999959, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 12986.330999970436, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + 
"d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "parse CodexLens JSON output strip ANSI smart_search", + "intent": "ccw-json-fallback", + "notes": "Covers JSON/plain-text fallback handling for CodexLens output.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 14782.901199996471, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\codex-lens-lsp.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\issue\\queue\\queueexecuteinsession.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\terminal-dashboard\\queuepanel.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\usewebsocket.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useflows.ts", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-error-monitoring.spec.ts", + "d:\\claude_dms3\\ccw\\tests\\native-session-discovery.test.ts", + "d:\\claude_dms3\\ccw\\src\\core\\services\\checkpoint-service.ts", + "d:\\claude_dms3\\ccw\\tests\\integration\\system-routes.test.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 13710.042499959469, + "topk_paths": [ + 
"d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\userealtimeupdates.ts", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\stores\\queueexecutionstore.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\themeshare.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\clistreampanel.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\terminal-panel\\queueexecutionlistview.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\test\\i18n.tsx", + "d:\\claude_dms3\\ccw\\dist\\core\\history-importer.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 15027.674999952316, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\userealtimeupdates.ts", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\stores\\queueexecutionstore.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\themeshare.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\clistreampanel.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\terminal-panel\\queueexecutionlistview.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\test\\i18n.tsx", + "d:\\claude_dms3\\ccw\\dist\\core\\history-importer.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 13389.622500002384, + "topk_paths": [ + 
"d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\userealtimeupdates.ts", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\stores\\queueexecutionstore.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\themeshare.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\clistreampanel.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\terminal-panel\\queueexecutionlistview.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\test\\i18n.tsx", + "d:\\claude_dms3\\ccw\\dist\\core\\history-importer.js" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "smart_search init embed search action schema", + "intent": "ccw-action-schema", + "notes": "Find the Zod schema that defines init/embed/search actions.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 13661.594299972057, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\ask-question.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\a2ui\\a2uipopupcard.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\discovery-routes.ts", + "d:\\claude_dms3\\ccw\\src\\core\\a2ui\\a2uiwebsockethandler.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\discovery.spec.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\__tests__\\ask-question.test.ts", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\a2ui\\a2uiwebsockethandler.js", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\dashboard.spec.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + 
"staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 12127.225099980831, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\src\\core\\lite-scanner-complete.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\themeselector.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\team\\teamheader.tsx", + "d:\\claude_dms3\\ccw\\src\\tools\\ask-question.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\a2ui\\a2uipopupcard.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\issue\\discovery\\findinglist.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\api-settings\\clisettingsmodal.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\discovery-routes.ts", + "d:\\claude_dms3\\ccw\\src\\core\\a2ui\\a2uiwebsockethandler.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 12860.084999978542, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\src\\core\\lite-scanner-complete.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\themeselector.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\team\\teamheader.tsx", + "d:\\claude_dms3\\ccw\\src\\tools\\ask-question.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\a2ui\\a2uipopupcard.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\issue\\discovery\\findinglist.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\api-settings\\clisettingsmodal.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\discovery-routes.ts", + "d:\\claude_dms3\\ccw\\src\\core\\a2ui\\a2uiwebsockethandler.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + 
"strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 13441.331400036812, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\src\\core\\lite-scanner-complete.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\themeselector.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\team\\teamheader.tsx", + "d:\\claude_dms3\\ccw\\src\\tools\\ask-question.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\a2ui\\a2uipopupcard.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\issue\\discovery\\findinglist.tsx", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\api-settings\\clisettingsmodal.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\discovery-routes.ts", + "d:\\claude_dms3\\ccw\\src\\core\\a2ui\\a2uiwebsockethandler.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + }, + { + "query": "auto init missing job dedupe smart_search", + "intent": "ccw-auto-init", + "notes": "Targets background init/embed warmup and dedupe state.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "latency_ms": 13413.994400024414, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\frontend\\src\\pages\\memorypage.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\memory-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\usememory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\batchoperationtoolbar.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\memory.spec.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useprompthistory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\stores\\flowstore.ts", + "d:\\claude_dms3\\ccw\\src\\services\\deepwiki-service.ts", + 
"d:\\claude_dms3\\ccw\\src\\core\\routes\\claude-routes.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "latency_ms": 11966.072200000286, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\ui\\commandcombobox.tsx", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\search\\global_graph_expander.py", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\orchestrator\\orchestrationplanbuilder.ts", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\search\\global_graph_expander.py", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\pages\\memorypage.tsx" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "latency_ms": 13691.39059996605, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\ui\\commandcombobox.tsx", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\search\\global_graph_expander.py", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\orchestrator\\orchestrationplanbuilder.ts", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\search\\global_graph_expander.py", + 
"d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\pages\\memorypage.tsx" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "latency_ms": 13527.211199998856, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\ui\\commandcombobox.tsx", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\search\\global_graph_expander.py", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\orchestrator\\orchestrationplanbuilder.ts", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\lsp\\handlers.py", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\search\\global_graph_expander.py", + "d:\\claude_dms3\\codex-lens\\build\\lib\\codexlens\\api\\definition.py", + "d:\\claude_dms3\\ccw\\frontend\\src\\pages\\memorypage.tsx" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "error": null + } + } + } + ] +} \ No newline at end of file diff --git a/codex-lens/benchmarks/results/ccw_smart_search_stage2_smoke1_cpu_reranker_20260314.json b/codex-lens/benchmarks/results/ccw_smart_search_stage2_smoke1_cpu_reranker_20260314.json new file mode 100644 index 00000000..a6f5dc8d --- /dev/null +++ b/codex-lens/benchmarks/results/ccw_smart_search_stage2_smoke1_cpu_reranker_20260314.json @@ -0,0 +1,415 @@ +{ + "timestamp": "2026-03-15 00:19:16", + "source": "D:\\Claude_dms3", + "queries_file": "D:\\Claude_dms3\\codex-lens\\benchmarks\\accuracy_queries_ccw_smart_search.jsonl", + "query_count": 1, + "k": 10, + "coarse_k": 100, + "local_only": true, + "strategies": { + "auto": { + "query_count": 1, + "hit_at_k": 1.0, + "mrr_at_k": 
1.0, + "avg_recall_at_k": 1.0, + "avg_latency_ms": 1377.3565999865532, + "p50_latency_ms": 1377.3565999865532, + "p95_latency_ms": 1377.3565999865532, + "avg_generated_artifact_count": 0.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 0, + "runs_with_test_files": 0, + "effective_methods": { + "fts": 1 + }, + "errors": 0, + "strategy": "auto", + "stage2_mode": null + }, + "fts": { + "query_count": 1, + "hit_at_k": 1.0, + "mrr_at_k": 1.0, + "avg_recall_at_k": 1.0, + "avg_latency_ms": 1460.0819000601768, + "p50_latency_ms": 1460.0819000601768, + "p95_latency_ms": 1460.0819000601768, + "avg_generated_artifact_count": 0.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 0, + "runs_with_test_files": 0, + "effective_methods": { + "fts": 1 + }, + "errors": 0, + "strategy": "fts", + "stage2_mode": null + }, + "hybrid": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 45991.74140000343, + "p50_latency_ms": 45991.74140000343, + "p95_latency_ms": 45991.74140000343, + "avg_generated_artifact_count": 0.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 0, + "runs_with_test_files": 0, + "effective_methods": { + "hybrid": 1 + }, + "errors": 0, + "strategy": "hybrid", + "stage2_mode": null + }, + "dense_rerank": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 22739.62610000372, + "p50_latency_ms": 22739.62610000372, + "p95_latency_ms": 22739.62610000372, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 2.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 1, + "effective_methods": { + "dense_rerank": 1 + }, + "errors": 0, + "strategy": "dense_rerank", + "stage2_mode": null + }, + "staged:precomputed": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 14900.017599999905, + "p50_latency_ms": 14900.017599999905, + "p95_latency_ms": 
14900.017599999905, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "precomputed" + }, + "staged:realtime": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 14104.314599990845, + "p50_latency_ms": 14104.314599990845, + "p95_latency_ms": 14104.314599990845, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "staged:static_global_graph": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 11906.852500021458, + "p50_latency_ms": 11906.852500021458, + "p95_latency_ms": 11906.852500021458, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "stage2_mode_matrix": { + "precomputed": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 14900.017599999905, + "p50_latency_ms": 14900.017599999905, + "p95_latency_ms": 14900.017599999905, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "precomputed" + }, + "realtime": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 14104.314599990845, + "p50_latency_ms": 14104.314599990845, + "p95_latency_ms": 14104.314599990845, + 
"avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "realtime" + }, + "static_global_graph": { + "query_count": 1, + "hit_at_k": 0.0, + "mrr_at_k": 0.0, + "avg_recall_at_k": 0.0, + "avg_latency_ms": 11906.852500021458, + "p50_latency_ms": 11906.852500021458, + "p95_latency_ms": 11906.852500021458, + "avg_generated_artifact_count": 1.0, + "avg_test_file_count": 0.0, + "runs_with_generated_artifacts": 1, + "runs_with_test_files": 0, + "effective_methods": { + "staged": 1 + }, + "errors": 0, + "strategy": "staged", + "stage2_mode": "static_global_graph" + } + }, + "pairwise_stage2_deltas": [ + { + "mode_a": "precomputed", + "mode_b": "realtime", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 795.7030000090599 + }, + { + "mode_a": "precomputed", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 2993.165099978447 + }, + { + "mode_a": "realtime", + "mode_b": "static_global_graph", + "hit_at_k_delta": 0.0, + "mrr_at_k_delta": 0.0, + "avg_recall_at_k_delta": 0.0, + "avg_latency_ms_delta": 2197.462099969387 + } + ], + "config": { + "embedding_backend": "fastembed", + "embedding_model": "code", + "embedding_use_gpu": false, + "reranker_backend": "onnx", + "reranker_model": "cross-encoder/ms-marco-MiniLM-L-6-v2", + "reranker_use_gpu": false, + "enable_staged_rerank": true, + "enable_cross_encoder_rerank": true + }, + "evaluations": [ + { + "query": "executeHybridMode dense_rerank semantic smart_search", + "intent": "ccw-semantic-routing", + "notes": "CCW semantic mode delegates to CodexLens dense_rerank.", + "relevant_paths": [ + "D:\\Claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "runs": { + "auto": { + "strategy_key": "auto", + 
"strategy": "auto", + "stage2_mode": null, + "effective_method": "fts", + "execution_method": "fts", + "latency_ms": 1377.3565999865532, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "first_hit_rank": 1, + "hit_at_k": true, + "recall_at_k": 1.0, + "generated_artifact_count": 0, + "test_file_count": 0, + "error": null + }, + "fts": { + "strategy_key": "fts", + "strategy": "fts", + "stage2_mode": null, + "effective_method": "fts", + "execution_method": "fts", + "latency_ms": 1460.0819000601768, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\smart-search.ts" + ], + "first_hit_rank": 1, + "hit_at_k": true, + "recall_at_k": 1.0, + "generated_artifact_count": 0, + "test_file_count": 0, + "error": null + }, + "hybrid": { + "strategy_key": "hybrid", + "strategy": "hybrid", + "stage2_mode": null, + "effective_method": "hybrid", + "execution_method": "hybrid", + "latency_ms": 45991.74140000343, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\config\\litellm-api-config-manager.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\semantic\\reranker\\api_reranker.py", + "d:\\claude_dms3\\ccw\\src\\commands\\core-memory.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\cli\\commands.py", + "d:\\claude_dms3\\codex-lens\\scripts\\generate_embeddings.py", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\notification-routes.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\team-msg.ts", + "d:\\claude_dms3\\ccw\\src\\types\\remote-notification.ts", + "d:\\claude_dms3\\ccw\\src\\core\\memory-store.ts", + "d:\\claude_dms3\\codex-lens\\src\\codexlens\\semantic\\vector_store.py" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "generated_artifact_count": 0, + "test_file_count": 0, + "error": null + }, + "dense_rerank": { + "strategy_key": "dense_rerank", + "strategy": "dense_rerank", + "stage2_mode": null, + "effective_method": "dense_rerank", + "execution_method": "cascade", + "latency_ms": 22739.62610000372, + "topk_paths": [ + 
"d:\\claude_dms3\\ccw\\src\\core\\routes\\issue-routes.ts", + "d:\\claude_dms3\\ccw\\src\\tools\\session-manager.ts", + "d:\\claude_dms3\\ccw\\src\\types\\queue-types.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\nativesessionpanel.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts", + "d:\\claude_dms3\\ccw\\src\\core\\memory-extraction-pipeline.ts", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\skills-page.spec.ts", + "d:\\claude_dms3\\ccw\\dist\\tools\\discover-design-files.js", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\api-settings\\clisettingsmodal.tsx", + "d:\\claude_dms3\\ccw\\frontend\\tests\\e2e\\api-settings.spec.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "generated_artifact_count": 1, + "test_file_count": 2, + "error": null + }, + "staged:precomputed": { + "strategy_key": "staged:precomputed", + "strategy": "staged", + "stage2_mode": "precomputed", + "effective_method": "staged", + "execution_method": "cascade", + "latency_ms": 14900.017599999905, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "generated_artifact_count": 1, + "test_file_count": 0, + "error": null + }, + "staged:realtime": { + "strategy_key": "staged:realtime", + "strategy": "staged", + "stage2_mode": "realtime", + "effective_method": "staged", + 
"execution_method": "cascade", + "latency_ms": 14104.314599990845, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "generated_artifact_count": 1, + "test_file_count": 0, + "error": null + }, + "staged:static_global_graph": { + "strategy_key": "staged:static_global_graph", + "strategy": "staged", + "stage2_mode": "static_global_graph", + "effective_method": "staged", + "execution_method": "cascade", + "latency_ms": 11906.852500021458, + "topk_paths": [ + "d:\\claude_dms3\\ccw\\src\\tools\\native-session-discovery.ts", + "d:\\claude_dms3\\ccw\\src\\commands\\memory.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\hooks\\useissues.test.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\routes\\cli-sessions-routes.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\lib\\api.ts", + "d:\\claude_dms3\\ccw\\frontend\\src\\components\\shared\\filepreview.tsx", + "d:\\claude_dms3\\ccw\\src\\core\\hooks\\hook-templates.ts", + "d:\\claude_dms3\\ccw\\src\\utils\\file-reader.ts", + "d:\\claude_dms3\\ccw\\dist\\core\\routes\\cli-sessions-routes.js", + "d:\\claude_dms3\\ccw\\src\\core\\history-importer.ts" + ], + "first_hit_rank": null, + "hit_at_k": false, + "recall_at_k": 0.0, + "generated_artifact_count": 1, + "test_file_count": 0, + "error": null + } + } + } + ] +} \ No newline at end of file diff --git a/codex-lens/pyproject.toml 
b/codex-lens/pyproject.toml index 4e81ddf4..71dd763a 100644 --- a/codex-lens/pyproject.toml +++ b/codex-lens/pyproject.toml @@ -57,9 +57,9 @@ semantic-directml = [ # Cross-encoder reranking (second-stage, optional) # Install with: pip install codexlens[reranker] (default: ONNX backend) reranker-onnx = [ - "optimum~=1.16.0", - "onnxruntime~=1.15.0", - "transformers~=4.36.0", + "optimum[onnxruntime]~=2.1.0", + "onnxruntime~=1.23.0", + "transformers~=4.53.0", ] # Remote reranking via HTTP API @@ -79,9 +79,9 @@ reranker-legacy = [ # Backward-compatible alias for default reranker backend reranker = [ - "optimum~=1.16.0", - "onnxruntime~=1.15.0", - "transformers~=4.36.0", + "optimum[onnxruntime]~=2.1.0", + "onnxruntime~=1.23.0", + "transformers~=4.53.0", ] # Encoding detection for non-UTF8 files @@ -116,3 +116,12 @@ package-dir = { "" = "src" } [tool.setuptools.package-data] "codexlens.lsp" = ["lsp-servers.json"] + +[tool.pytest.ini_options] +markers = [ + "integration: marks tests that exercise broader end-to-end or dependency-heavy flows", +] +filterwarnings = [ + "ignore:'BaseCommand' is deprecated and will be removed in Click 9.0.*:DeprecationWarning", + "ignore:The '__version__' attribute is deprecated and will be removed in Click 9.1.*:DeprecationWarning", +] diff --git a/codex-lens/scripts/bootstrap_reranker_local.py b/codex-lens/scripts/bootstrap_reranker_local.py new file mode 100644 index 00000000..7cc1d15e --- /dev/null +++ b/codex-lens/scripts/bootstrap_reranker_local.py @@ -0,0 +1,340 @@ +#!/usr/bin/env python3 +"""Bootstrap a local-only ONNX reranker environment for CodexLens. + +This script defaults to dry-run output so it can be used as a reproducible +bootstrap manifest. When `--apply` is passed, it installs pinned reranker +packages into the selected virtual environment and can optionally pre-download +the ONNX reranker model into a repo-local Hugging Face cache. 
+ +Examples: + python scripts/bootstrap_reranker_local.py --dry-run + python scripts/bootstrap_reranker_local.py --apply --download-model + python scripts/bootstrap_reranker_local.py --venv .venv --model Xenova/ms-marco-MiniLM-L-12-v2 +""" + +from __future__ import annotations + +import argparse +import os +import shlex +import subprocess +import sys +from dataclasses import dataclass +from pathlib import Path +from typing import Iterable + + +PROJECT_ROOT = Path(__file__).resolve().parents[1] +MANIFEST_PATH = Path(__file__).with_name("requirements-reranker-local.txt") +DEFAULT_MODEL = "Xenova/ms-marco-MiniLM-L-6-v2" +DEFAULT_HF_HOME = PROJECT_ROOT / ".cache" / "huggingface" + +STEP_NOTES = { + "runtime": "Install the local ONNX runtime first so optimum/transformers do not backtrack over runtime wheels.", + "hf-stack": "Pin the Hugging Face stack used by the ONNX reranker backend.", +} + + +@dataclass(frozen=True) +class RequirementStep: + name: str + packages: tuple[str, ...] + + +def _normalize_venv_path(raw_path: str | Path) -> Path: + return (Path(raw_path) if raw_path else PROJECT_ROOT / ".venv").expanduser().resolve() + + +def _venv_python(venv_path: Path) -> Path: + if os.name == "nt": + return venv_path / "Scripts" / "python.exe" + return venv_path / "bin" / "python" + + +def _venv_huggingface_cli(venv_path: Path) -> Path: + if os.name == "nt": + preferred = venv_path / "Scripts" / "hf.exe" + return preferred if preferred.exists() else venv_path / "Scripts" / "huggingface-cli.exe" + preferred = venv_path / "bin" / "hf" + return preferred if preferred.exists() else venv_path / "bin" / "huggingface-cli" + + +def _default_shell() -> str: + return "powershell" if os.name == "nt" else "bash" + + +def _shell_quote(value: str, shell: str) -> str: + if shell == "bash": + return shlex.quote(value) + return "'" + value.replace("'", "''") + "'" + + +def _format_command(parts: Iterable[str], shell: str) -> str: + return " ".join(_shell_quote(str(part), shell) for part 
in parts) + + +def _format_set_env(name: str, value: str, shell: str) -> str: + quoted_value = _shell_quote(value, shell) + if shell == "bash": + return f"export {name}={quoted_value}" + return f"$env:{name} = {quoted_value}" + + +def _model_local_dir(hf_home: Path, model_name: str) -> Path: + slug = model_name.replace("/", "--") + return hf_home / "models" / slug + + +def _parse_manifest(manifest_path: Path) -> list[RequirementStep]: + current_name: str | None = None + current_packages: list[str] = [] + steps: list[RequirementStep] = [] + + for raw_line in manifest_path.read_text(encoding="utf-8").splitlines(): + line = raw_line.strip() + if not line: + continue + + if line.startswith("# [") and line.endswith("]"): + if current_name and current_packages: + steps.append(RequirementStep(current_name, tuple(current_packages))) + current_name = line[3:-1] + current_packages = [] + continue + + if line.startswith("#"): + continue + + if current_name is None: + raise ValueError(f"Package entry found before a section header in {manifest_path}") + current_packages.append(line) + + if current_name and current_packages: + steps.append(RequirementStep(current_name, tuple(current_packages))) + + if not steps: + raise ValueError(f"No requirement steps found in {manifest_path}") + return steps + + +def _pip_install_command(python_path: Path, packages: Iterable[str]) -> list[str]: + return [ + str(python_path), + "-m", + "pip", + "install", + "--upgrade", + "--disable-pip-version-check", + "--upgrade-strategy", + "only-if-needed", + "--only-binary=:all:", + *packages, + ] + + +def _probe_command(python_path: Path) -> list[str]: + return [ + str(python_path), + "-c", + ( + "from codexlens.semantic.reranker.factory import check_reranker_available; " + "print(check_reranker_available('onnx'))" + ), + ] + + +def _download_command(huggingface_cli: Path, model_name: str, model_dir: Path) -> list[str]: + return [ + str(huggingface_cli), + "download", + model_name, + "--local-dir", + 
str(model_dir), + ] + + +def _print_plan( + shell: str, + venv_path: Path, + python_path: Path, + huggingface_cli: Path, + manifest_path: Path, + steps: list[RequirementStep], + model_name: str, + hf_home: Path, +) -> None: + model_dir = _model_local_dir(hf_home, model_name) + + print("CodexLens local reranker bootstrap") + print(f"manifest: {manifest_path}") + print(f"target_venv: {venv_path}") + print(f"target_python: {python_path}") + print(f"backend: onnx") + print(f"model: {model_name}") + print(f"hf_home: {hf_home}") + print("mode: dry-run") + print("notes:") + print("- Uses only the selected venv Python; no global pip commands are emitted.") + print("- Targets the local ONNX reranker backend only; no API or LiteLLM providers are involved.") + print("") + print("pinned_steps:") + for step in steps: + print(f"- {step.name}: {', '.join(step.packages)}") + note = STEP_NOTES.get(step.name) + if note: + print(f" note: {note}") + print("") + print("commands:") + print( + "1. " + + _format_command( + [ + str(python_path), + "-m", + "pip", + "install", + "--upgrade", + "pip", + "setuptools", + "wheel", + ], + shell, + ) + ) + command_index = 2 + for step in steps: + print(f"{command_index}. " + _format_command(_pip_install_command(python_path, step.packages), shell)) + command_index += 1 + print(f"{command_index}. " + _format_set_env("HF_HOME", str(hf_home), shell)) + command_index += 1 + print(f"{command_index}. " + _format_command(_download_command(huggingface_cli, model_name, model_dir), shell)) + command_index += 1 + print(f"{command_index}. 
" + _format_command(_probe_command(python_path), shell)) + print("") + print("optional_runtime_env:") + print(_format_set_env("RERANKER_BACKEND", "onnx", shell)) + print(_format_set_env("RERANKER_MODEL", str(model_dir), shell)) + print(_format_set_env("HF_HOME", str(hf_home), shell)) + + +def _run_command(command: list[str], *, env: dict[str, str] | None = None) -> None: + command_env = os.environ.copy() + if env: + command_env.update(env) + command_env.setdefault("PYTHONUTF8", "1") + command_env.setdefault("PYTHONIOENCODING", "utf-8") + subprocess.run(command, check=True, env=command_env) + + +def main() -> int: + parser = argparse.ArgumentParser( + description="Bootstrap pinned local-only ONNX reranker dependencies for a CodexLens virtual environment.", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=__doc__, + ) + parser.add_argument( + "--venv", + type=Path, + default=PROJECT_ROOT / ".venv", + help="Path to the CodexLens virtual environment (default: ./.venv under codex-lens).", + ) + parser.add_argument( + "--model", + default=DEFAULT_MODEL, + help=f"Model repo to pre-download for local reranking (default: {DEFAULT_MODEL}).", + ) + parser.add_argument( + "--hf-home", + type=Path, + default=DEFAULT_HF_HOME, + help="Repo-local Hugging Face cache directory used for optional model downloads.", + ) + parser.add_argument( + "--shell", + choices=("powershell", "bash"), + default=_default_shell(), + help="Shell syntax to use when rendering dry-run commands.", + ) + parser.add_argument( + "--apply", + action="store_true", + help="Execute the pinned install steps against the selected virtual environment.", + ) + parser.add_argument( + "--download-model", + action="store_true", + help="When used with --apply, pre-download the model into the configured HF_HOME directory.", + ) + parser.add_argument( + "--probe", + action="store_true", + help="When used with --apply, run a small reranker availability probe at the end.", + ) + parser.add_argument( + 
"--dry-run", + action="store_true", + help="Print the deterministic bootstrap plan. This is also the default when --apply is omitted.", + ) + + args = parser.parse_args() + + steps = _parse_manifest(MANIFEST_PATH) + venv_path = _normalize_venv_path(args.venv) + python_path = _venv_python(venv_path) + huggingface_cli = _venv_huggingface_cli(venv_path) + hf_home = args.hf_home.expanduser().resolve() + + if not args.apply: + _print_plan( + shell=args.shell, + venv_path=venv_path, + python_path=python_path, + huggingface_cli=huggingface_cli, + manifest_path=MANIFEST_PATH, + steps=steps, + model_name=args.model, + hf_home=hf_home, + ) + return 0 + + if not python_path.exists(): + print(f"Target venv Python not found: {python_path}", file=sys.stderr) + return 1 + + _run_command( + [ + str(python_path), + "-m", + "pip", + "install", + "--upgrade", + "pip", + "setuptools", + "wheel", + ] + ) + for step in steps: + _run_command(_pip_install_command(python_path, step.packages)) + + if args.download_model: + if not huggingface_cli.exists(): + print(f"Expected venv-local Hugging Face CLI not found: {huggingface_cli}", file=sys.stderr) + return 1 + download_env = os.environ.copy() + download_env["HF_HOME"] = str(hf_home) + hf_home.mkdir(parents=True, exist_ok=True) + _run_command(_download_command(huggingface_cli, args.model, _model_local_dir(hf_home, args.model)), env=download_env) + + if args.probe: + local_model_dir = _model_local_dir(hf_home, args.model) + probe_env = os.environ.copy() + probe_env["HF_HOME"] = str(hf_home) + probe_env.setdefault("RERANKER_BACKEND", "onnx") + probe_env.setdefault("RERANKER_MODEL", str(local_model_dir if local_model_dir.exists() else args.model)) + _run_command(_probe_command(python_path), env=probe_env) + + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/codex-lens/scripts/requirements-reranker-local.txt b/codex-lens/scripts/requirements-reranker-local.txt new file mode 100644 index 00000000..789e742b --- 
/dev/null +++ b/codex-lens/scripts/requirements-reranker-local.txt @@ -0,0 +1,13 @@ +# Ordered local ONNX reranker pins for CodexLens. +# Validated against the repo-local Python 3.13 virtualenv on Windows. +# bootstrap_reranker_local.py installs each section in file order to keep +# pip resolver work bounded and repeatable. + +# [runtime] +numpy==2.4.0 +onnxruntime==1.23.2 + +# [hf-stack] +huggingface-hub==0.36.2 +transformers==4.53.3 +optimum[onnxruntime]==2.1.0 diff --git a/codex-lens/src/codexlens/cli/commands.py b/codex-lens/src/codexlens/cli/commands.py index f6a7a3ae..2f49e706 100644 --- a/codex-lens/src/codexlens/cli/commands.py +++ b/codex-lens/src/codexlens/cli/commands.py @@ -2,10 +2,13 @@ from __future__ import annotations +import inspect import json import logging import os +import re import shutil +import subprocess from pathlib import Path from typing import Annotated, Any, Dict, Iterable, List, Optional @@ -22,6 +25,13 @@ from codexlens.storage.registry import RegistryStore, ProjectInfo from codexlens.storage.index_tree import IndexTreeBuilder from codexlens.storage.dir_index import DirIndexStore from codexlens.search.chain_search import ChainSearchEngine, SearchOptions +from codexlens.search.ranking import ( + QueryIntent, + apply_path_penalties, + detect_query_intent, + query_prefers_lexical_search, + query_targets_generated_files, +) from codexlens.watcher import WatcherManager, WatcherConfig from .output import ( @@ -34,6 +44,56 @@ from .output import ( ) app = typer.Typer(help="CodexLens CLI — local code indexing and search.") +# Index subcommand group for reorganized commands +def _patch_typer_click_help_compat() -> None: + """Patch Typer help rendering for Click versions that pass ctx to make_metavar().""" + import click.core + from typer.core import TyperArgument + + try: + params = inspect.signature(TyperArgument.make_metavar).parameters + except (TypeError, ValueError): + return + + if len(params) != 1: + return + + def 
_compat_make_metavar(self, ctx=None): # type: ignore[override] + if self.metavar is not None: + return self.metavar + + var = (self.name or "").upper() + if not self.required: + var = f"[{var}]" + + try: + type_var = self.type.get_metavar(param=self, ctx=ctx) + except TypeError: + try: + type_var = self.type.get_metavar(self, ctx) + except TypeError: + type_var = self.type.get_metavar(self) + + if type_var: + var += f":{type_var}" + if self.nargs != 1: + var += "..." + return var + + TyperArgument.make_metavar = _compat_make_metavar + + param_params = inspect.signature(click.core.Parameter.make_metavar).parameters + if len(param_params) == 2: + original_param_make_metavar = click.core.Parameter.make_metavar + + def _compat_param_make_metavar(self, ctx=None): # type: ignore[override] + return original_param_make_metavar(self, ctx) + + click.core.Parameter.make_metavar = _compat_param_make_metavar + + +_patch_typer_click_help_compat() + # Index subcommand group for reorganized commands index_app = typer.Typer(help="Index management commands (init, embeddings, binary, status, migrate, all)") @@ -119,6 +179,281 @@ def _extract_embedding_error(embed_result: Dict[str, Any]) -> str: return "Embedding generation failed (no error details provided)" +def _auto_select_search_method(query: str) -> str: + """Choose a default search method from query intent.""" + if query_targets_generated_files(query) or query_prefers_lexical_search(query): + return "fts" + + intent = detect_query_intent(query) + if intent == QueryIntent.KEYWORD: + return "fts" + if intent == QueryIntent.SEMANTIC: + return "dense_rerank" + return "hybrid" + + +_CLI_NON_CODE_EXTENSIONS = { + "md", "txt", "json", "yaml", "yml", "xml", "csv", "log", + "ini", "cfg", "conf", "toml", "env", "properties", + "html", "htm", "svg", "png", "jpg", "jpeg", "gif", "ico", "webp", + "pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", + "lock", "sum", "mod", +} +_FALLBACK_ARTIFACT_DIRS = { + "dist", + "build", + "out", + 
"coverage", + "htmlcov", + ".cache", + ".workflow", + ".next", + ".nuxt", + ".parcel-cache", + ".turbo", + "tmp", + "temp", + "generated", +} +_FALLBACK_SOURCE_DIRS = { + "src", + "lib", + "core", + "app", + "server", + "client", + "services", +} + + +def _normalize_extension_filters(exclude_extensions: Optional[Iterable[str]]) -> set[str]: + """Normalize extension filters to lowercase values without leading dots.""" + normalized: set[str] = set() + for ext in exclude_extensions or []: + cleaned = (ext or "").strip().lower().lstrip(".") + if cleaned: + normalized.add(cleaned) + return normalized + + +def _score_filesystem_fallback_match( + query: str, + path_text: str, + line_text: str, + *, + base_score: float, +) -> float: + """Score filesystem fallback hits with light source-aware heuristics.""" + score = max(0.0, float(base_score)) + if score <= 0: + return 0.0 + + query_intent = detect_query_intent(query) + if query_intent != QueryIntent.KEYWORD: + return score + + path_parts = { + part.casefold() + for part in str(path_text).replace("\\", "/").split("/") + if part and part != "." 
+ } + if _FALLBACK_SOURCE_DIRS.intersection(path_parts): + score *= 1.15 + + symbol = (query or "").strip() + if " " in symbol or not symbol: + return score + + escaped_symbol = re.escape(symbol) + definition_patterns = ( + rf"^\s*(?:export\s+)?(?:async\s+)?def\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:async\s+)?function\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?class\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?interface\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?type\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:const|let|var)\s+{escaped_symbol}\b", + ) + if any(re.search(pattern, line_text) for pattern in definition_patterns): + score *= 1.8 + + return score + + +def _filesystem_fallback_search( + query: str, + search_path: Path, + *, + limit: int, + config: Config, + code_only: bool = False, + exclude_extensions: Optional[Iterable[str]] = None, +) -> Optional[dict[str, Any]]: + """Fallback to ripgrep when indexed keyword search returns no results.""" + rg_path = shutil.which("rg") + if not rg_path or not query.strip(): + return None + + import time + + allow_generated = query_targets_generated_files(query) + ignored_dirs = {name for name in IndexTreeBuilder.IGNORE_DIRS if name} + ignored_dirs.add(".workflow") + if allow_generated: + ignored_dirs.difference_update(_FALLBACK_ARTIFACT_DIRS) + + excluded_exts = _normalize_extension_filters(exclude_extensions) + if code_only: + excluded_exts.update(_CLI_NON_CODE_EXTENSIONS) + + args = [ + rg_path, + "--json", + "--line-number", + "--fixed-strings", + "--smart-case", + "--max-count", + "1", + ] + if allow_generated: + args.append("--hidden") + + for dirname in sorted(ignored_dirs): + args.extend(["--glob", f"!**/{dirname}/**"]) + + args.extend([query, str(search_path)]) + + start_time = time.perf_counter() + proc = subprocess.run( + args, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + encoding="utf-8", + errors="replace", + check=False, + ) + + if proc.returncode not in (0, 1): + 
return None + + matches: List[SearchResult] = [] + seen_paths: set[str] = set() + for raw_line in proc.stdout.splitlines(): + if len(matches) >= limit: + break + try: + event = json.loads(raw_line) + except json.JSONDecodeError: + continue + if event.get("type") != "match": + continue + + data = event.get("data") or {} + path_text = ((data.get("path") or {}).get("text") or "").strip() + if not path_text or path_text in seen_paths: + continue + + path_obj = Path(path_text) + extension = path_obj.suffix.lower().lstrip(".") + if extension and extension in excluded_exts: + continue + if code_only and config.language_for_path(path_obj) is None: + continue + + line_text = ((data.get("lines") or {}).get("text") or "").rstrip("\r\n") + line_number = data.get("line_number") + seen_paths.add(path_text) + base_score = float(limit - len(matches)) + matches.append( + SearchResult( + path=path_text, + score=_score_filesystem_fallback_match( + query, + path_text, + line_text, + base_score=base_score, + ), + excerpt=line_text.strip() or line_text or path_text, + content=None, + metadata={ + "filesystem_fallback": True, + "backend": "ripgrep-fallback", + "stale_index_suspected": True, + }, + start_line=line_number, + end_line=line_number, + ) + ) + + if not matches: + return None + + matches = apply_path_penalties( + matches, + query, + test_file_penalty=config.test_file_penalty, + generated_file_penalty=config.generated_file_penalty, + ) + return { + "results": matches, + "time_ms": (time.perf_counter() - start_time) * 1000.0, + "fallback": { + "backend": "ripgrep-fallback", + "stale_index_suspected": True, + "reason": "Indexed FTS search returned no results; filesystem fallback used.", + }, + } + + +def _remove_tree_best_effort(target: Path) -> dict[str, Any]: + """Remove a directory tree without aborting on locked files.""" + target = target.resolve() + if not target.exists(): + return { + "removed": True, + "partial": False, + "locked_paths": [], + "errors": [], + 
"remaining_path": None, + } + + locked_paths: List[str] = [] + errors: List[str] = [] + entries = sorted(target.rglob("*"), key=lambda path: len(path.parts), reverse=True) + + for entry in entries: + try: + if entry.is_dir() and not entry.is_symlink(): + entry.rmdir() + else: + entry.unlink() + except FileNotFoundError: + continue + except PermissionError: + locked_paths.append(str(entry)) + except OSError as exc: + if entry.is_dir(): + continue + errors.append(f"{entry}: {exc}") + + try: + target.rmdir() + except FileNotFoundError: + pass + except PermissionError: + locked_paths.append(str(target)) + except OSError: + pass + + return { + "removed": not target.exists(), + "partial": target.exists(), + "locked_paths": sorted(set(locked_paths)), + "errors": errors, + "remaining_path": str(target) if target.exists() else None, + } + + def _get_index_root() -> Path: """Get the index root directory from config or default. @@ -542,7 +877,7 @@ def search( offset: int = typer.Option(0, "--offset", min=0, help="Pagination offset - skip first N results."), depth: int = typer.Option(-1, "--depth", "-d", help="Search depth (-1 = unlimited, 0 = current only)."), files_only: bool = typer.Option(False, "--files-only", "-f", help="Return only file paths without content snippets."), - method: str = typer.Option("dense_rerank", "--method", "-m", help="Search method: 'dense_rerank' (semantic, default), 'fts' (exact keyword)."), + method: str = typer.Option("auto", "--method", "-m", help="Search method: 'auto' (intent-aware, default), 'dense_rerank' (semantic), 'fts' (exact keyword)."), use_fuzzy: bool = typer.Option(False, "--use-fuzzy", help="Enable fuzzy matching in FTS method."), code_only: bool = typer.Option(False, "--code-only", help="Only return code files (excludes md, txt, json, yaml, xml, etc.)."), exclude_extensions: Optional[str] = typer.Option(None, "--exclude-extensions", help="Comma-separated list of file extensions to exclude (e.g., 'md,txt,json')."), @@ -576,14 
+911,16 @@ def search( Use --depth to limit search recursion (0 = current dir only). Search Methods: - - dense_rerank (default): Semantic search using Dense embedding coarse retrieval + + - auto (default): Intent-aware routing. KEYWORD -> fts, MIXED -> hybrid, + SEMANTIC -> dense_rerank. + - dense_rerank: Semantic search using Dense embedding coarse retrieval + Cross-encoder reranking. Best for natural language queries and code understanding. - fts: Full-text search using FTS5 (unicode61 tokenizer). Best for exact code identifiers like function/class names. Use --use-fuzzy for typo tolerance. Method Selection Guide: - - Code identifiers (function/class names): fts - - Natural language queries: dense_rerank (default) + - Code identifiers (function/class names): auto or fts + - Natural language queries: auto or dense_rerank - Typo-tolerant search: fts --use-fuzzy Requirements: @@ -591,7 +928,7 @@ def search( Use 'codexlens embeddings-generate' to create embeddings first. Examples: - # Default semantic search (dense_rerank) + # Default intent-aware search codexlens search "authentication logic" # Exact code identifier search @@ -612,7 +949,7 @@ def search( # Map old mode values to new method values mode_to_method = { - "auto": "hybrid", + "auto": "auto", "exact": "fts", "fuzzy": "fts", # with use_fuzzy=True "hybrid": "hybrid", @@ -638,19 +975,27 @@ def search( # Validate method - simplified interface exposes only dense_rerank and fts # Other methods (vector, hybrid, cascade) are hidden but still work for backward compatibility - valid_methods = ["fts", "dense_rerank", "vector", "hybrid", "cascade"] + valid_methods = ["auto", "fts", "dense_rerank", "vector", "hybrid", "cascade"] if actual_method not in valid_methods: if json_mode: - print_json(success=False, error=f"Invalid method: {actual_method}. Use 'dense_rerank' (semantic) or 'fts' (exact keyword).") + print_json(success=False, error=f"Invalid method: {actual_method}. 
Use 'auto', 'dense_rerank', or 'fts'.") else: console.print(f"[red]Invalid method:[/red] {actual_method}") - console.print("[dim]Use 'dense_rerank' (semantic, default) or 'fts' (exact keyword)[/dim]") + console.print("[dim]Use 'auto' (default), 'dense_rerank' (semantic), or 'fts' (exact keyword)[/dim]") raise typer.Exit(code=1) + resolved_method = ( + _auto_select_search_method(query) + if actual_method == "auto" + else actual_method + ) + display_method = resolved_method + execution_method = resolved_method + # Map dense_rerank to cascade method internally internal_cascade_strategy = cascade_strategy - if actual_method == "dense_rerank": - actual_method = "cascade" + if execution_method == "dense_rerank": + execution_method = "cascade" internal_cascade_strategy = "dense_rerank" # Validate cascade_strategy if provided (for advanced users) @@ -733,32 +1078,32 @@ def search( # vector: Pure vector semantic search # hybrid: RRF fusion of sparse + dense # cascade: Two-stage binary + dense retrieval - if actual_method == "fts": + if execution_method == "fts": hybrid_mode = False enable_fuzzy = use_fuzzy enable_vector = False pure_vector = False enable_cascade = False - elif actual_method == "vector": + elif execution_method == "vector": hybrid_mode = True enable_fuzzy = False enable_vector = True pure_vector = True enable_cascade = False - elif actual_method == "hybrid": + elif execution_method == "hybrid": hybrid_mode = True enable_fuzzy = use_fuzzy enable_vector = True pure_vector = False enable_cascade = False - elif actual_method == "cascade": + elif execution_method == "cascade": hybrid_mode = True enable_fuzzy = False enable_vector = True pure_vector = False enable_cascade = True else: - raise ValueError(f"Invalid method: {actual_method}") + raise ValueError(f"Invalid method: {execution_method}") # Parse exclude_extensions from comma-separated string exclude_exts_list = None @@ -790,10 +1135,28 @@ def search( console.print(fp) else: # Dispatch to cascade_search for 
cascade method - if actual_method == "cascade": + if execution_method == "cascade": result = engine.cascade_search(query, search_path, k=limit, options=options, strategy=internal_cascade_strategy) else: result = engine.search(query, search_path, options) + effective_results = result.results + effective_files_matched = result.stats.files_matched + effective_time_ms = result.stats.time_ms + fallback_payload = None + if display_method == "fts" and not use_fuzzy and not effective_results: + fallback_payload = _filesystem_fallback_search( + query, + search_path, + limit=limit, + config=config, + code_only=code_only, + exclude_extensions=exclude_exts_list, + ) + if fallback_payload is not None: + effective_results = fallback_payload["results"] + effective_files_matched = len(effective_results) + effective_time_ms = result.stats.time_ms + float(fallback_payload["time_ms"]) + results_list = [ { "path": r.path, @@ -803,25 +1166,29 @@ def search( "source": getattr(r, "search_source", None), "symbol": getattr(r, "symbol", None), } - for r in result.results + for r in effective_results ] payload = { "query": query, - "method": actual_method, + "method": display_method, "count": len(results_list), "results": results_list, "stats": { "dirs_searched": result.stats.dirs_searched, - "files_matched": result.stats.files_matched, - "time_ms": result.stats.time_ms, + "files_matched": effective_files_matched, + "time_ms": effective_time_ms, }, } + if fallback_payload is not None: + payload["fallback"] = fallback_payload["fallback"] if json_mode: print_json(success=True, result=payload) else: - render_search_results(result.results, verbose=verbose) - console.print(f"[dim]Method: {actual_method} | Searched {result.stats.dirs_searched} directories in {result.stats.time_ms:.1f}ms[/dim]") + render_search_results(effective_results, verbose=verbose) + if fallback_payload is not None: + console.print("[yellow]No indexed matches found; showing filesystem fallback results (stale index 
suspected).[/yellow]") + console.print(f"[dim]Method: {display_method} | Searched {result.stats.dirs_searched} directories in {effective_time_ms:.1f}ms[/dim]") except SearchError as exc: if json_mode: @@ -1454,7 +1821,7 @@ def projects( mapper = PathMapper() index_root = mapper.source_to_index_dir(project_path) if index_root.exists(): - shutil.rmtree(index_root) + _remove_tree_best_effort(index_root) if json_mode: print_json(success=True, result={"removed": str(project_path)}) @@ -1966,17 +2333,30 @@ def clean( registry_path.unlink() # Remove all indexes - shutil.rmtree(index_root) + removal = _remove_tree_best_effort(index_root) result = { "cleaned": str(index_root), "size_freed_mb": round(total_size / (1024 * 1024), 2), + "partial": bool(removal["partial"]), + "locked_paths": removal["locked_paths"], + "remaining_path": removal["remaining_path"], + "errors": removal["errors"], } if json_mode: print_json(success=True, result=result) else: - console.print(f"[green]Removed all indexes:[/green] {result['size_freed_mb']} MB freed") + if result["partial"]: + console.print( + f"[yellow]Partially removed all indexes:[/yellow] {result['size_freed_mb']} MB freed" + ) + if result["locked_paths"]: + console.print( + f"[dim]Locked paths left behind: {len(result['locked_paths'])}[/dim]" + ) + else: + console.print(f"[green]Removed all indexes:[/green] {result['size_freed_mb']} MB freed") elif path: # Remove specific project @@ -2003,18 +2383,29 @@ def clean( registry.close() # Remove indexes - shutil.rmtree(project_index) + removal = _remove_tree_best_effort(project_index) result = { "cleaned": str(project_path), "index_path": str(project_index), "size_freed_mb": round(total_size / (1024 * 1024), 2), + "partial": bool(removal["partial"]), + "locked_paths": removal["locked_paths"], + "remaining_path": removal["remaining_path"], + "errors": removal["errors"], } if json_mode: print_json(success=True, result=result) else: - console.print(f"[green]Removed indexes for:[/green] 
{project_path}") + if result["partial"]: + console.print(f"[yellow]Partially removed indexes for:[/yellow] {project_path}") + if result["locked_paths"]: + console.print( + f"[dim]Locked paths left behind: {len(result['locked_paths'])}[/dim]" + ) + else: + console.print(f"[green]Removed indexes for:[/green] {project_path}") console.print(f" Freed: {result['size_freed_mb']} MB") else: @@ -2617,7 +3008,7 @@ def embeddings_status( codexlens embeddings-status ~/projects/my-app # Check project (auto-finds index) """ _deprecated_command_warning("embeddings-status", "index status") - from codexlens.cli.embedding_manager import check_index_embeddings, get_embedding_stats_summary + from codexlens.cli.embedding_manager import get_embedding_stats_summary, get_embeddings_status # Determine what to check if path is None: @@ -3715,7 +4106,7 @@ def index_status( """ _configure_logging(verbose, json_mode) - from codexlens.cli.embedding_manager import check_index_embeddings, get_embedding_stats_summary + from codexlens.cli.embedding_manager import get_embedding_stats_summary, get_embeddings_status # Determine target path and index root if path is None: @@ -3751,13 +4142,19 @@ def index_status( raise typer.Exit(code=1) # Get embeddings status - embeddings_result = get_embedding_stats_summary(index_root) + embeddings_result = get_embeddings_status(index_root) + embeddings_summary_result = get_embedding_stats_summary(index_root) # Build combined result result = { "index_root": str(index_root), - "embeddings": embeddings_result.get("result") if embeddings_result.get("success") else None, - "embeddings_error": embeddings_result.get("error") if not embeddings_result.get("success") else None, + # Keep "embeddings" backward-compatible as the subtree summary payload. 
+ "embeddings": embeddings_summary_result.get("result") if embeddings_summary_result.get("success") else None, + "embeddings_error": embeddings_summary_result.get("error") if not embeddings_summary_result.get("success") else None, + "embeddings_status": embeddings_result.get("result") if embeddings_result.get("success") else None, + "embeddings_status_error": embeddings_result.get("error") if not embeddings_result.get("success") else None, + "embeddings_summary": embeddings_summary_result.get("result") if embeddings_summary_result.get("success") else None, + "embeddings_summary_error": embeddings_summary_result.get("error") if not embeddings_summary_result.get("success") else None, } if json_mode: @@ -3770,13 +4167,39 @@ def index_status( console.print("[bold]Dense Embeddings (HNSW):[/bold]") if embeddings_result.get("success"): data = embeddings_result["result"] - total = data.get("total_indexes", 0) - with_emb = data.get("indexes_with_embeddings", 0) - total_chunks = data.get("total_chunks", 0) + root = data.get("root") or data + subtree = data.get("subtree") or {} + centralized = data.get("centralized") or {} - console.print(f" Total indexes: {total}") - console.print(f" Indexes with embeddings: [{'green' if with_emb > 0 else 'yellow'}]{with_emb}[/]/{total}") - console.print(f" Total chunks: {total_chunks:,}") + console.print(f" Root files: {root.get('total_files', 0)}") + console.print( + f" Root files with embeddings: " + f"[{'green' if root.get('has_embeddings') else 'yellow'}]{root.get('files_with_embeddings', 0)}[/]" + f"/{root.get('total_files', 0)}" + ) + console.print(f" Root coverage: {root.get('coverage_percent', 0):.1f}%") + console.print(f" Root chunks: {root.get('total_chunks', 0):,}") + console.print(f" Root storage mode: {root.get('storage_mode', 'none')}") + console.print( + f" Centralized dense: " + f"{'ready' if centralized.get('dense_ready') else ('present' if centralized.get('dense_index_exists') else 'missing')}" + ) + console.print( + f" 
Centralized binary: " + f"{'ready' if centralized.get('binary_ready') else ('present' if centralized.get('binary_index_exists') else 'missing')}" + ) + + subtree_total = subtree.get("total_indexes", 0) + subtree_with_embeddings = subtree.get("indexes_with_embeddings", 0) + subtree_chunks = subtree.get("total_chunks", 0) + if subtree_total: + console.print("\n[bold]Subtree Summary:[/bold]") + console.print(f" Total indexes: {subtree_total}") + console.print( + f" Indexes with embeddings: " + f"[{'green' if subtree_with_embeddings > 0 else 'yellow'}]{subtree_with_embeddings}[/]/{subtree_total}" + ) + console.print(f" Total chunks: {subtree_chunks:,}") else: console.print(f" [yellow]--[/yellow] {embeddings_result.get('error', 'Not available')}") diff --git a/codex-lens/src/codexlens/cli/embedding_manager.py b/codex-lens/src/codexlens/cli/embedding_manager.py index 1180252d..8bbb3a74 100644 --- a/codex-lens/src/codexlens/cli/embedding_manager.py +++ b/codex-lens/src/codexlens/cli/embedding_manager.py @@ -48,6 +48,8 @@ from itertools import islice from pathlib import Path from typing import Any, Dict, Generator, List, Optional, Tuple +from codexlens.storage.index_filters import filter_index_paths + try: from codexlens.semantic import SEMANTIC_AVAILABLE, is_embedding_backend_available except ImportError: @@ -61,9 +63,15 @@ except ImportError: # pragma: no cover VectorStore = None # type: ignore[assignment] try: - from codexlens.config import VECTORS_META_DB_NAME + from codexlens.config import ( + BINARY_VECTORS_MMAP_NAME, + VECTORS_HNSW_NAME, + VECTORS_META_DB_NAME, + ) except ImportError: + VECTORS_HNSW_NAME = "_vectors.hnsw" VECTORS_META_DB_NAME = "_vectors_meta.db" + BINARY_VECTORS_MMAP_NAME = "_binary_vectors.mmap" try: from codexlens.search.ranking import get_file_category @@ -410,6 +418,98 @@ def check_index_embeddings(index_path: Path) -> Dict[str, any]: } +def _sqlite_table_exists(conn: sqlite3.Connection, table_name: str) -> bool: + """Return whether a SQLite 
table exists.""" + cursor = conn.execute( + "SELECT name FROM sqlite_master WHERE type='table' AND name=?", + (table_name,), + ) + return cursor.fetchone() is not None + + +def _sqlite_count_rows(conn: sqlite3.Connection, table_name: str) -> int: + """Return row count for a table, or 0 when the table is absent.""" + if not _sqlite_table_exists(conn, table_name): + return 0 + cursor = conn.execute(f"SELECT COUNT(*) FROM {table_name}") + return int(cursor.fetchone()[0] or 0) + + +def _sqlite_count_distinct_rows(conn: sqlite3.Connection, table_name: str, column_name: str) -> int: + """Return distinct row count for a table column, or 0 when the table is absent.""" + if not _sqlite_table_exists(conn, table_name): + return 0 + cursor = conn.execute(f"SELECT COUNT(DISTINCT {column_name}) FROM {table_name}") + return int(cursor.fetchone()[0] or 0) + + +def _get_model_info_from_index(index_path: Path) -> Optional[Dict[str, Any]]: + """Read embedding model metadata from an index if available.""" + try: + with sqlite3.connect(index_path) as conn: + if not _sqlite_table_exists(conn, "embeddings_config"): + return None + from codexlens.semantic.vector_store import VectorStore + with VectorStore(index_path) as vs: + config = vs.get_model_config() + if not config: + return None + return { + "model_profile": config.get("model_profile"), + "model_name": config.get("model_name"), + "embedding_dim": config.get("embedding_dim"), + "backend": config.get("backend"), + "created_at": config.get("created_at"), + "updated_at": config.get("updated_at"), + } + except Exception: + return None + + +def _inspect_centralized_embeddings(index_root: Path) -> Dict[str, Any]: + """Inspect centralized vector artifacts stored directly at the current root.""" + dense_index_path = index_root / VECTORS_HNSW_NAME + meta_db_path = index_root / VECTORS_META_DB_NAME + binary_index_path = index_root / BINARY_VECTORS_MMAP_NAME + + result: Dict[str, Any] = { + "index_root": str(index_root), + "dense_index_path": 
str(dense_index_path) if dense_index_path.exists() else None, + "binary_index_path": str(binary_index_path) if binary_index_path.exists() else None, + "meta_db_path": str(meta_db_path) if meta_db_path.exists() else None, + "dense_index_exists": dense_index_path.exists(), + "binary_index_exists": binary_index_path.exists(), + "meta_db_exists": meta_db_path.exists(), + "chunk_metadata_rows": 0, + "binary_vector_rows": 0, + "files_with_embeddings": 0, + "dense_ready": False, + "binary_ready": False, + "usable": False, + } + + if not meta_db_path.exists(): + return result + + try: + with sqlite3.connect(meta_db_path) as conn: + result["chunk_metadata_rows"] = _sqlite_count_rows(conn, "chunk_metadata") + result["binary_vector_rows"] = _sqlite_count_rows(conn, "binary_vectors") + result["files_with_embeddings"] = _sqlite_count_distinct_rows(conn, "chunk_metadata", "file_path") + except Exception as exc: + result["error"] = f"Failed to inspect centralized metadata: {exc}" + return result + + result["dense_ready"] = result["dense_index_exists"] and result["chunk_metadata_rows"] > 0 + result["binary_ready"] = ( + result["binary_index_exists"] + and result["chunk_metadata_rows"] > 0 + and result["binary_vector_rows"] > 0 + ) + result["usable"] = result["dense_ready"] or result["binary_ready"] + return result + + def _get_embedding_defaults() -> tuple[str, str, bool, List, str, float]: """Get default embedding settings from config. 
@@ -1024,7 +1124,7 @@ def _discover_index_dbs_internal(index_root: Path) -> List[Path]: if not index_root.exists(): return [] - return sorted(index_root.rglob("_index.db")) + return sorted(filter_index_paths(index_root.rglob("_index.db"), index_root)) def build_centralized_binary_vectors_from_existing( @@ -1353,7 +1453,7 @@ def find_all_indexes(scan_dir: Path) -> List[Path]: if not scan_dir.exists(): return [] - return list(scan_dir.rglob("_index.db")) + return _discover_index_dbs_internal(scan_dir) @@ -1866,8 +1966,32 @@ def get_embeddings_status(index_root: Path) -> Dict[str, any]: Aggregated status with coverage statistics, model info, and timestamps """ index_files = _discover_index_dbs_internal(index_root) + centralized = _inspect_centralized_embeddings(index_root) + root_index_path = index_root / "_index.db" + root_index_exists = root_index_path.exists() if not index_files: + root_result = { + "index_path": str(root_index_path), + "exists": root_index_exists, + "total_files": 0, + "files_with_embeddings": 0, + "files_without_embeddings": 0, + "total_chunks": 0, + "coverage_percent": 0.0, + "has_embeddings": False, + "storage_mode": "none", + } + subtree_result = { + "total_indexes": 0, + "total_files": 0, + "files_with_embeddings": 0, + "files_without_embeddings": 0, + "total_chunks": 0, + "coverage_percent": 0.0, + "indexes_with_embeddings": 0, + "indexes_without_embeddings": 0, + } return { "success": True, "result": { @@ -1880,72 +2004,123 @@ def get_embeddings_status(index_root: Path) -> Dict[str, any]: "indexes_with_embeddings": 0, "indexes_without_embeddings": 0, "model_info": None, + "root": root_result, + "subtree": subtree_result, + "centralized": centralized, }, } - total_files = 0 - files_with_embeddings = 0 - total_chunks = 0 - indexes_with_embeddings = 0 - model_info = None + subtree_total_files = 0 + subtree_files_with_embeddings = 0 + subtree_total_chunks = 0 + subtree_indexes_with_embeddings = 0 + subtree_model_info = None latest_updated_at = 
None
     for index_path in index_files:
         status = check_index_embeddings(index_path)
-        if status["success"]:
-            result = status["result"]
-            total_files += result["total_files"]
-            files_with_embeddings += result["files_with_chunks"]
-            total_chunks += result["total_chunks"]
-            if result["has_embeddings"]:
-                indexes_with_embeddings += 1
+        if not status["success"]:
+            continue
 
-            # Get model config from first index with embeddings (they should all match)
-            if model_info is None:
-                try:
-                    from codexlens.semantic.vector_store import VectorStore
-                    with VectorStore(index_path) as vs:
-                        config = vs.get_model_config()
-                        if config:
-                            model_info = {
-                                "model_profile": config.get("model_profile"),
-                                "model_name": config.get("model_name"),
-                                "embedding_dim": config.get("embedding_dim"),
-                                "backend": config.get("backend"),
-                                "created_at": config.get("created_at"),
-                                "updated_at": config.get("updated_at"),
-                            }
-                            latest_updated_at = config.get("updated_at")
-                except Exception:
-                    pass
-            else:
-                # Track the latest updated_at across all indexes
-                try:
-                    from codexlens.semantic.vector_store import VectorStore
-                    with VectorStore(index_path) as vs:
-                        config = vs.get_model_config()
-                        if config and config.get("updated_at"):
-                            if latest_updated_at is None or config["updated_at"] > latest_updated_at:
-                                latest_updated_at = config["updated_at"]
-                except Exception:
-                    pass
+        result = status["result"]
+        subtree_total_files += result["total_files"]
+        subtree_files_with_embeddings += result["files_with_chunks"]
+        subtree_total_chunks += result["total_chunks"]
 
-    # Update model_info with latest timestamp
-    if model_info and latest_updated_at:
-        model_info["updated_at"] = latest_updated_at
+        if not result["has_embeddings"]:
+            continue
+
+        subtree_indexes_with_embeddings += 1
+        candidate_model_info = _get_model_info_from_index(index_path)
+        if not candidate_model_info:
+            continue
+        if subtree_model_info is None:
+            subtree_model_info = candidate_model_info
+            latest_updated_at = candidate_model_info.get("updated_at")
+            continue
+        candidate_updated_at = candidate_model_info.get("updated_at")
+        if candidate_updated_at and (latest_updated_at is None or candidate_updated_at > latest_updated_at):
+            latest_updated_at = candidate_updated_at
+
+    if subtree_model_info and latest_updated_at:
+        subtree_model_info["updated_at"] = latest_updated_at
+
+    root_total_files = 0
+    root_files_with_embeddings = 0
+    root_total_chunks = 0
+    root_has_embeddings = False
+    root_storage_mode = "none"
+
+    if root_index_exists:
+        root_status = check_index_embeddings(root_index_path)
+        if root_status["success"]:
+            root_data = root_status["result"]
+            root_total_files = int(root_data["total_files"])
+            if root_data["has_embeddings"]:
+                root_files_with_embeddings = int(root_data["files_with_chunks"])
+                root_total_chunks = int(root_data["total_chunks"])
+                root_has_embeddings = True
+                root_storage_mode = "distributed"
+
+    if centralized["usable"]:
+        root_files_with_embeddings = int(centralized["files_with_embeddings"])
+        root_total_chunks = int(centralized["chunk_metadata_rows"])
+        root_has_embeddings = True
+        root_storage_mode = "centralized" if root_storage_mode == "none" else "mixed"
+
+    model_info = None
+    if root_has_embeddings:
+        if root_storage_mode in {"distributed", "mixed"} and root_index_exists:
+            model_info = _get_model_info_from_index(root_index_path)
+        if model_info is None and root_storage_mode in {"centralized", "mixed"}:
+            model_info = subtree_model_info
+
+    root_coverage_percent = round(
+        (root_files_with_embeddings / root_total_files * 100) if root_total_files > 0 else 0,
+        1,
+    )
+    root_files_without_embeddings = max(root_total_files - root_files_with_embeddings, 0)
+
+    root_result = {
+        "index_path": str(root_index_path),
+        "exists": root_index_exists,
+        "total_files": root_total_files,
+        "files_with_embeddings": root_files_with_embeddings,
+        "files_without_embeddings": root_files_without_embeddings,
+        "total_chunks": root_total_chunks,
+        "coverage_percent": root_coverage_percent,
+        "has_embeddings": root_has_embeddings,
+        "storage_mode": root_storage_mode,
+    }
+    subtree_result = {
+        "total_indexes": len(index_files),
+        "total_files": subtree_total_files,
+        "files_with_embeddings": subtree_files_with_embeddings,
+        "files_without_embeddings": subtree_total_files - subtree_files_with_embeddings,
+        "total_chunks": subtree_total_chunks,
+        "coverage_percent": round(
+            (subtree_files_with_embeddings / subtree_total_files * 100) if subtree_total_files > 0 else 0,
+            1,
+        ),
+        "indexes_with_embeddings": subtree_indexes_with_embeddings,
+        "indexes_without_embeddings": len(index_files) - subtree_indexes_with_embeddings,
+    }
 
     return {
         "success": True,
         "result": {
-            "total_indexes": len(index_files),
-            "total_files": total_files,
-            "files_with_embeddings": files_with_embeddings,
-            "files_without_embeddings": total_files - files_with_embeddings,
-            "total_chunks": total_chunks,
-            "coverage_percent": round((files_with_embeddings / total_files * 100) if total_files > 0 else 0, 1),
-            "indexes_with_embeddings": indexes_with_embeddings,
-            "indexes_without_embeddings": len(index_files) - indexes_with_embeddings,
+            "total_indexes": 1 if root_index_exists else 0,
+            "total_files": root_total_files,
+            "files_with_embeddings": root_files_with_embeddings,
+            "files_without_embeddings": root_files_without_embeddings,
+            "total_chunks": root_total_chunks,
+            "coverage_percent": root_coverage_percent,
+            "indexes_with_embeddings": 1 if root_has_embeddings else 0,
+            "indexes_without_embeddings": 1 if root_index_exists and not root_has_embeddings else 0,
             "model_info": model_info,
+            "root": root_result,
+            "subtree": subtree_result,
+            "centralized": centralized,
         },
     }
diff --git a/codex-lens/src/codexlens/config.py b/codex-lens/src/codexlens/config.py
index c76841b2..527560f7 100644
--- a/codex-lens/src/codexlens/config.py
+++ b/codex-lens/src/codexlens/config.py
@@ -126,11 +126,14 @@ class Config:
     enable_reranking: bool = False
     reranking_top_k: int = 50
     symbol_boost_factor: float = 1.5
+    test_file_penalty: float = 0.15  # Penalty for test/fixture paths during final ranking
+    generated_file_penalty: float = 0.35  # Penalty for generated/build artifact paths during final ranking
 
     # Optional cross-encoder reranking (second stage; requires optional reranker deps)
     enable_cross_encoder_rerank: bool = False
     reranker_backend: str = "onnx"
     reranker_model: str = "cross-encoder/ms-marco-MiniLM-L-6-v2"
+    reranker_use_gpu: bool = True  # Whether reranker backends should use GPU acceleration
     reranker_top_k: int = 50
     reranker_max_input_tokens: int = 8192  # Maximum tokens for reranker API batching
     reranker_chunk_type_weights: Optional[Dict[str, float]] = None  # Weights for chunk types: {"code": 1.0, "docstring": 0.7}
@@ -312,6 +315,7 @@ class Config:
             "enabled": self.enable_cross_encoder_rerank,
             "backend": self.reranker_backend,
             "model": self.reranker_model,
+            "use_gpu": self.reranker_use_gpu,
             "top_k": self.reranker_top_k,
             "max_input_tokens": self.reranker_max_input_tokens,
             "pool_enabled": self.reranker_pool_enabled,
@@ -418,6 +422,8 @@ class Config:
             )
         if "model" in reranker:
             self.reranker_model = reranker["model"]
+        if "use_gpu" in reranker:
+            self.reranker_use_gpu = reranker["use_gpu"]
         if "top_k" in reranker:
             self.reranker_top_k = reranker["top_k"]
         if "max_input_tokens" in reranker:
@@ -712,6 +718,7 @@ class Config:
             EMBEDDING_COOLDOWN: Rate limit cooldown for embedding
             RERANKER_MODEL: Override reranker model
             RERANKER_BACKEND: Override reranker backend
+            RERANKER_USE_GPU: Override reranker GPU usage (true/false)
             RERANKER_ENABLED: Override reranker enabled state (true/false)
             RERANKER_POOL_ENABLED: Enable reranker high availability pool
             RERANKER_STRATEGY: Load balance strategy for reranker
@@ -832,6 +839,11 @@ class Config:
             else:
                 log.warning("Invalid RERANKER_BACKEND in .env: %r", reranker_backend)
 
+        reranker_use_gpu = get_env("RERANKER_USE_GPU")
+        if reranker_use_gpu:
+            self.reranker_use_gpu = _parse_bool(reranker_use_gpu)
+            log.debug("Overriding reranker_use_gpu from .env: %s", self.reranker_use_gpu)
+
         reranker_enabled = get_env("RERANKER_ENABLED")
         if reranker_enabled:
             value = reranker_enabled.lower()
@@ -878,6 +890,25 @@ class Config:
             except ValueError:
                 log.warning("Invalid RERANKER_TEST_FILE_PENALTY in .env: %r", test_penalty)
 
+        ranking_test_penalty = get_env("TEST_FILE_PENALTY")
+        if ranking_test_penalty:
+            try:
+                self.test_file_penalty = float(ranking_test_penalty)
+                log.debug("Overriding test_file_penalty from .env: %s", self.test_file_penalty)
+            except ValueError:
+                log.warning("Invalid TEST_FILE_PENALTY in .env: %r", ranking_test_penalty)
+
+        generated_penalty = get_env("GENERATED_FILE_PENALTY")
+        if generated_penalty:
+            try:
+                self.generated_file_penalty = float(generated_penalty)
+                log.debug(
+                    "Overriding generated_file_penalty from .env: %s",
+                    self.generated_file_penalty,
+                )
+            except ValueError:
+                log.warning("Invalid GENERATED_FILE_PENALTY in .env: %r", generated_penalty)
+
         docstring_weight = get_env("RERANKER_DOCSTRING_WEIGHT")
         if docstring_weight:
             try:
diff --git a/codex-lens/src/codexlens/env_config.py b/codex-lens/src/codexlens/env_config.py
index 87dac45c..8f1b1b0f 100644
--- a/codex-lens/src/codexlens/env_config.py
+++ b/codex-lens/src/codexlens/env_config.py
@@ -23,6 +23,7 @@ ENV_VARS = {
     # Reranker configuration (overrides settings.json)
     "RERANKER_MODEL": "Reranker model name (overrides settings.json)",
     "RERANKER_BACKEND": "Reranker backend: fastembed, onnx, api, litellm, legacy",
+    "RERANKER_USE_GPU": "Use GPU for local reranker backends: true/false",
     "RERANKER_ENABLED": "Enable reranker: true/false",
     "RERANKER_API_KEY": "API key for reranker service (SiliconFlow/Cohere/Jina)",
     "RERANKER_API_BASE": "Base URL for reranker API (overrides provider default)",
@@ -65,6 +66,9 @@ ENV_VARS = {
     # Chunking configuration
     "CHUNK_STRIP_COMMENTS": "Strip comments from code chunks for embedding: true/false (default: true)",
"CHUNK_STRIP_DOCSTRINGS": "Strip docstrings from code chunks for embedding: true/false (default: true)", + # Search ranking tuning + "TEST_FILE_PENALTY": "Penalty for test/fixture paths in final search ranking: 0.0-1.0 (default: 0.15)", + "GENERATED_FILE_PENALTY": "Penalty for generated/build artifact paths in final search ranking: 0.0-1.0 (default: 0.35)", # Reranker tuning "RERANKER_TEST_FILE_PENALTY": "Penalty for test files in reranking: 0.0-1.0 (default: 0.0)", "RERANKER_DOCSTRING_WEIGHT": "Weight for docstring chunks in reranking: 0.0-1.0 (default: 1.0)", diff --git a/codex-lens/src/codexlens/search/chain_search.py b/codex-lens/src/codexlens/search/chain_search.py index 07b71a02..c269af66 100644 --- a/codex-lens/src/codexlens/search/chain_search.py +++ b/codex-lens/src/codexlens/search/chain_search.py @@ -7,7 +7,7 @@ Supports depth-limited traversal, result aggregation, and symbol search. from __future__ import annotations from concurrent.futures import ThreadPoolExecutor, as_completed -from dataclasses import dataclass, field +from dataclasses import dataclass, field, replace from pathlib import Path from typing import List, Optional, Dict, Any, Literal, Tuple, TYPE_CHECKING import json @@ -30,11 +30,36 @@ from codexlens.config import Config from codexlens.storage.registry import RegistryStore, DirMapping from codexlens.storage.dir_index import DirIndexStore, SubdirLink from codexlens.storage.global_index import GlobalSymbolIndex +from codexlens.storage.index_filters import is_ignored_index_path from codexlens.storage.path_mapper import PathMapper from codexlens.storage.sqlite_store import SQLiteStore from codexlens.storage.vector_meta_store import VectorMetadataStore -from codexlens.config import BINARY_VECTORS_MMAP_NAME, VECTORS_META_DB_NAME +from codexlens.config import ( + BINARY_VECTORS_MMAP_NAME, + VECTORS_HNSW_NAME, + VECTORS_META_DB_NAME, +) from codexlens.search.hybrid_search import HybridSearchEngine +from codexlens.search.ranking import 
query_prefers_lexical_search + +SEARCH_ARTIFACT_DIRS = frozenset({ + "dist", + "build", + "out", + "target", + "bin", + "obj", + "_build", + "coverage", + "htmlcov", + ".cache", + ".parcel-cache", + ".turbo", + ".next", + ".nuxt", + "node_modules", + "bower_components", +}) @dataclass @@ -60,6 +85,7 @@ class SearchOptions: hybrid_weights: Custom RRF weights for hybrid search (optional) group_results: Enable grouping of similar results (default False) grouping_threshold: Score threshold for grouping similar results (default 0.01) + inject_feature_anchors: Whether to inject lexical feature anchors (default True) """ depth: int = -1 max_workers: int = 8 @@ -79,6 +105,7 @@ class SearchOptions: hybrid_weights: Optional[Dict[str, float]] = None group_results: bool = False grouping_threshold: float = 0.01 + inject_feature_anchors: bool = True @dataclass @@ -169,6 +196,11 @@ class ChainSearchEngine: self._realtime_lsp_keepalive_lock = threading.RLock() self._realtime_lsp_keepalive = None self._realtime_lsp_keepalive_key = None + self._runtime_cache_lock = threading.RLock() + self._dense_ann_cache: Dict[Tuple[str, int], Any] = {} + self._legacy_dense_ann_cache: Dict[Tuple[str, int], Any] = {} + self._reranker_cache_key: Optional[Tuple[str, Optional[str], bool, Optional[int]]] = None + self._reranker_instance: Any = None # Track which (workspace_root, config_file) pairs have already been warmed up. # This avoids paying the warmup sleep on every query when using keep-alive LSP servers. 
         self._realtime_lsp_warmed_ids: set[tuple[str, str | None]] = set()
@@ -194,6 +226,7 @@ class ChainSearchEngine:
         if self._executor is not None:
             self._executor.shutdown(wait=True)
             self._executor = None
+        self._clear_runtime_caches()
         with self._realtime_lsp_keepalive_lock:
             keepalive = self._realtime_lsp_keepalive
             self._realtime_lsp_keepalive = None
@@ -212,6 +245,166 @@ class ChainSearchEngine:
         """Context manager exit."""
         self.close()
 
+    @staticmethod
+    def _release_cached_resource(resource: Any) -> None:
+        """Best-effort cleanup for cached runtime helpers."""
+        if resource is None:
+            return
+        for attr_name in ("clear", "close"):
+            cleanup = getattr(resource, attr_name, None)
+            if callable(cleanup):
+                try:
+                    cleanup()
+                except Exception:
+                    pass
+                break
+
+    def _clear_runtime_caches(self) -> None:
+        """Drop per-engine runtime caches for dense indexes and rerankers."""
+        with self._runtime_cache_lock:
+            dense_indexes = list(self._dense_ann_cache.values())
+            legacy_dense_indexes = list(self._legacy_dense_ann_cache.values())
+            reranker = self._reranker_instance
+            self._dense_ann_cache = {}
+            self._legacy_dense_ann_cache = {}
+            self._reranker_cache_key = None
+            self._reranker_instance = None
+
+        for resource in [*dense_indexes, *legacy_dense_indexes, reranker]:
+            self._release_cached_resource(resource)
+
+    def _get_cached_centralized_dense_index(self, index_root: Path, dim: int) -> Optional[Any]:
+        """Load and cache a centralized dense ANN index for repeated queries."""
+        from codexlens.semantic.ann_index import ANNIndex
+
+        resolved_root = Path(index_root).resolve()
+        cache_key = (str(resolved_root), int(dim))
+        with self._runtime_cache_lock:
+            cached = self._dense_ann_cache.get(cache_key)
+        if cached is not None:
+            return cached
+
+        ann_index = ANNIndex.create_central(index_root=resolved_root, dim=int(dim))
+        if not ann_index.load() or ann_index.count() == 0:
+            return None
+
+        with self._runtime_cache_lock:
+            cached = self._dense_ann_cache.get(cache_key)
+            if cached is None:
+                self._dense_ann_cache[cache_key] = ann_index
+                cached = ann_index
+        return cached
+
+    def _get_cached_legacy_dense_index(self, index_path: Path, dim: int) -> Optional[Any]:
+        """Load and cache a legacy per-index dense ANN index for repeated queries."""
+        from codexlens.semantic.ann_index import ANNIndex
+
+        resolved_path = Path(index_path).resolve()
+        cache_key = (str(resolved_path), int(dim))
+        with self._runtime_cache_lock:
+            cached = self._legacy_dense_ann_cache.get(cache_key)
+        if cached is not None:
+            return cached
+
+        ann_index = ANNIndex(resolved_path, dim=int(dim))
+        if not ann_index.load() or ann_index.count() == 0:
+            return None
+
+        with self._runtime_cache_lock:
+            cached = self._legacy_dense_ann_cache.get(cache_key)
+            if cached is None:
+                self._legacy_dense_ann_cache[cache_key] = ann_index
+                cached = ann_index
+        return cached
+
+    def _get_cached_reranker(self) -> Any:
+        """Return a cached reranker instance for repeated cascade queries."""
+        try:
+            from codexlens.semantic.reranker import (
+                check_reranker_available,
+                get_reranker,
+            )
+        except ImportError as exc:
+            self.logger.debug("Reranker not available: %s", exc)
+            return None
+        except Exception as exc:
+            self.logger.debug("Failed to import reranker factory: %s", exc)
+            return None
+
+        backend = "onnx"
+        model_name = None
+        use_gpu = True
+        max_tokens = None
+
+        if self._config is not None:
+            backend = getattr(self._config, "reranker_backend", "onnx") or "onnx"
+            model_name = getattr(self._config, "reranker_model", None)
+            use_gpu = getattr(
+                self._config,
+                "reranker_use_gpu",
+                getattr(self._config, "embedding_use_gpu", True),
+            )
+            max_tokens = getattr(self._config, "reranker_max_input_tokens", None)
+
+        cache_key = (
+            str(backend).strip().lower(),
+            str(model_name).strip() if isinstance(model_name, str) and model_name.strip() else None,
+            bool(use_gpu),
+            int(max_tokens) if isinstance(max_tokens, (int, float)) else None,
+        )
+        with self._runtime_cache_lock:
+            cached = (
+                self._reranker_instance
+                if self._reranker_instance is not None and self._reranker_cache_key == cache_key
+                else None
+            )
+        if cached is not None:
+            return cached
+
+        ok, err = check_reranker_available(cache_key[0])
+        if not ok:
+            self.logger.debug("Reranker backend unavailable (%s): %s", cache_key[0], err)
+            return None
+
+        kwargs: Dict[str, Any] = {}
+        device = None
+        if cache_key[0] == "onnx":
+            kwargs["use_gpu"] = cache_key[2]
+        elif cache_key[0] == "api":
+            if cache_key[3] is not None:
+                kwargs["max_input_tokens"] = cache_key[3]
+        elif not cache_key[2]:
+            device = "cpu"
+
+        try:
+            reranker = get_reranker(
+                backend=cache_key[0],
+                model_name=cache_key[1],
+                device=device,
+                **kwargs,
+            )
+        except Exception as exc:
+            self.logger.debug("Failed to initialize reranker: %s", exc)
+            return None
+
+        previous = None
+        with self._runtime_cache_lock:
+            cached = (
+                self._reranker_instance
+                if self._reranker_instance is not None and self._reranker_cache_key == cache_key
+                else None
+            )
+            if cached is not None:
+                reranker = cached
+            else:
+                previous = self._reranker_instance
+                self._reranker_cache_key = cache_key
+                self._reranker_instance = reranker
+
+        if previous is not None and previous is not reranker:
+            self._release_cached_resource(previous)
+        return reranker
+
     def search(self,
                query: str,
                source_path: Path,
                options: Optional[SearchOptions] = None) -> ChainSearchResult:
@@ -238,6 +431,21 @@ class ChainSearchEngine:
             ...     print(f"{r.path}: {r.score:.2f}")
         """
         options = options or SearchOptions()
+        effective_options = options
+        if options.hybrid_mode and query_prefers_lexical_search(query):
+            self.logger.debug(
+                "Hybrid shortcut: using lexical search path for lexical-priority query %r",
+                query,
+            )
+            effective_options = replace(
+                options,
+                hybrid_mode=False,
+                enable_vector=False,
+                pure_vector=False,
+                enable_cascade=False,
+                hybrid_weights=None,
+                enable_fuzzy=True,
+            )
 
         start_time = time.time()
         stats = SearchStats()
@@ -254,7 +462,7 @@ class ChainSearchEngine:
             )
 
         # Step 2: Collect all index paths to search
-        index_paths = self._collect_index_paths(start_index, options.depth)
+        index_paths = self._collect_index_paths(start_index, effective_options.depth)
         stats.dirs_searched = len(index_paths)
 
         if not index_paths:
@@ -269,33 +477,47 @@ class ChainSearchEngine:
 
         # Step 3: Parallel search
         results, search_stats = self._search_parallel(
-            index_paths, query, options
+            index_paths, query, effective_options
         )
         stats.errors = search_stats.errors
 
         # Step 3.5: Filter by extension if requested
-        if options.code_only or options.exclude_extensions:
+        if effective_options.code_only or effective_options.exclude_extensions:
             results = self._filter_by_extension(
-                results, options.code_only, options.exclude_extensions
+                results, effective_options.code_only, effective_options.exclude_extensions
+            )
+
+        if effective_options.inject_feature_anchors:
+            results = self._inject_query_feature_anchors(
+                query,
+                source_path,
+                effective_options,
+                results,
+                limit=min(6, max(2, effective_options.total_limit)),
             )
 
         # Step 4: Merge and rank
-        final_results = self._merge_and_rank(results, options.total_limit, options.offset)
+        final_results = self._merge_and_rank(
+            results,
+            effective_options.total_limit,
+            effective_options.offset,
+            query=query,
+        )
 
         # Step 5: Optional grouping of similar results
-        if options.group_results:
+        if effective_options.group_results:
             from codexlens.search.ranking import group_similar_results
             final_results = group_similar_results(
-                final_results, score_threshold_abs=options.grouping_threshold
+                final_results, score_threshold_abs=effective_options.grouping_threshold
             )
 
         stats.files_matched = len(final_results)
 
         # Optional: Symbol search
         symbols = []
-        if options.include_symbols:
+        if effective_options.include_symbols:
             symbols = self._search_symbols_parallel(
-                index_paths, query, None, options.total_limit
+                index_paths, query, None, effective_options.total_limit
             )
 
         # Optional: graph expansion using precomputed neighbors
@@ -408,100 +630,24 @@ class ChainSearchEngine:
                 stats=stats
             )
 
-        # Initialize embedding backends
-        try:
-            from codexlens.indexing.embedding import (
-                BinaryEmbeddingBackend,
-                DenseEmbeddingBackend,
-            )
-            from codexlens.semantic.ann_index import BinaryANNIndex
-        except ImportError as exc:
-            self.logger.warning(
-                "Binary cascade dependencies not available: %s. "
-                "Falling back to standard search.",
-                exc
-            )
-            return self.search(query, source_path, options=options)
-
         # Stage 1: Binary vector coarse retrieval
         self.logger.debug(
             "Binary Cascade Stage 1: Binary coarse retrieval for %d candidates",
             coarse_k,
         )
 
-        use_gpu = True
-        if self._config is not None:
-            use_gpu = getattr(self._config, "embedding_use_gpu", True)
+        coarse_candidates, used_centralized, _, stage2_index_root = self._collect_binary_coarse_candidates(
+            query,
+            index_paths,
+            coarse_k,
+            stats,
+            index_root=index_paths[0].parent if index_paths else None,
+        )
 
-        try:
-            binary_backend = BinaryEmbeddingBackend(use_gpu=use_gpu)
-            query_binary_packed = binary_backend.embed_packed([query])[0]
-        except Exception as exc:
-            self.logger.warning(
-                "Failed to generate binary query embedding: %s. "
-                "Falling back to standard search.",
-                exc
-            )
-            return self.search(query, source_path, options=options)
-
-        # Try centralized BinarySearcher first (preferred for mmap indexes)
-        # The index root is the parent of the first index path
-        index_root = index_paths[0].parent if index_paths else None
-        all_candidates: List[Tuple[int, int, Path]] = []  # (chunk_id, distance, index_path)
-        used_centralized = False
-
-        if index_root:
-            centralized_searcher = self._get_centralized_binary_searcher(index_root)
-            if centralized_searcher is not None:
-                try:
-                    # BinarySearcher expects dense vector, not packed binary
-                    from codexlens.semantic.embedder import Embedder
-                    embedder = Embedder()
-                    query_dense = embedder.embed_to_numpy([query])[0]
-
-                    # Centralized search - returns (chunk_id, hamming_distance) tuples
-                    results = centralized_searcher.search(query_dense, top_k=coarse_k)
-                    for chunk_id, dist in results:
-                        all_candidates.append((chunk_id, dist, index_root))
-                    used_centralized = True
-                    self.logger.debug(
-                        "Centralized binary search found %d candidates", len(results)
-                    )
-                except Exception as exc:
-                    self.logger.debug(
-                        "Centralized binary search failed: %s, falling back to per-directory",
-                        exc
-                    )
-                    centralized_searcher.clear()
-
-        # Fallback: Search per-directory indexes with legacy BinaryANNIndex
-        if not used_centralized:
-            for index_path in index_paths:
-                try:
-                    # Get or create binary index for this path (uses deprecated BinaryANNIndex)
-                    binary_index = self._get_or_create_binary_index(index_path)
-                    if binary_index is None or binary_index.count() == 0:
-                        continue
-
-                    # Search binary index
-                    ids, distances = binary_index.search(query_binary_packed, coarse_k)
-                    for chunk_id, dist in zip(ids, distances):
-                        all_candidates.append((chunk_id, dist, index_path))
-
-                except Exception as exc:
-                    self.logger.debug(
-                        "Binary search failed for %s: %s", index_path, exc
-                    )
-                    stats.errors.append(f"Binary search failed for {index_path}: {exc}")
-
-        if not all_candidates:
+        if not coarse_candidates:
             self.logger.debug("No binary candidates found, falling back to standard search")
             return self.search(query, source_path, options=options)
 
-        # Sort by Hamming distance and take top coarse_k
-        all_candidates.sort(key=lambda x: x[1])
-        coarse_candidates = all_candidates[:coarse_k]
-
         self.logger.debug(
             "Binary Cascade Stage 1 complete: %d candidates retrieved",
             len(coarse_candidates),
@@ -514,21 +660,6 @@ class ChainSearchEngine:
             k,
         )
 
-        try:
-            dense_backend = DenseEmbeddingBackend(use_gpu=use_gpu)
-            query_dense = dense_backend.embed_to_numpy([query])[0]
-        except Exception as exc:
-            self.logger.warning(
-                "Failed to generate dense query embedding: %s. "
-                "Using Hamming distance scores only.",
-                exc
-            )
-            # Fall back to using Hamming distance as score
-            return self._build_results_from_candidates(
-                coarse_candidates[:k], index_paths, stats, query, start_time,
-                use_centralized=used_centralized
-            )
-
         # Group candidates by index path for batch retrieval
         candidates_by_index: Dict[Path, List[int]] = {}
         for chunk_id, _, index_path in coarse_candidates:
@@ -539,9 +670,18 @@ class ChainSearchEngine:
         # Retrieve dense embeddings and compute cosine similarity
         scored_results: List[Tuple[float, SearchResult]] = []
         import sqlite3
+        dense_query_cache: Dict[Tuple[str, str, bool], "np.ndarray"] = {}
+        dense_query_errors: list[str] = []
 
         for index_path, chunk_ids in candidates_by_index.items():
             try:
+                query_index_root = index_path if used_centralized else index_path.parent
+                query_dense = self._embed_dense_query(
+                    query,
+                    index_root=query_index_root,
+                    query_cache=dense_query_cache,
+                )
+
                 # Collect valid rows and dense vectors for batch processing
                 valid_rows: List[Dict[str, Any]] = []
                 dense_vectors: List["np.ndarray"] = []
@@ -653,6 +793,28 @@ class ChainSearchEngine:
                     "Dense reranking failed for %s: %s", index_path, exc
                 )
                 stats.errors.append(f"Dense reranking failed for {index_path}: {exc}")
+                dense_query_errors.append(str(exc))
+
+        if not scored_results:
+            if dense_query_errors:
+                self.logger.warning(
+                    "Failed to generate dense query embeddings for binary cascade: %s. "
+                    "Using Hamming distance scores only.",
+                    dense_query_errors[0],
+                )
+            final_results = self._materialize_binary_candidates(
+                coarse_candidates[:k],
+                stats,
+                stage2_index_root=stage2_index_root,
+            )
+            stats.files_matched = len(final_results)
+            stats.time_ms = (time.time() - start_time) * 1000
+            return ChainSearchResult(
+                query=query,
+                results=final_results,
+                symbols=[],
+                stats=stats,
+            )
 
         # Sort by score descending and deduplicate by path
         scored_results.sort(key=lambda x: x[0], reverse=True)
@@ -662,7 +824,10 @@ class ChainSearchEngine:
             if result.path not in path_to_result:
                 path_to_result[result.path] = result
 
-        final_results = list(path_to_result.values())[:k]
+        final_results = self._apply_default_path_penalties(
+            query,
+            list(path_to_result.values()),
+        )[:k]
 
         # Optional: grouping of similar results
         if options.group_results:
@@ -865,8 +1030,20 @@ class ChainSearchEngine:
             stats,
             index_root=start_index.parent,
         )
+        coarse_results = self._inject_query_feature_anchors(
+            query,
+            source_path,
+            options,
+            coarse_results,
+            limit=min(6, max(2, k)),
+        )
         stage_times["stage1_binary_ms"] = (time.time() - stage1_start) * 1000
         stage_counts["stage1_candidates"] = len(coarse_results)
+        stage_counts["stage1_feature_anchors"] = sum(
+            1
+            for result in coarse_results
+            if (result.metadata or {}).get("feature_query_anchor")
+        )
 
         self.logger.debug(
             "Staged Stage 1: Binary search found %d candidates in %.2fms",
@@ -923,9 +1100,18 @@ class ChainSearchEngine:
 
         # ========== Stage 3: Clustering and Representative Selection ==========
         stage3_start = time.time()
-        clustered_results = self._stage3_cluster_prune(expanded_results, k * 2)
+        stage3_target_count = self._resolve_stage3_target_count(
+            k,
+            len(expanded_results),
+        )
+        clustered_results = self._stage3_cluster_prune(
+            expanded_results,
+            stage3_target_count,
+            query=query,
+        )
         stage_times["stage3_cluster_ms"] = (time.time() - stage3_start) * 1000
         stage_counts["stage3_clustered"] = len(clustered_results)
+        stage_counts["stage3_target_count"] = stage3_target_count
         if self._config is not None:
             try:
                 stage_counts["stage3_strategy"] = str(getattr(self._config, "staged_clustering_strategy", "auto") or "auto")
@@ -965,7 +1151,10 @@ class ChainSearchEngine:
             if result.path not in path_to_result or result.score > path_to_result[result.path].score:
                 path_to_result[result.path] = result
 
-        final_results = list(path_to_result.values())[:k]
+        final_results = self._apply_default_path_penalties(
+            query,
+            list(path_to_result.values()),
+        )[:k]
 
         # Optional: grouping of similar results
         if options.group_results:
@@ -1010,259 +1199,24 @@ class ChainSearchEngine:
         *,
         index_root: Optional[Path] = None,
     ) -> Tuple[List[SearchResult], Optional[Path]]:
-        """Stage 1: Binary vector coarse search using Hamming distance.
-
-        Reuses the binary coarse search logic from binary_cascade_search.
-
-        Args:
-            query: Search query string
-            index_paths: List of index database paths to search
-            coarse_k: Number of coarse candidates to retrieve
-            stats: SearchStats to update with errors
-
-        Returns:
-            Tuple of (list of SearchResult objects, index_root path or None)
-        """
-        # Initialize binary embedding backend
-        try:
-            from codexlens.indexing.embedding import BinaryEmbeddingBackend
-        except ImportError as exc:
-            self.logger.warning(
-                "BinaryEmbeddingBackend not available: %s", exc
-            )
-            return [], None
-
-        # Try centralized BinarySearcher first (preferred for mmap indexes).
-        # Centralized binary vectors live at a project index root (where `index binary-mmap`
-        # was run), which may be an ancestor of the nearest `_index.db` directory.
-        index_root = Path(index_root).resolve() if index_root is not None else (index_paths[0].parent if index_paths else None)
-        if index_root is not None:
-            index_root = self._find_nearest_binary_mmap_root(index_root)
-        coarse_candidates: List[Tuple[int, float, Path]] = []  # (chunk_id, distance, index_path)
-        used_centralized = False
-        using_dense_fallback = False
-
-        if index_root:
-            binary_searcher = self._get_centralized_binary_searcher(index_root)
-            if binary_searcher is not None:
-                try:
-                    use_gpu = True
-                    if self._config is not None:
-                        use_gpu = getattr(self._config, "embedding_use_gpu", True)
-
-                    query_dense = None
-                    backend = getattr(binary_searcher, "backend", None)
-                    model = getattr(binary_searcher, "model", None)
-                    profile = getattr(binary_searcher, "model_profile", None) or "code"
-
-                    if backend == "litellm":
-                        try:
-                            from codexlens.semantic.factory import get_embedder as get_factory_embedder
-                            embedder = get_factory_embedder(backend="litellm", model=model or "code")
-                            query_dense = embedder.embed_to_numpy([query])[0]
-                        except Exception:
-                            query_dense = None
-                    if query_dense is None:
-                        from codexlens.semantic.embedder import get_embedder
-                        embedder = get_embedder(profile=str(profile), use_gpu=use_gpu)
-                        query_dense = embedder.embed_to_numpy([query])[0]
-
-                    results = binary_searcher.search(query_dense, top_k=coarse_k)
-                    for chunk_id, distance in results:
-                        coarse_candidates.append((chunk_id, distance, index_root))
-                    if coarse_candidates:
-                        used_centralized = True
-                        self.logger.debug(
-                            "Stage 1 centralized binary search: %d candidates", len(results)
-                        )
-                except Exception as exc:
-                    self.logger.debug(f"Centralized binary search failed: {exc}")
-
-        if not used_centralized:
-            # Fallback to per-directory binary indexes (legacy BinaryANNIndex).
-            #
-            # Generating the query binary embedding can be expensive (depending on embedding backend).
-            # If no legacy binary vector files exist, skip this path and fall back to dense ANN search.
-            has_legacy_binary_vectors = any(
-                (p.parent / f"{p.stem}_binary_vectors.bin").exists() for p in index_paths
-            )
-            if not has_legacy_binary_vectors:
-                self.logger.debug(
-                    "No legacy binary vector files found; skipping legacy binary search fallback"
-                )
-            else:
-                use_gpu = True
-                if self._config is not None:
-                    use_gpu = getattr(self._config, "embedding_use_gpu", True)
-
-                query_binary = None
-                try:
-                    binary_backend = BinaryEmbeddingBackend(use_gpu=use_gpu)
-                    query_binary = binary_backend.embed_packed([query])[0]
-                except Exception as exc:
-                    self.logger.warning(f"Failed to generate binary query embedding: {exc}")
-                    query_binary = None
-
-                if query_binary is not None:
-                    for index_path in index_paths:
-                        try:
-                            binary_index = self._get_or_create_binary_index(index_path)
-                            if binary_index is None or binary_index.count() == 0:
-                                continue
-                            ids, distances = binary_index.search(query_binary, coarse_k)
-                            for chunk_id, dist in zip(ids, distances):
-                                coarse_candidates.append((chunk_id, float(dist), index_path))
-                        except Exception as exc:
-                            self.logger.debug(
-                                "Binary search failed for %s: %s", index_path, exc
-                            )
+        """Stage 1: Binary vector coarse search using Hamming distance."""
+        coarse_candidates, _, using_dense_fallback, stage2_index_root = self._collect_binary_coarse_candidates(
+            query,
+            index_paths,
+            coarse_k,
+            stats,
+            index_root=index_root,
+            allow_dense_fallback=True,
+        )
 
         if not coarse_candidates:
-            # Final fallback: dense ANN coarse search (HNSW) over existing dense vector indexes.
-            #
-            # This allows the staged pipeline (LSP expansion + clustering) to run even when
-            # binary vectors are not generated for the current project.
-            dense_candidates: List[Tuple[int, float, Path]] = []
-            try:
-                from codexlens.semantic.ann_index import ANNIndex
-                from codexlens.semantic.embedder import Embedder
-
-                embedder = Embedder()
-                query_dense = embedder.embed_to_numpy([query])[0]
-                dim = int(getattr(query_dense, "shape", (len(query_dense),))[0])
-
-                for index_path in index_paths:
-                    try:
-                        ann_index = ANNIndex(index_path, dim=dim)
-                        if not ann_index.load() or ann_index.count() == 0:
-                            continue
-                        ids, distances = ann_index.search(query_dense, top_k=coarse_k)
-                        for chunk_id, dist in zip(ids, distances):
-                            dense_candidates.append((chunk_id, float(dist), index_path))
-                    except Exception as exc:
-                        self.logger.debug(
-                            "Dense coarse search failed for %s: %s", index_path, exc
-                        )
-            except Exception as exc:
-                self.logger.debug("Dense coarse search fallback unavailable: %s", exc)
-                dense_candidates = []
-
-            if dense_candidates:
-                dense_candidates.sort(key=lambda x: x[1])
-                coarse_candidates = dense_candidates[:coarse_k]
-                using_dense_fallback = True
-
-        if not coarse_candidates:
-            return [], index_root
-
-        # Sort by Hamming distance and take top coarse_k
-        coarse_candidates.sort(key=lambda x: x[1])
-        coarse_candidates = coarse_candidates[:coarse_k]
-
-        # Build SearchResult objects from candidates
-        coarse_results: List[SearchResult] = []
-
-        # Group candidates by index path for efficient retrieval
-        candidates_by_index: Dict[Path, List[int]] = {}
-        for chunk_id, _, idx_path in coarse_candidates:
-            if idx_path not in candidates_by_index:
-                candidates_by_index[idx_path] = []
-            candidates_by_index[idx_path].append(chunk_id)
-
-        # Retrieve chunk content
-        import sqlite3
-        central_meta_path = index_root / VECTORS_META_DB_NAME if index_root else None
-        central_meta_store = None
-        if central_meta_path and central_meta_path.exists():
-            central_meta_store = VectorMetadataStore(central_meta_path)
-
-        for idx_path, chunk_ids in candidates_by_index.items():
-            try:
-                chunks_data = []
-                if central_meta_store:
-                    chunks_data = central_meta_store.get_chunks_by_ids(chunk_ids)
-
-                if not chunks_data and used_centralized:
-                    meta_db_path = idx_path / VECTORS_META_DB_NAME
-                    if meta_db_path.exists():
-                        meta_store = VectorMetadataStore(meta_db_path)
-                        chunks_data = meta_store.get_chunks_by_ids(chunk_ids)
-
-                if not chunks_data:
-                    try:
-                        conn = sqlite3.connect(str(idx_path))
-                        conn.row_factory = sqlite3.Row
-                        placeholders = ",".join("?" * len(chunk_ids))
-                        cursor = conn.execute(
-                            f"""
-                            SELECT id, file_path, content, metadata, category
-                            FROM semantic_chunks
-                            WHERE id IN ({placeholders})
-                            """,
-                            chunk_ids
-                        )
-                        chunks_data = [
-                            {
-                                "id": row["id"],
-                                "file_path": row["file_path"],
-                                "content": row["content"],
-                                "metadata": row["metadata"],
-                                "category": row["category"],
-                            }
-                            for row in cursor.fetchall()
-                        ]
-                        conn.close()
-                    except Exception:
-                        pass
-
-                for chunk in chunks_data:
-                    chunk_id = chunk.get("id") or chunk.get("chunk_id")
-                    distance = next(
-                        (d for cid, d, _ in coarse_candidates if cid == chunk_id),
-                        256
-                    )
-                    if using_dense_fallback:
-                        # Cosine distance in [0, 2] -> clamp to [0, 1] score
-                        score = max(0.0, 1.0 - float(distance))
-                    else:
-                        score = 1.0 - (int(distance) / 256.0)
-
-                    content = chunk.get("content", "")
-
-                    # Extract symbol info from metadata if available
-                    metadata = chunk.get("metadata")
-                    symbol_name = None
-                    symbol_kind = None
-                    start_line = chunk.get("start_line")
-                    end_line = chunk.get("end_line")
-                    if metadata:
-                        try:
-                            meta_dict = json.loads(metadata) if isinstance(metadata, str) else metadata
-                            symbol_name = meta_dict.get("symbol_name")
-                            symbol_kind = meta_dict.get("symbol_kind")
-                            start_line = meta_dict.get("start_line", start_line)
-                            end_line = meta_dict.get("end_line", end_line)
-                        except Exception:
-                            pass
-
-                    result = SearchResult(
-                        path=chunk.get("file_path", ""),
-                        score=float(score),
-                        excerpt=content[:500] if content else "",
-                        content=content,
-                        symbol_name=symbol_name,
-                        symbol_kind=symbol_kind,
-                        start_line=start_line,
-                        end_line=end_line,
-                    )
-                    coarse_results.append(result)
-            except Exception as exc:
-                self.logger.debug(
-                    "Failed to retrieve chunks from %s: %s", idx_path, exc
-                )
-                stats.errors.append(f"Stage 1 chunk retrieval failed for {idx_path}: {exc}")
-
-        return coarse_results, index_root
+            return [], stage2_index_root
+        return self._materialize_binary_candidates(
+            coarse_candidates,
+            stats,
+            stage2_index_root=stage2_index_root,
+            using_dense_fallback=using_dense_fallback,
+        ), stage2_index_root
 
     def _stage2_lsp_expand(
         self,
@@ -1803,6 +1757,261 @@ class ChainSearchEngine:
 
         return combined
 
+    def _collect_query_feature_anchor_results(
+        self,
+        query: str,
+        source_path: Path,
+        options: SearchOptions,
+        *,
+        limit: int,
+    ) -> List[SearchResult]:
+        """Collect small lexical anchor sets for explicit file/feature hints."""
+        if limit <= 0:
+            return []
+
+        from codexlens.search.ranking import (
+            QueryIntent,
+            _path_topic_tokens,
+            detect_query_intent,
+            extract_explicit_path_hints,
+            is_auxiliary_reference_path,
+            is_generated_artifact_path,
+            is_test_file,
+            query_targets_auxiliary_files,
+            query_targets_generated_files,
+            query_targets_test_files,
+        )
+
+        explicit_hints = extract_explicit_path_hints(query)
+        if not explicit_hints:
+            return []
+        skip_test_files = query_targets_test_files(query)
+        skip_generated_files = query_targets_generated_files(query)
+        skip_auxiliary_files = query_targets_auxiliary_files(query)
+
+        anchor_limit = max(1, int(limit))
+        per_hint_limit = max(2, min(6, anchor_limit))
+        seed_opts = SearchOptions(
+            depth=options.depth,
+            max_workers=options.max_workers,
+            limit_per_dir=max(10, per_hint_limit),
+            total_limit=max(anchor_limit, per_hint_limit * 2),
+            include_symbols=False,
+            include_semantic=False,
+            files_only=False,
+            code_only=options.code_only,
+            exclude_extensions=list(options.exclude_extensions or []),
+            enable_vector=False,
+            hybrid_mode=False,
+            pure_vector=False,
+            enable_cascade=False,
+            inject_feature_anchors=False,
+        )
+
+        anchors_by_path: Dict[str, SearchResult] = {}
+        for hint_tokens in explicit_hints:
+            hint_query = " ".join(hint_tokens)
+            try:
+                seed_result = self.search(hint_query, source_path, options=seed_opts)
+            except Exception as exc:
+                self.logger.debug(
+                    "Feature anchor search failed for %r: %s",
+                    hint_query,
+                    exc,
+                )
+                continue
+
+            for candidate in seed_result.results:
+                _, basename_tokens = _path_topic_tokens(candidate.path)
+                if not basename_tokens or not all(token in basename_tokens for token in hint_tokens):
+                    continue
+                if not skip_test_files and is_test_file(candidate.path):
+                    continue
+                if not skip_generated_files and is_generated_artifact_path(candidate.path):
+                    continue
+                if not skip_auxiliary_files and is_auxiliary_reference_path(candidate.path):
+                    continue
+                metadata = {
+                    **(candidate.metadata or {}),
+                    "feature_query_anchor": True,
+                    "feature_query_hint": hint_query,
+                    "feature_query_hint_tokens": list(hint_tokens),
+                }
+                anchor = candidate.model_copy(
+                    deep=True,
+                    update={"metadata": metadata},
+                )
+                existing = anchors_by_path.get(anchor.path)
+                if existing is None or float(anchor.score) > float(existing.score):
+                    anchors_by_path[anchor.path] = anchor
+                if len(anchors_by_path) >= anchor_limit:
+                    break
+            if len(anchors_by_path) >= anchor_limit:
+                break
+
+        query_intent = detect_query_intent(query)
+        if not anchors_by_path and query_intent in {QueryIntent.KEYWORD, QueryIntent.MIXED}:
+            lexical_query = (query or "").strip()
+            if lexical_query:
+                try:
+                    seed_result = self.search(lexical_query, source_path, options=seed_opts)
+                except Exception as exc:
+                    self.logger.debug(
+                        "Lexical feature anchor search failed for %r: %s",
+                        lexical_query,
+                        exc,
+                    )
+                else:
+                    for candidate in seed_result.results:
+                        if not skip_test_files and is_test_file(candidate.path):
+                            continue
+                        if not skip_generated_files and is_generated_artifact_path(candidate.path):
+                            continue
+                        if not skip_auxiliary_files and is_auxiliary_reference_path(candidate.path):
+                            continue
+                        metadata = {
+                            **(candidate.metadata
or {}), + "feature_query_anchor": True, + "feature_query_hint": lexical_query, + "feature_query_hint_tokens": [], + "feature_query_seed_kind": "lexical_query", + } + anchor = candidate.model_copy( + deep=True, + update={"metadata": metadata}, + ) + existing = anchors_by_path.get(anchor.path) + if existing is None or float(anchor.score) > float(existing.score): + anchors_by_path[anchor.path] = anchor + if len(anchors_by_path) >= anchor_limit: + break + + return sorted( + anchors_by_path.values(), + key=lambda result: result.score, + reverse=True, + )[:anchor_limit] + + def _merge_query_feature_anchor_results( + self, + base_results: List[SearchResult], + anchor_results: List[SearchResult], + ) -> List[SearchResult]: + """Merge explicit feature anchors into coarse candidates with comparable scores.""" + if not anchor_results: + return sorted(base_results, key=lambda result: result.score, reverse=True) + + merged: Dict[str, SearchResult] = {result.path: result for result in base_results} + base_sorted = sorted(base_results, key=lambda result: result.score, reverse=True) + base_max = float(base_sorted[0].score) if base_sorted else 1.0 + if base_sorted: + cutoff_index = min(len(base_sorted) - 1, max(0, min(4, len(base_sorted) - 1))) + anchor_floor = float(base_sorted[cutoff_index].score) + else: + anchor_floor = base_max + if anchor_floor <= 0: + anchor_floor = max(base_max * 0.85, 0.01) + + for index, anchor in enumerate(anchor_results): + target_score = max( + anchor_floor, + base_max * max(0.75, 0.92 - (0.03 * index)), + 0.01, + ) + existing = merged.get(anchor.path) + existing_metadata = existing.metadata or {} if existing is not None else {} + metadata = { + **existing_metadata, + **(anchor.metadata or {}), + "feature_query_anchor": True, + } + if existing is not None: + target_score = max(float(existing.score), target_score) + merged[anchor.path] = existing.model_copy( + deep=True, + update={ + "score": target_score, + "metadata": metadata, + }, + ) + else: + 
merged[anchor.path] = anchor.model_copy( + deep=True, + update={ + "score": target_score, + "metadata": metadata, + }, + ) + + return sorted(merged.values(), key=lambda result: result.score, reverse=True) + + def _inject_query_feature_anchors( + self, + query: str, + source_path: Path, + options: SearchOptions, + base_results: List[SearchResult], + *, + limit: int, + ) -> List[SearchResult]: + """Inject explicit file/feature anchors into coarse candidate sets.""" + anchor_results = self._collect_query_feature_anchor_results( + query, + source_path, + options, + limit=limit, + ) + return self._merge_query_feature_anchor_results(base_results, anchor_results) + + @staticmethod + def _combine_stage3_anchor_results( + anchor_results: List[SearchResult], + clustered_results: List[SearchResult], + *, + target_count: int, + ) -> List[SearchResult]: + """Combine preserved query anchors with Stage 3 representatives.""" + if target_count <= 0: + return [] + merged: List[SearchResult] = [] + seen: set[tuple[str, Optional[str], Optional[int]]] = set() + for result in [*anchor_results, *clustered_results]: + key = (result.path, result.symbol_name, result.start_line) + if key in seen: + continue + seen.add(key) + merged.append(result) + if len(merged) >= target_count: + break + return merged + + def _select_stage3_query_anchor_results( + self, + query: str, + expanded_results: List[SearchResult], + *, + limit: int, + ) -> List[SearchResult]: + """Select a small number of explicit feature anchors to preserve through clustering.""" + if limit <= 0 or not expanded_results: + return [] + + ranked_results = self._apply_default_path_penalties(query, expanded_results) + anchors: List[SearchResult] = [] + seen: set[tuple[str, Optional[str], Optional[int]]] = set() + for result in ranked_results: + metadata = result.metadata or {} + if not metadata.get("feature_query_anchor"): + continue + key = (result.path, result.symbol_name, result.start_line) + if key in seen: + continue + 
seen.add(key) + anchors.append(result) + if len(anchors) >= limit: + break + return anchors + def _find_lsp_workspace_root(self, start_path: Path) -> Path: """Best-effort workspace root selection for LSP initialization. @@ -1851,6 +2060,7 @@ class ChainSearchEngine: self, expanded_results: List[SearchResult], target_count: int, + query: Optional[str] = None, ) -> List[SearchResult]: """Stage 3: Cluster expanded results and select representatives. @@ -1867,9 +2077,42 @@ class ChainSearchEngine: if not expanded_results: return [] + original_target_count = target_count + anchor_results: List[SearchResult] = [] + if query: + anchor_results = self._select_stage3_query_anchor_results( + query, + expanded_results, + limit=min(4, max(1, original_target_count // 4)), + ) + if anchor_results: + anchor_keys = { + (result.path, result.symbol_name, result.start_line) + for result in anchor_results + } + expanded_results = [ + result + for result in expanded_results + if (result.path, result.symbol_name, result.start_line) not in anchor_keys + ] + target_count = max(0, original_target_count - len(anchor_results)) + if target_count <= 0: + return anchor_results[:original_target_count] + + if not expanded_results: + return self._combine_stage3_anchor_results( + anchor_results, + [], + target_count=original_target_count, + ) + # If few results, skip clustering if len(expanded_results) <= target_count: - return expanded_results + return self._combine_stage3_anchor_results( + anchor_results, + expanded_results, + target_count=original_target_count, + ) strategy_name = "auto" if self._config is not None: @@ -1877,10 +2120,18 @@ class ChainSearchEngine: strategy_name = str(strategy_name).strip().lower() if strategy_name in {"noop", "none", "off"}: - return sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count], + 
target_count=original_target_count, + ) if strategy_name in {"score", "top", "rank"}: - return sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count], + target_count=original_target_count, + ) if strategy_name in {"path", "file"}: best_by_path: Dict[str, SearchResult] = {} @@ -1892,7 +2143,11 @@ class ChainSearchEngine: best_by_path[key] = r candidates = list(best_by_path.values()) or expanded_results candidates.sort(key=lambda r: r.score, reverse=True) - return candidates[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + candidates[:target_count], + target_count=original_target_count, + ) if strategy_name in {"dir_rr", "rr_dir", "round_robin_dir"}: results_sorted = sorted(expanded_results, key=lambda r: r.score, reverse=True) @@ -1920,7 +2175,11 @@ class ChainSearchEngine: break if not progressed: break - return out + return self._combine_stage3_anchor_results( + anchor_results, + out, + target_count=original_target_count, + ) try: from codexlens.search.clustering import ( @@ -1943,9 +2202,11 @@ class ChainSearchEngine: if embeddings is None or len(embeddings) == 0: # No embeddings available, fall back to score-based selection self.logger.debug("No embeddings for clustering, using score-based selection") - return sorted( - expanded_results, key=lambda r: r.score, reverse=True - )[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count], + target_count=original_target_count, + ) # Create clustering config config = ClusteringConfig( @@ -1972,18 +2233,26 @@ class ChainSearchEngine: remaining_sorted = sorted(remaining, key=lambda r: r.score, reverse=True) representatives.extend(remaining_sorted[:target_count - len(representatives)]) - return representatives[:target_count] + 
return self._combine_stage3_anchor_results( + anchor_results, + representatives[:target_count], + target_count=original_target_count, + ) except ImportError as exc: self.logger.debug("Clustering not available: %s", exc) - return sorted( - expanded_results, key=lambda r: r.score, reverse=True - )[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count], + target_count=original_target_count, + ) except Exception as exc: self.logger.debug("Stage 3 clustering failed: %s", exc) - return sorted( - expanded_results, key=lambda r: r.score, reverse=True - )[:target_count] + return self._combine_stage3_anchor_results( + anchor_results, + sorted(expanded_results, key=lambda r: r.score, reverse=True)[:target_count], + target_count=original_target_count, + ) def _stage4_optional_rerank( self, @@ -1998,16 +2267,21 @@ class ChainSearchEngine: Args: query: Search query string clustered_results: Results from Stage 3 clustering - k: Number of final results to return + k: Requested final result count before downstream path penalties Returns: - Reranked results sorted by cross-encoder score + Reranked results sorted by cross-encoder score. This can exceed the + requested final ``k`` so the caller can still demote noisy test or + generated hits before applying the final trim. 
""" if not clustered_results: return [] - # Use existing _cross_encoder_rerank method - return self._cross_encoder_rerank(query, clustered_results, k) + rerank_limit = self._resolve_rerank_candidate_limit( + k, + len(clustered_results), + ) + return self._cross_encoder_rerank(query, clustered_results, rerank_limit) def _get_embeddings_for_clustering( self, @@ -2159,74 +2433,15 @@ class ChainSearchEngine: stats=stats ) - # Initialize binary embedding backend - try: - from codexlens.indexing.embedding import BinaryEmbeddingBackend - except ImportError as exc: - self.logger.warning( - "BinaryEmbeddingBackend not available: %s, falling back to standard search", - exc - ) - return self.search(query, source_path, options=options) - # Step 4: Binary coarse search (same as binary_cascade_search) binary_coarse_time = time.time() - coarse_candidates: List[Tuple[int, int, Path]] = [] - - # Try centralized BinarySearcher first (preferred for mmap indexes) - # The index root is the parent of the first index path - index_root = index_paths[0].parent if index_paths else None - used_centralized = False - - if index_root: - binary_searcher = self._get_centralized_binary_searcher(index_root) - if binary_searcher is not None: - try: - # BinarySearcher expects dense vector, not packed binary - from codexlens.semantic.embedder import Embedder - embedder = Embedder() - query_dense = embedder.embed_to_numpy([query])[0] - - results = binary_searcher.search(query_dense, top_k=coarse_k) - for chunk_id, distance in results: - coarse_candidates.append((chunk_id, distance, index_root)) - # Only mark as used if we got actual results - if coarse_candidates: - used_centralized = True - self.logger.debug( - "Binary coarse search (centralized): %d candidates in %.2fms", - len(results), (time.time() - binary_coarse_time) * 1000 - ) - except Exception as exc: - self.logger.debug(f"Centralized binary search failed: {exc}") - - if not used_centralized: - # Get GPU preference from config - use_gpu = 
True - if self._config is not None: - use_gpu = getattr(self._config, "embedding_use_gpu", True) - - try: - binary_backend = BinaryEmbeddingBackend(use_gpu=use_gpu) - query_binary = binary_backend.embed_packed([query])[0] - except Exception as exc: - self.logger.warning(f"Failed to generate binary query embedding: {exc}") - return self.search(query, source_path, options=options) - - # Fallback to per-directory binary indexes - for index_path in index_paths: - try: - binary_index = self._get_or_create_binary_index(index_path) - if binary_index is None or binary_index.count() == 0: - continue - # BinaryANNIndex returns (ids, distances) arrays - ids, distances = binary_index.search(query_binary, coarse_k) - for chunk_id, dist in zip(ids, distances): - coarse_candidates.append((chunk_id, dist, index_path)) - except Exception as exc: - self.logger.debug( - "Binary search failed for %s: %s", index_path, exc - ) + coarse_candidates, _, _, stage2_index_root = self._collect_binary_coarse_candidates( + query, + index_paths, + coarse_k, + stats, + index_root=index_paths[0].parent if index_paths else None, + ) if not coarse_candidates: self.logger.info("No binary candidates found, falling back to standard search for reranking") @@ -2242,91 +2457,11 @@ class ChainSearchEngine: len(coarse_candidates), (time.time() - binary_coarse_time) * 1000 ) - # Step 5: Build SearchResult objects for cross-encoder reranking - # Group candidates by index path for efficient retrieval - candidates_by_index: Dict[Path, List[int]] = {} - for chunk_id, distance, index_path in coarse_candidates: - if index_path not in candidates_by_index: - candidates_by_index[index_path] = [] - candidates_by_index[index_path].append(chunk_id) - - # Retrieve chunk content for reranking - # Always use centralized VectorMetadataStore since chunks are stored there - import sqlite3 - coarse_results: List[SearchResult] = [] - - # Find the centralized metadata store path (project root) - # index_root was computed earlier, 
use it for chunk retrieval - central_meta_path = index_root / VECTORS_META_DB_NAME if index_root else None - central_meta_store = None - if central_meta_path and central_meta_path.exists(): - central_meta_store = VectorMetadataStore(central_meta_path) - - for index_path, chunk_ids in candidates_by_index.items(): - try: - chunks_data = [] - if central_meta_store: - # Try centralized VectorMetadataStore first (preferred) - chunks_data = central_meta_store.get_chunks_by_ids(chunk_ids) - - if not chunks_data and used_centralized: - # Fallback to per-index-path meta store - meta_db_path = index_path / VECTORS_META_DB_NAME - if meta_db_path.exists(): - meta_store = VectorMetadataStore(meta_db_path) - chunks_data = meta_store.get_chunks_by_ids(chunk_ids) - - if not chunks_data: - # Final fallback: query semantic_chunks table directly - # This handles per-directory indexes with semantic_chunks table - try: - conn = sqlite3.connect(str(index_path)) - conn.row_factory = sqlite3.Row - placeholders = ",".join("?" 
* len(chunk_ids)) - cursor = conn.execute( - f""" - SELECT id, file_path, content, metadata, category - FROM semantic_chunks - WHERE id IN ({placeholders}) - """, - chunk_ids - ) - chunks_data = [ - { - "id": row["id"], - "file_path": row["file_path"], - "content": row["content"], - "metadata": row["metadata"], - "category": row["category"], - } - for row in cursor.fetchall() - ] - conn.close() - except Exception: - pass # Skip if table doesn't exist - - for chunk in chunks_data: - # Find the Hamming distance for this chunk - chunk_id = chunk.get("id") or chunk.get("chunk_id") - distance = next( - (d for cid, d, _ in coarse_candidates if cid == chunk_id), - 256 - ) - # Initial score from Hamming distance (will be replaced by reranker) - score = 1.0 - (distance / 256.0) - - content = chunk.get("content", "") - result = SearchResult( - path=chunk.get("file_path", ""), - score=float(score), - excerpt=content[:500] if content else "", - content=content, - ) - coarse_results.append(result) - except Exception as exc: - self.logger.debug( - "Failed to retrieve chunks from %s: %s", index_path, exc - ) + coarse_results = self._materialize_binary_candidates( + coarse_candidates, + stats, + stage2_index_root=stage2_index_root, + ) if not coarse_results: stats.time_ms = (time.time() - start_time) * 1000 @@ -2334,13 +2469,26 @@ class ChainSearchEngine: query=query, results=[], symbols=[], stats=stats ) + coarse_results = self._inject_query_feature_anchors( + query, + source_path, + options, + coarse_results, + limit=min(6, max(2, k)), + ) + self.logger.debug( "Retrieved %d chunks for cross-encoder reranking", len(coarse_results) ) # Step 6: Cross-encoder reranking rerank_time = time.time() - reranked_results = self._cross_encoder_rerank(query, coarse_results, top_k=k) + rerank_limit = self._resolve_rerank_candidate_limit(k, len(coarse_results)) + reranked_results = self._cross_encoder_rerank( + query, + coarse_results, + top_k=rerank_limit, + ) self.logger.debug( "Cross-encoder 
reranking: %d results in %.2fms", @@ -2353,7 +2501,10 @@ class ChainSearchEngine: if result.path not in path_to_result or result.score > path_to_result[result.path].score: path_to_result[result.path] = result - final_results = list(path_to_result.values())[:k] + final_results = self._apply_default_path_penalties( + query, + list(path_to_result.values()), + )[:k] stats.files_matched = len(final_results) stats.time_ms = (time.time() - start_time) * 1000 @@ -2399,13 +2550,48 @@ class ChainSearchEngine: Returns: ChainSearchResult with cross-encoder reranked results and statistics """ + options = options or SearchOptions() + + if query_prefers_lexical_search(query): + self.logger.debug( + "Dense rerank shortcut: using lexical search for lexical-priority query %r", + query, + ) + lexical_options = SearchOptions( + depth=options.depth, + max_workers=options.max_workers, + limit_per_dir=max(options.limit_per_dir, max(10, k)), + total_limit=max(options.total_limit, max(20, k * 4)), + offset=options.offset, + include_symbols=False, + files_only=options.files_only, + include_semantic=False, + code_only=options.code_only, + exclude_extensions=list(options.exclude_extensions or []), + hybrid_mode=False, + enable_fuzzy=True, + enable_vector=False, + pure_vector=False, + enable_cascade=False, + hybrid_weights=None, + group_results=options.group_results, + grouping_threshold=options.grouping_threshold, + inject_feature_anchors=options.inject_feature_anchors, + ) + lexical_result = self.search(query, source_path, options=lexical_options) + return ChainSearchResult( + query=query, + results=lexical_result.results, + related_results=lexical_result.related_results, + symbols=[], + stats=lexical_result.stats, + ) + if not NUMPY_AVAILABLE: self.logger.warning( "NumPy not available, falling back to standard search" ) return self.search(query, source_path, options=options) - - options = options or SearchOptions() start_time = time.time() stats = SearchStats() @@ -2442,130 +2628,115 @@ 
class ChainSearchEngine: stats=stats ) - # Step 3: Find centralized HNSW index and read model config - from codexlens.config import VECTORS_HNSW_NAME - central_hnsw_path = None - index_root = start_index.parent - current_dir = index_root - for _ in range(10): # Limit search depth - candidate = current_dir / VECTORS_HNSW_NAME - if candidate.exists(): - central_hnsw_path = candidate - index_root = current_dir # Update to where HNSW was found - break - parent = current_dir.parent - if parent == current_dir: # Reached root - break - current_dir = parent - - # Step 4: Generate query dense embedding using same model as centralized index - # Read embedding config to match the model used during indexing + # Step 3-5: Group child indexes by centralized dense vector root and search each root. dense_coarse_time = time.time() - try: - from codexlens.semantic.factory import get_embedder - - # Get embedding settings from centralized index config (preferred) or fallback to self._config - embedding_backend = "litellm" # Default to API for dense - embedding_model = "qwen3-embedding-sf" # Default model - use_gpu = True - - # Try to read model config from centralized index's embeddings_config table - central_index_db = index_root / "_index.db" - if central_index_db.exists(): - try: - from codexlens.semantic.vector_store import VectorStore - with VectorStore(central_index_db) as vs: - model_config = vs.get_model_config() - if model_config: - embedding_backend = model_config.get("backend", embedding_backend) - embedding_model = model_config.get("model_name", embedding_model) - self.logger.debug( - "Read model config from centralized index: %s/%s", - embedding_backend, embedding_model - ) - except Exception as e: - self.logger.debug("Failed to read centralized model config: %s", e) - - # Fallback to self._config if not read from index - if self._config is not None: - if embedding_backend == "litellm" and embedding_model == "qwen3-embedding-sf": - # Only use config values if we didn't 
read from centralized index - config_backend = getattr(self._config, "embedding_backend", None) - config_model = getattr(self._config, "embedding_model", None) - if config_backend: - embedding_backend = config_backend - if config_model: - embedding_model = config_model - use_gpu = getattr(self._config, "embedding_use_gpu", True) - - # Create embedder matching index configuration - if embedding_backend == "litellm": - embedder = get_embedder(backend="litellm", model=embedding_model) - else: - embedder = get_embedder(backend="fastembed", profile=embedding_model, use_gpu=use_gpu) - - query_dense = embedder.embed_to_numpy([query])[0] - self.logger.debug(f"Dense query embedding: {query_dense.shape[0]}-dim via {embedding_backend}/{embedding_model}") - except Exception as exc: - self.logger.warning(f"Failed to generate dense query embedding: {exc}") - return self.search(query, source_path, options=options) - - # Step 5: Dense coarse search using centralized HNSW index coarse_candidates: List[Tuple[int, float, Path]] = [] # (chunk_id, distance, index_path) + central_index_roots: Dict[Path, Path] = {} + dense_root_groups, dense_fallback_index_paths = self._group_index_paths_by_dense_root(index_paths) + dense_query_cache: Dict[Tuple[str, str, bool], "np.ndarray"] = {} + try: + from codexlens.semantic.ann_index import ANNIndex - if central_hnsw_path is not None: - # Use centralized index - try: - from codexlens.semantic.ann_index import ANNIndex - ann_index = ANNIndex.create_central( - index_root=index_root, - dim=query_dense.shape[0], - ) - if ann_index.load() and ann_index.count() > 0: - # Search centralized HNSW index - ids, distances = ann_index.search(query_dense, top_k=coarse_k) - for chunk_id, dist in zip(ids, distances): - coarse_candidates.append((chunk_id, dist, index_root / "_index.db")) - self.logger.debug( - "Centralized dense search: %d candidates from %s", - len(ids), central_hnsw_path - ) - except Exception as exc: + dense_candidate_groups: 
List[List[Tuple[int, float, Path]]] = [] + dense_roots_by_settings = self._group_dense_roots_by_embedding_settings( + dense_root_groups + ) + if len(dense_roots_by_settings) > 1: self.logger.debug( - "Centralized dense search failed for %s: %s", central_hnsw_path, exc + "Dense coarse search detected %d embedding setting groups; interleaving candidates across groups", + len(dense_roots_by_settings), ) - # Fallback: try per-directory HNSW indexes if centralized not found - if not coarse_candidates: - for index_path in index_paths: - try: - # Load HNSW index - from codexlens.semantic.ann_index import ANNIndex - ann_index = ANNIndex(index_path, dim=query_dense.shape[0]) - if not ann_index.load(): - continue + for dense_roots in dense_roots_by_settings.values(): + group_candidates: List[Tuple[int, float, Path]] = [] + for dense_root in dense_roots: + try: + query_dense = self._embed_dense_query( + query, + index_root=dense_root, + query_cache=dense_query_cache, + ) + ann_index = self._get_cached_centralized_dense_index( + dense_root, + int(query_dense.shape[0]), + ) + if ann_index is None: + continue - if ann_index.count() == 0: - continue + ids, distances = ann_index.search(query_dense, top_k=coarse_k) + central_index_db = dense_root / "_index.db" + central_index_roots[central_index_db] = dense_root + for chunk_id, dist in zip(ids, distances): + group_candidates.append((chunk_id, dist, central_index_db)) + if ids: + self.logger.debug( + "Centralized dense search: %d candidates from %s", + len(ids), + dense_root / VECTORS_HNSW_NAME, + ) + except Exception as exc: + self.logger.debug( + "Centralized dense search failed for %s: %s", + dense_root, + exc, + ) + if group_candidates: + dense_candidate_groups.append(group_candidates) - # Search HNSW index - ids, distances = ann_index.search(query_dense, top_k=coarse_k) - for chunk_id, dist in zip(ids, distances): - coarse_candidates.append((chunk_id, dist, index_path)) + coarse_candidates = 
self._interleave_dense_candidate_groups( + dense_candidate_groups, + coarse_k, + ) - except Exception as exc: + if not coarse_candidates: + fallback_index_paths = dense_fallback_index_paths if dense_root_groups else index_paths + fallback_candidate_groups: List[List[Tuple[int, float, Path]]] = [] + fallback_index_groups = self._group_dense_index_paths_by_embedding_settings( + fallback_index_paths + ) + if len(fallback_index_groups) > 1: self.logger.debug( - "Dense search failed for %s: %s", index_path, exc + "Legacy dense fallback detected %d embedding setting groups; interleaving candidates across groups", + len(fallback_index_groups), ) + for grouped_index_paths in fallback_index_groups.values(): + group_candidates: List[Tuple[int, float, Path]] = [] + for index_path in grouped_index_paths: + try: + query_dense = self._embed_dense_query( + query, + index_root=index_path.parent, + query_cache=dense_query_cache, + ) + ann_index = self._get_cached_legacy_dense_index( + index_path, + int(query_dense.shape[0]), + ) + if ann_index is None: + continue + + ids, distances = ann_index.search(query_dense, top_k=coarse_k) + for chunk_id, dist in zip(ids, distances): + group_candidates.append((chunk_id, dist, index_path)) + except Exception as exc: + self.logger.debug( + "Dense search failed for %s: %s", index_path, exc + ) + if group_candidates: + fallback_candidate_groups.append(group_candidates) + + coarse_candidates = self._interleave_dense_candidate_groups( + fallback_candidate_groups, + coarse_k, + ) + except Exception as exc: + self.logger.warning(f"Failed to prepare dense coarse search: {exc}") + return self.search(query, source_path, options=options) if not coarse_candidates: self.logger.info("No dense candidates found, falling back to standard search") return self.search(query, source_path, options=options) - # Sort by distance (ascending for cosine distance) and take top coarse_k - coarse_candidates.sort(key=lambda x: x[1]) - coarse_candidates = 
coarse_candidates[:coarse_k] - self.logger.debug( "Dense coarse search: %d candidates in %.2fms", len(coarse_candidates), (time.time() - dense_coarse_time) * 1000 @@ -2584,11 +2755,10 @@ class ChainSearchEngine: for index_path, chunk_ids in candidates_by_index.items(): try: - # For centralized index, use _vectors_meta.db for chunk metadata - # which contains file_path, content, start_line, end_line - if central_hnsw_path is not None and index_path == index_root / "_index.db": + central_root = central_index_roots.get(index_path) + if central_root is not None: # Use centralized metadata from _vectors_meta.db - meta_db_path = index_root / "_vectors_meta.db" + meta_db_path = central_root / "_vectors_meta.db" if meta_db_path.exists(): conn = sqlite3.connect(str(meta_db_path)) conn.row_factory = sqlite3.Row @@ -2645,7 +2815,11 @@ class ChainSearchEngine: for chunk in chunks_data: chunk_id = chunk.get("id") distance = next( - (d for cid, d, _ in coarse_candidates if cid == chunk_id), + ( + d + for cid, d, candidate_index_path in coarse_candidates + if cid == chunk_id and candidate_index_path == index_path + ), 1.0 ) # Convert cosine distance to score (clamp to [0, 1] for Pydantic validation) @@ -2671,13 +2845,26 @@ class ChainSearchEngine: query=query, results=[], symbols=[], stats=stats ) + coarse_results = self._inject_query_feature_anchors( + query, + source_path, + options, + coarse_results, + limit=min(6, max(2, k)), + ) + self.logger.debug( "Retrieved %d chunks for cross-encoder reranking", len(coarse_results) ) # Step 6: Cross-encoder reranking rerank_time = time.time() - reranked_results = self._cross_encoder_rerank(query, coarse_results, top_k=k) + rerank_limit = self._resolve_rerank_candidate_limit(k, len(coarse_results)) + reranked_results = self._cross_encoder_rerank( + query, + coarse_results, + top_k=rerank_limit, + ) self.logger.debug( "Cross-encoder reranking: %d results in %.2fms", @@ -2690,7 +2877,10 @@ class ChainSearchEngine: if result.path not in 
path_to_result or result.score > path_to_result[result.path].score: path_to_result[result.path] = result - final_results = list(path_to_result.values())[:k] + final_results = self._apply_default_path_penalties( + query, + list(path_to_result.values()), + )[:k] stats.files_matched = len(final_results) stats.time_ms = (time.time() - start_time) * 1000 @@ -2792,6 +2982,630 @@ class ChainSearchEngine: return Path(index_root).resolve() + def _find_nearest_dense_hnsw_root( + self, + index_root: Path, + *, + max_levels: int = 10, + ) -> Optional[Path]: + """Walk up index_root parents to find the nearest centralized dense HNSW root.""" + + current_dir = Path(index_root).resolve() + for _ in range(max(0, int(max_levels)) + 1): + try: + if (current_dir / VECTORS_HNSW_NAME).exists(): + return current_dir + except Exception: + return None + + parent = current_dir.parent + if parent == current_dir: + break + current_dir = parent + + return None + + def _group_index_paths_by_binary_root( + self, + index_paths: List[Path], + *, + preferred_root: Optional[Path] = None, + ) -> Tuple[List[Path], List[Path]]: + """Group collected indexes by centralized binary mmap root.""" + + grouped: Dict[Path, List[Path]] = {} + ungrouped: List[Path] = [] + preferred_root = ( + Path(preferred_root).resolve() + if preferred_root is not None + else None + ) + + for index_path in index_paths: + candidate_roots: List[Path] = [index_path.parent] + if preferred_root is not None and preferred_root != index_path.parent: + candidate_roots.append(preferred_root) + + resolved_root: Optional[Path] = None + for candidate_root in candidate_roots: + found_root = self._find_nearest_binary_mmap_root(candidate_root) + if (found_root / BINARY_VECTORS_MMAP_NAME).exists(): + resolved_root = found_root + break + + if resolved_root is None: + ungrouped.append(index_path) + continue + + grouped.setdefault(resolved_root, []).append(index_path) + + return [root for root in grouped if grouped[root]], ungrouped + + def 
_group_index_paths_by_dense_root( + self, + index_paths: List[Path], + ) -> Tuple[List[Path], List[Path]]: + """Group collected indexes by centralized dense HNSW root.""" + + grouped: Dict[Path, List[Path]] = {} + ungrouped: List[Path] = [] + + for index_path in index_paths: + dense_root = self._find_nearest_dense_hnsw_root(index_path.parent) + if dense_root is None: + ungrouped.append(index_path) + continue + grouped.setdefault(dense_root, []).append(index_path) + + return [root for root in grouped if grouped[root]], ungrouped + + def _group_dense_roots_by_embedding_settings( + self, + dense_roots: List[Path], + ) -> Dict[Tuple[str, str, bool], List[Path]]: + """Group dense roots by the embedding settings used to build them.""" + grouped: Dict[Tuple[str, str, bool], List[Path]] = {} + for dense_root in dense_roots: + settings = self._resolve_dense_embedding_settings(index_root=dense_root) + grouped.setdefault(settings, []).append(dense_root) + return grouped + + def _group_dense_index_paths_by_embedding_settings( + self, + index_paths: List[Path], + ) -> Dict[Tuple[str, str, bool], List[Path]]: + """Group legacy dense ANN indexes by the embedding settings used to query them.""" + grouped: Dict[Tuple[str, str, bool], List[Path]] = {} + for index_path in index_paths: + settings = self._resolve_dense_embedding_settings( + index_root=index_path.parent, + ) + grouped.setdefault(settings, []).append(index_path) + return grouped + + @staticmethod + def _interleave_dense_candidate_groups( + candidate_groups: List[List[Tuple[int, float, Path]]], + limit: int, + ) -> List[Tuple[int, float, Path]]: + """Interleave locally ranked dense candidates from mixed embedding groups.""" + if limit <= 0: + return [] + + ordered_groups = [ + sorted(group, key=lambda item: item[1]) + for group in candidate_groups + if group + ] + if not ordered_groups: + return [] + if len(ordered_groups) == 1: + return ordered_groups[0][:limit] + + merged: List[Tuple[int, float, Path]] = [] + offsets = 
[0 for _ in ordered_groups] + while len(merged) < limit: + made_progress = False + for group_index, group in enumerate(ordered_groups): + offset = offsets[group_index] + if offset >= len(group): + continue + merged.append(group[offset]) + offsets[group_index] += 1 + made_progress = True + if len(merged) >= limit: + break + if not made_progress: + break + return merged + + def _resolve_dense_embedding_settings( + self, + *, + index_root: Optional[Path], + ) -> Tuple[str, str, bool]: + """Resolve embedding backend/profile for a dense vector root.""" + + embedding_backend = "litellm" + embedding_model = "qwen3-embedding-sf" + use_gpu = True + loaded_from_root = False + + if index_root is not None: + central_index_db = index_root / "_index.db" + if central_index_db.exists(): + try: + from codexlens.semantic.vector_store import VectorStore + + with VectorStore(central_index_db) as vs: + model_config = vs.get_model_config() + if model_config: + embedding_backend = str( + model_config.get("backend", embedding_backend) + ) + if embedding_backend == "litellm": + embedding_model = str( + model_config.get("model_name", embedding_model) + ) + else: + embedding_model = str( + model_config.get( + "model_profile", + model_config.get("model_name", embedding_model), + ) + ) + loaded_from_root = True + except Exception as exc: + self.logger.debug( + "Failed to read dense embedding config from %s: %s", + central_index_db, + exc, + ) + + if self._config is not None: + if not loaded_from_root: + config_backend = getattr(self._config, "embedding_backend", None) + config_model = getattr(self._config, "embedding_model", None) + if config_backend: + embedding_backend = str(config_backend) + if config_model: + embedding_model = str(config_model) + use_gpu = bool(getattr(self._config, "embedding_use_gpu", True)) + + return embedding_backend, embedding_model, use_gpu + + def _embed_dense_query( + self, + query: str, + *, + index_root: Optional[Path], + query_cache: Optional[Dict[Tuple[str, 
str, bool], "np.ndarray"]] = None, + ) -> "np.ndarray": + """Embed a query using the model configuration associated with a dense root.""" + + from codexlens.semantic.factory import get_embedder + + embedding_backend, embedding_model, use_gpu = self._resolve_dense_embedding_settings( + index_root=index_root, + ) + cache_key = (embedding_backend, embedding_model, use_gpu) + if query_cache is not None and cache_key in query_cache: + return query_cache[cache_key] + + if embedding_backend == "litellm": + embedder = get_embedder(backend="litellm", model=embedding_model) + else: + embedder = get_embedder( + backend="fastembed", + profile=embedding_model, + use_gpu=use_gpu, + ) + + query_dense = embedder.embed_to_numpy([query])[0] + if query_cache is not None: + query_cache[cache_key] = query_dense + + self.logger.debug( + "Dense query embedding: %d-dim via %s/%s", + int(query_dense.shape[0]), + embedding_backend, + embedding_model, + ) + return query_dense + + def _embed_query_for_binary_searcher( + self, + query: str, + *, + binary_searcher: Any, + query_cache: Optional[Dict[Tuple[str, str, bool], "np.ndarray"]] = None, + ) -> "np.ndarray": + """Embed a query using the model configuration exposed by BinarySearcher.""" + + use_gpu = True + if self._config is not None: + use_gpu = getattr(self._config, "embedding_use_gpu", True) + + query_dense = None + backend = getattr(binary_searcher, "backend", None) + model = getattr(binary_searcher, "model", None) + profile = getattr(binary_searcher, "model_profile", None) or "code" + cache_key = ( + str(backend or "fastembed"), + str(model or profile), + bool(use_gpu), + ) + + if query_cache is not None and cache_key in query_cache: + return query_cache[cache_key] + + if backend == "litellm": + try: + from codexlens.semantic.factory import get_embedder as get_factory_embedder + + embedder = get_factory_embedder( + backend="litellm", + model=model or "code", + ) + query_dense = embedder.embed_to_numpy([query])[0] + except Exception: 
+ query_dense = None + + if query_dense is None: + from codexlens.semantic.embedder import get_embedder + + embedder = get_embedder(profile=str(profile), use_gpu=use_gpu) + query_dense = embedder.embed_to_numpy([query])[0] + + if query_cache is not None: + query_cache[cache_key] = query_dense + + return query_dense + + def _collect_binary_coarse_candidates( + self, + query: str, + index_paths: List[Path], + coarse_k: int, + stats: SearchStats, + *, + index_root: Optional[Path] = None, + allow_dense_fallback: bool = False, + ) -> Tuple[List[Tuple[int, float, Path]], bool, bool, Optional[Path]]: + """Collect coarse candidates from centralized/legacy binary indexes.""" + + try: + from codexlens.indexing.embedding import BinaryEmbeddingBackend + except ImportError as exc: + self.logger.warning( + "BinaryEmbeddingBackend not available: %s", exc + ) + return [], False, False, None + + requested_index_root = ( + Path(index_root).resolve() + if index_root is not None + else (index_paths[0].parent if index_paths else None) + ) + coarse_candidates: List[Tuple[int, float, Path]] = [] + used_centralized = False + using_dense_fallback = False + dense_query_cache: Dict[Tuple[str, str, bool], "np.ndarray"] = {} + binary_roots_with_hits: set[Path] = set() + stage2_index_root: Optional[Path] = None + + binary_root_groups, _ = self._group_index_paths_by_binary_root( + index_paths, + preferred_root=requested_index_root, + ) + for binary_root in binary_root_groups: + binary_searcher = self._get_centralized_binary_searcher(binary_root) + if binary_searcher is None: + continue + try: + query_dense = self._embed_query_for_binary_searcher( + query, + binary_searcher=binary_searcher, + query_cache=dense_query_cache, + ) + results = binary_searcher.search(query_dense, top_k=coarse_k) + for chunk_id, distance in results: + coarse_candidates.append((chunk_id, float(distance), binary_root)) + if results: + used_centralized = True + binary_roots_with_hits.add(binary_root) + self.logger.debug( + 
"Centralized binary search found %d candidates from %s", + len(results), + binary_root, + ) + except Exception as exc: + self.logger.debug( + "Centralized binary search failed for %s: %s", + binary_root, + exc, + ) + + if len(binary_roots_with_hits) == 1: + stage2_index_root = next(iter(binary_roots_with_hits)) + + if not used_centralized: + has_legacy_binary_vectors = any( + (p.parent / f"{p.stem}_binary_vectors.bin").exists() for p in index_paths + ) + if has_legacy_binary_vectors: + use_gpu = True + if self._config is not None: + use_gpu = getattr(self._config, "embedding_use_gpu", True) + + query_binary = None + try: + binary_backend = BinaryEmbeddingBackend(use_gpu=use_gpu) + query_binary = binary_backend.embed_packed([query])[0] + except Exception as exc: + self.logger.warning(f"Failed to generate binary query embedding: {exc}") + query_binary = None + + if query_binary is not None: + for index_path in index_paths: + try: + binary_index = self._get_or_create_binary_index(index_path) + if binary_index is None or binary_index.count() == 0: + continue + ids, distances = binary_index.search(query_binary, coarse_k) + for chunk_id, dist in zip(ids, distances): + coarse_candidates.append((chunk_id, float(dist), index_path)) + except Exception as exc: + self.logger.debug( + "Binary search failed for %s: %s", index_path, exc + ) + stats.errors.append( + f"Binary search failed for {index_path}: {exc}" + ) + else: + self.logger.debug( + "No legacy binary vector files found; skipping legacy binary search fallback" + ) + + if not coarse_candidates and allow_dense_fallback: + dense_candidates: List[Tuple[int, float, Path]] = [] + dense_roots_with_hits: set[Path] = set() + try: + from codexlens.semantic.ann_index import ANNIndex + + dense_root_groups, dense_fallback_index_paths = self._group_index_paths_by_dense_root(index_paths) + dense_candidate_groups: List[List[Tuple[int, float, Path]]] = [] + dense_roots_by_settings = self._group_dense_roots_by_embedding_settings( + 
dense_root_groups + ) + if len(dense_roots_by_settings) > 1: + self.logger.debug( + "Stage 1 dense fallback detected %d embedding setting groups; interleaving candidates across groups", + len(dense_roots_by_settings), + ) + for dense_roots in dense_roots_by_settings.values(): + group_candidates: List[Tuple[int, float, Path]] = [] + for dense_root in dense_roots: + try: + query_dense = self._embed_dense_query( + query, + index_root=dense_root, + query_cache=dense_query_cache, + ) + ann_index = self._get_cached_centralized_dense_index( + dense_root, + int(query_dense.shape[0]), + ) + if ann_index is None: + continue + ids, distances = ann_index.search(query_dense, top_k=coarse_k) + for chunk_id, dist in zip(ids, distances): + group_candidates.append((chunk_id, float(dist), dense_root)) + if ids: + dense_roots_with_hits.add(dense_root) + self.logger.debug( + "Stage 1 centralized dense fallback: %d candidates from %s", + len(ids), + dense_root, + ) + except Exception as exc: + self.logger.debug( + "Dense coarse search failed for %s: %s", + dense_root, + exc, + ) + if group_candidates: + dense_candidate_groups.append(group_candidates) + + dense_candidates = self._interleave_dense_candidate_groups( + dense_candidate_groups, + coarse_k, + ) + + fallback_index_paths = dense_fallback_index_paths if dense_root_groups else index_paths + if not dense_candidates: + fallback_candidate_groups: List[List[Tuple[int, float, Path]]] = [] + fallback_index_groups = self._group_dense_index_paths_by_embedding_settings( + fallback_index_paths + ) + if len(fallback_index_groups) > 1: + self.logger.debug( + "Stage 1 legacy dense fallback detected %d embedding setting groups; interleaving candidates across groups", + len(fallback_index_groups), + ) + for grouped_index_paths in fallback_index_groups.values(): + group_candidates = [] + for index_path in grouped_index_paths: + try: + query_dense = self._embed_dense_query( + query, + index_root=index_path.parent, + query_cache=dense_query_cache, 
+ ) + ann_index = self._get_cached_legacy_dense_index( + index_path, + int(query_dense.shape[0]), + ) + if ann_index is None: + continue + ids, distances = ann_index.search(query_dense, top_k=coarse_k) + for chunk_id, dist in zip(ids, distances): + group_candidates.append((chunk_id, float(dist), index_path)) + except Exception as exc: + self.logger.debug( + "Dense coarse search failed for %s: %s", index_path, exc + ) + if group_candidates: + fallback_candidate_groups.append(group_candidates) + + dense_candidates = self._interleave_dense_candidate_groups( + fallback_candidate_groups, + coarse_k, + ) + except Exception as exc: + self.logger.debug("Dense coarse search fallback unavailable: %s", exc) + dense_candidates = [] + + if dense_candidates: + if stage2_index_root is None and len(dense_roots_with_hits) == 1: + stage2_index_root = next(iter(dense_roots_with_hits)) + coarse_candidates = dense_candidates + using_dense_fallback = True + + if coarse_candidates: + if using_dense_fallback: + coarse_candidates = coarse_candidates[:coarse_k] + else: + coarse_candidates.sort(key=lambda x: x[1]) + coarse_candidates = coarse_candidates[:coarse_k] + + return coarse_candidates, used_centralized, using_dense_fallback, stage2_index_root + + def _materialize_binary_candidates( + self, + coarse_candidates: List[Tuple[int, float, Path]], + stats: SearchStats, + *, + stage2_index_root: Optional[Path] = None, + using_dense_fallback: bool = False, + ) -> List[SearchResult]: + """Fetch chunk payloads for coarse binary/dense-fallback candidates.""" + + if not coarse_candidates: + return [] + + coarse_results: List[Tuple[int, SearchResult]] = [] + candidates_by_index: Dict[Path, List[int]] = {} + candidate_order: Dict[Tuple[Path, int], int] = {} + for chunk_id, _, idx_path in coarse_candidates: + if idx_path not in candidates_by_index: + candidates_by_index[idx_path] = [] + candidates_by_index[idx_path].append(chunk_id) + candidate_order.setdefault((idx_path, int(chunk_id)), 
len(candidate_order)) + + import sqlite3 + + central_meta_store = None + central_meta_path = stage2_index_root / VECTORS_META_DB_NAME if stage2_index_root else None + if central_meta_path and central_meta_path.exists(): + central_meta_store = VectorMetadataStore(central_meta_path) + + for idx_path, chunk_ids in candidates_by_index.items(): + try: + chunks_data = [] + if central_meta_store is not None and stage2_index_root is not None and idx_path == stage2_index_root: + chunks_data = central_meta_store.get_chunks_by_ids(chunk_ids) + + if not chunks_data and idx_path.name != "_index.db": + meta_db_path = idx_path / VECTORS_META_DB_NAME + if meta_db_path.exists(): + meta_store = VectorMetadataStore(meta_db_path) + chunks_data = meta_store.get_chunks_by_ids(chunk_ids) + + if not chunks_data: + try: + conn = sqlite3.connect(str(idx_path)) + conn.row_factory = sqlite3.Row + placeholders = ",".join("?" * len(chunk_ids)) + cursor = conn.execute( + f""" + SELECT id, file_path, content, metadata, category + FROM semantic_chunks + WHERE id IN ({placeholders}) + """, + chunk_ids, + ) + chunks_data = [ + { + "id": row["id"], + "file_path": row["file_path"], + "content": row["content"], + "metadata": row["metadata"], + "category": row["category"], + } + for row in cursor.fetchall() + ] + conn.close() + except Exception: + chunks_data = [] + + for chunk in chunks_data: + chunk_id = chunk.get("id") or chunk.get("chunk_id") + distance = next( + ( + d + for cid, d, candidate_idx_path in coarse_candidates + if cid == chunk_id and candidate_idx_path == idx_path + ), + 256, + ) + if using_dense_fallback: + score = max(0.0, 1.0 - float(distance)) + else: + score = 1.0 - (float(distance) / 256.0) + + content = chunk.get("content", "") + metadata = chunk.get("metadata") + symbol_name = None + symbol_kind = None + start_line = chunk.get("start_line") + end_line = chunk.get("end_line") + if metadata: + try: + meta_dict = json.loads(metadata) if isinstance(metadata, str) else metadata + 
symbol_name = meta_dict.get("symbol_name") + symbol_kind = meta_dict.get("symbol_kind") + start_line = meta_dict.get("start_line", start_line) + end_line = meta_dict.get("end_line", end_line) + except Exception: + pass + + coarse_results.append( + ( + candidate_order.get((idx_path, int(chunk_id)), len(candidate_order)), + SearchResult( + path=chunk.get("file_path", ""), + score=float(score), + excerpt=content[:500] if content else "", + content=content, + symbol_name=symbol_name, + symbol_kind=symbol_kind, + start_line=start_line, + end_line=end_line, + ), + ) + ) + except Exception as exc: + self.logger.debug( + "Failed to retrieve chunks from %s: %s", idx_path, exc + ) + stats.errors.append( + f"Stage 1 chunk retrieval failed for {idx_path}: {exc}" + ) + + coarse_results.sort(key=lambda item: item[0]) + return [result for _, result in coarse_results] + def _compute_cosine_similarity( self, query_vec: "np.ndarray", @@ -2996,46 +3810,28 @@ class ChainSearchEngine: if not results: return [] - # Try to get reranker from config or create new one - reranker = None - try: - from codexlens.semantic.reranker import ( - check_reranker_available, - get_reranker, + # Collapse duplicate chunks from the same file before reranking. + # Otherwise, untouched tail chunks can overwrite reranked chunks for the + # same path during the later path-level deduplication step. 
+ path_to_result: Dict[str, SearchResult] = {} + for result in results: + path = result.path + if path not in path_to_result or result.score > path_to_result[path].score: + path_to_result[path] = result + if len(path_to_result) != len(results): + self.logger.debug( + "Deduplicated rerank candidates by path: %d -> %d", + len(results), + len(path_to_result), ) + results = sorted( + path_to_result.values(), + key=lambda item: float(item.score), + reverse=True, + ) - # Determine backend and model from config - backend = "onnx" - model_name = None - use_gpu = True - - if self._config is not None: - backend = getattr(self._config, "reranker_backend", "onnx") or "onnx" - model_name = getattr(self._config, "reranker_model", None) - use_gpu = getattr(self._config, "embedding_use_gpu", True) - - ok, err = check_reranker_available(backend) - if not ok: - self.logger.debug("Reranker backend unavailable (%s): %s", backend, err) - return results[:top_k] - - # Create reranker - kwargs = {} - if backend == "onnx": - kwargs["use_gpu"] = use_gpu - elif backend == "api": - # Pass max_input_tokens for adaptive batching - max_tokens = getattr(self._config, "reranker_max_input_tokens", None) - if max_tokens: - kwargs["max_input_tokens"] = max_tokens - - reranker = get_reranker(backend=backend, model_name=model_name, **kwargs) - - except ImportError as exc: - self.logger.debug("Reranker not available: %s", exc) - return results[:top_k] - except Exception as exc: - self.logger.debug("Failed to initialize reranker: %s", exc) + reranker = self._get_cached_reranker() + if reranker is None: return results[:top_k] # Use cross_encoder_rerank from ranking module @@ -3475,6 +4271,11 @@ class ChainSearchEngine: """ collected = [] visited = set() + scan_root = start_index.parent.resolve() + try: + scan_source_root = self.mapper.index_to_source(start_index) + except Exception: + scan_source_root = None def _collect_recursive(index_path: Path, current_depth: int): # Normalize path to avoid duplicates 
@@ -3483,6 +4284,10 @@ class ChainSearchEngine: return visited.add(normalized) + if is_ignored_index_path(normalized, scan_root): + self.logger.debug("Skipping ignored artifact index subtree: %s", normalized) + return + # Add current index if normalized.exists(): collected.append(normalized) @@ -3504,6 +4309,33 @@ class ChainSearchEngine: self.logger.warning(f"Failed to read subdirs from {normalized}: {exc}") _collect_recursive(start_index, 0) + + if scan_source_root is not None: + try: + descendant_roots = self.registry.find_descendant_project_roots( + scan_source_root + ) + except Exception as exc: + descendant_roots = [] + self.logger.debug( + "Failed to query descendant project roots for %s: %s", + scan_source_root, + exc, + ) + + for mapping in descendant_roots: + try: + relative_depth = len( + mapping.source_path.resolve().relative_to( + scan_source_root.resolve() + ).parts + ) + except ValueError: + continue + if depth >= 0 and relative_depth > depth: + continue + _collect_recursive(mapping.index_path, relative_depth) + self.logger.info(f"Collected {len(collected)} indexes (depth={depth})") return collected @@ -3548,6 +4380,13 @@ class ChainSearchEngine: except Exception: pass # Ignore pre-load failures + shared_hybrid_engine = None + if options.hybrid_mode: + shared_hybrid_engine = HybridSearchEngine( + weights=options.hybrid_weights, + config=self._config, + ) + executor = self._get_executor(effective_workers) # Submit all search tasks future_to_path = { @@ -3562,7 +4401,8 @@ class ChainSearchEngine: options.enable_fuzzy, options.enable_vector, options.pure_vector, - options.hybrid_weights + options.hybrid_weights, + shared_hybrid_engine, ): idx_path for idx_path in index_paths } @@ -3590,7 +4430,8 @@ class ChainSearchEngine: enable_fuzzy: bool = True, enable_vector: bool = False, pure_vector: bool = False, - hybrid_weights: Optional[Dict[str, float]] = None) -> List[SearchResult]: + hybrid_weights: Optional[Dict[str, float]] = None, + hybrid_engine: 
Optional[HybridSearchEngine] = None) -> List[SearchResult]: """Search a single index database. Handles exceptions gracefully, returning empty list on failure. @@ -3613,8 +4454,11 @@ class ChainSearchEngine: try: # Use hybrid search if enabled if hybrid_mode: - hybrid_engine = HybridSearchEngine(weights=hybrid_weights) - fts_results = hybrid_engine.search( + engine = hybrid_engine or HybridSearchEngine( + weights=hybrid_weights, + config=self._config, + ) + fts_results = engine.search( index_path, query, limit=limit, @@ -3715,7 +4559,7 @@ class ChainSearchEngine: return filtered def _merge_and_rank(self, results: List[SearchResult], - limit: int, offset: int = 0) -> List[SearchResult]: + limit: int, offset: int = 0, query: Optional[str] = None) -> List[SearchResult]: """Aggregate, deduplicate, and rank results. Process: @@ -3738,13 +4582,94 @@ class ChainSearchEngine: if path not in path_to_result or result.score > path_to_result[path].score: path_to_result[path] = result - # Sort by score descending unique_results = list(path_to_result.values()) - unique_results.sort(key=lambda r: r.score, reverse=True) + if query: + unique_results = self._apply_default_path_penalties(query, unique_results) + else: + unique_results.sort(key=lambda r: r.score, reverse=True) # Apply offset and limit for pagination return unique_results[offset:offset + limit] + def _apply_default_path_penalties( + self, + query: str, + results: List[SearchResult], + ) -> List[SearchResult]: + """Apply default path penalties for noisy test and generated artifact results.""" + if not results: + return results + + test_penalty = 0.15 + generated_penalty = 0.35 + if self._config is not None: + test_penalty = float( + getattr(self._config, "test_file_penalty", test_penalty) or 0.0 + ) + generated_penalty = float( + getattr( + self._config, + "generated_file_penalty", + generated_penalty, + ) + or 0.0 + ) + if test_penalty <= 0 and generated_penalty <= 0: + return sorted(results, key=lambda r: r.score, 
reverse=True) + + from codexlens.search.ranking import ( + apply_path_penalties, + rebalance_noisy_results, + ) + + penalized = apply_path_penalties( + results, + query, + test_file_penalty=test_penalty, + generated_file_penalty=generated_penalty, + ) + return rebalance_noisy_results(penalized, query) + + def _resolve_rerank_candidate_limit( + self, + requested_k: int, + candidate_count: int, + ) -> int: + """Return the cross-encoder rerank budget before final trimming.""" + if candidate_count <= 0: + return max(1, int(requested_k or 1)) + + rerank_limit = max(1, int(requested_k or 1)) + if self._config is not None: + for attr_name in ("reranker_top_k", "reranking_top_k"): + configured_value = getattr(self._config, attr_name, None) + if isinstance(configured_value, bool): + continue + if isinstance(configured_value, (int, float)): + rerank_limit = max(rerank_limit, int(configured_value)) + + return max(1, min(candidate_count, rerank_limit)) + + def _resolve_stage3_target_count( + self, + requested_k: int, + candidate_count: int, + ) -> int: + """Return the number of Stage 3 representatives to preserve.""" + base_target = max(1, int(requested_k or 1)) * 2 + target_count = base_target + if self._config is not None and getattr( + self._config, + "enable_staged_rerank", + False, + ): + target_count = max( + target_count, + self._resolve_rerank_candidate_limit(requested_k, candidate_count), + ) + + return max(1, min(candidate_count, target_count)) + def _search_symbols_parallel(self, index_paths: List[Path], name: str, kind: Optional[str], diff --git a/codex-lens/src/codexlens/search/hybrid_search.py b/codex-lens/src/codexlens/search/hybrid_search.py index cdc37277..9a300069 100644 --- a/codex-lens/src/codexlens/search/hybrid_search.py +++ b/codex-lens/src/codexlens/search/hybrid_search.py @@ -7,6 +7,7 @@ results via Reciprocal Rank Fusion (RRF) algorithm. 
from __future__ import annotations import logging +import threading import time from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeoutError, as_completed from contextlib import contextmanager @@ -34,19 +35,21 @@ from codexlens.config import Config from codexlens.config import VECTORS_HNSW_NAME from codexlens.entities import SearchResult from codexlens.search.ranking import ( - DEFAULT_WEIGHTS, + DEFAULT_WEIGHTS as RANKING_DEFAULT_WEIGHTS, QueryIntent, apply_symbol_boost, cross_encoder_rerank, detect_query_intent, filter_results_by_category, get_rrf_weights, + query_prefers_lexical_search, reciprocal_rank_fusion, rerank_results, simple_weighted_fusion, tag_search_source, ) from codexlens.storage.dir_index import DirIndexStore +from codexlens.storage.index_filters import filter_index_paths # Optional LSP imports (for real-time graph expansion) try: @@ -67,8 +70,13 @@ class HybridSearchEngine: default_weights: Default RRF weights for each source """ - # NOTE: DEFAULT_WEIGHTS imported from ranking.py - single source of truth - # FTS + vector hybrid mode (exact: 0.3, fuzzy: 0.1, vector: 0.6) + # Public compatibility contract for callers/tests that expect the legacy + # three-backend defaults on the engine instance. + DEFAULT_WEIGHTS = { + "exact": 0.3, + "fuzzy": 0.1, + "vector": 0.6, + } def __init__( self, @@ -95,11 +103,172 @@ class HybridSearchEngine: f"Did you mean to pass index_path to search() instead of __init__()?" 
) - self.weights = weights or DEFAULT_WEIGHTS.copy() + self.weights = weights self._config = config self.embedder = embedder self.reranker: Any = None self._use_gpu = config.embedding_use_gpu if config else True + self._centralized_cache_lock = threading.RLock() + self._centralized_model_config_cache: Dict[str, Any] = {} + self._centralized_embedder_cache: Dict[tuple[Any, ...], Any] = {} + self._centralized_ann_cache: Dict[tuple[str, int], Any] = {} + self._centralized_query_embedding_cache: Dict[tuple[Any, ...], Any] = {} + + @property + def weights(self) -> Dict[str, float]: + """Public/default weights exposed for backwards compatibility.""" + return dict(self._weights) + + @weights.setter + def weights(self, value: Optional[Dict[str, float]]) -> None: + """Update public and internal fusion weights together.""" + if value is None: + public_weights = self.DEFAULT_WEIGHTS.copy() + fusion_weights = dict(RANKING_DEFAULT_WEIGHTS) + fusion_weights.update(public_weights) + else: + if not isinstance(value, dict): + raise TypeError(f"weights must be a dict, got {type(value).__name__}") + public_weights = dict(value) + fusion_weights = dict(value) + + self._weights = public_weights + self._fusion_weights = fusion_weights + + @staticmethod + def _clamp_search_score(score: float) -> float: + """Keep ANN-derived similarity scores within SearchResult's valid domain.""" + + return max(0.0, float(score)) + + def _get_centralized_model_config(self, index_root: Path) -> Optional[Dict[str, Any]]: + """Load and cache the centralized embedding model config for an index root.""" + root_key = str(Path(index_root).resolve()) + + with self._centralized_cache_lock: + if root_key in self._centralized_model_config_cache: + cached = self._centralized_model_config_cache[root_key] + return dict(cached) if isinstance(cached, dict) else None + + model_config: Optional[Dict[str, Any]] = None + try: + from codexlens.semantic.vector_store import VectorStore + + central_index_path = Path(root_key) / 
"_index.db" + if central_index_path.exists(): + with VectorStore(central_index_path) as vs: + loaded = vs.get_model_config() + if isinstance(loaded, dict): + model_config = dict(loaded) + self.logger.debug( + "Loaded model config from centralized index: %s", + model_config, + ) + except Exception as exc: + self.logger.debug( + "Failed to load model config from centralized index: %s", + exc, + ) + + with self._centralized_cache_lock: + self._centralized_model_config_cache[root_key] = ( + dict(model_config) if isinstance(model_config, dict) else None + ) + + return dict(model_config) if isinstance(model_config, dict) else None + + def _get_centralized_embedder( + self, + model_config: Optional[Dict[str, Any]], + ) -> tuple[Any, int, tuple[Any, ...]]: + """Resolve and cache the embedder used for centralized vector search.""" + from codexlens.semantic.factory import get_embedder + + backend = "fastembed" + model_name: Optional[str] = None + model_profile = "code" + use_gpu = bool(self._use_gpu) + embedding_dim: Optional[int] = None + + if model_config: + backend = str(model_config.get("backend", "fastembed") or "fastembed") + model_name = model_config.get("model_name") + model_profile = str(model_config.get("model_profile", "code") or "code") + raw_dim = model_config.get("embedding_dim") + embedding_dim = int(raw_dim) if raw_dim else None + + if backend == "litellm": + embedder_key: tuple[Any, ...] 
= ("litellm", model_name or "", None) + else: + embedder_key = ("fastembed", model_profile, use_gpu) + + with self._centralized_cache_lock: + cached = self._centralized_embedder_cache.get(embedder_key) + if cached is None: + if backend == "litellm": + cached = get_embedder(backend="litellm", model=model_name) + else: + cached = get_embedder( + backend="fastembed", + profile=model_profile, + use_gpu=use_gpu, + ) + with self._centralized_cache_lock: + existing = self._centralized_embedder_cache.get(embedder_key) + if existing is None: + self._centralized_embedder_cache[embedder_key] = cached + else: + cached = existing + + if embedding_dim is None: + embedding_dim = int(getattr(cached, "embedding_dim", 0) or 0) + + return cached, embedding_dim, embedder_key + + def _get_centralized_ann_index(self, index_root: Path, dim: int) -> Any: + """Load and cache a centralized ANN index for repeated searches.""" + from codexlens.semantic.ann_index import ANNIndex + + resolved_root = Path(index_root).resolve() + cache_key = (str(resolved_root), int(dim)) + + with self._centralized_cache_lock: + cached = self._centralized_ann_cache.get(cache_key) + if cached is not None: + return cached + + ann_index = ANNIndex.create_central(index_root=resolved_root, dim=int(dim)) + if not ann_index.load(): + return None + + with self._centralized_cache_lock: + existing = self._centralized_ann_cache.get(cache_key) + if existing is None: + self._centralized_ann_cache[cache_key] = ann_index + return ann_index + return existing + + def _get_cached_query_embedding( + self, + query: str, + embedder: Any, + embedder_key: tuple[Any, ...], + ) -> Any: + """Cache repeated query embeddings for the same embedder settings.""" + cache_key = embedder_key + (query,) + + with self._centralized_cache_lock: + cached = self._centralized_query_embedding_cache.get(cache_key) + if cached is not None: + return cached + + query_embedding = embedder.embed_single(query) + with self._centralized_cache_lock: + existing = 
self._centralized_query_embedding_cache.get(cache_key) + if existing is None: + self._centralized_query_embedding_cache[cache_key] = query_embedding + return query_embedding + return existing def search( self, @@ -154,6 +323,7 @@ class HybridSearchEngine: # Detect query intent early for category filtering at index level query_intent = detect_query_intent(query) + lexical_priority_query = query_prefers_lexical_search(query) # Map intent to category for vector search: # - KEYWORD (code intent) -> filter to 'code' only # - SEMANTIC (doc intent) -> no filter (allow docs to surface) @@ -182,11 +352,11 @@ class HybridSearchEngine: backends["exact"] = True if enable_fuzzy: backends["fuzzy"] = True - if enable_vector: + if enable_vector and not lexical_priority_query: backends["vector"] = True # Add LSP graph expansion if requested and available - if enable_lsp_graph and HAS_LSP: + if enable_lsp_graph and HAS_LSP and not lexical_priority_query: backends["lsp_graph"] = True elif enable_lsp_graph and not HAS_LSP: self.logger.warning( @@ -214,7 +384,7 @@ class HybridSearchEngine: # Filter weights to only active backends active_weights = { source: weight - for source, weight in self.weights.items() + for source, weight in self._fusion_weights.items() if source in results_map } @@ -247,10 +417,16 @@ class HybridSearchEngine: ) # Optional: embedding-based reranking on top results - if self._config is not None and self._config.enable_reranking: + if ( + self._config is not None + and self._config.enable_reranking + and not lexical_priority_query + ): with timer("reranking", self.logger): if self.embedder is None: - self.embedder = self._get_reranking_embedder() + with self._centralized_cache_lock: + if self.embedder is None: + self.embedder = self._get_reranking_embedder() fused_results = rerank_results( query, fused_results[:100], @@ -267,10 +443,13 @@ class HybridSearchEngine: self._config is not None and self._config.enable_reranking and 
self._config.enable_cross_encoder_rerank + and not lexical_priority_query ): with timer("cross_encoder_rerank", self.logger): if self.reranker is None: - self.reranker = self._get_cross_encoder_reranker() + with self._centralized_cache_lock: + if self.reranker is None: + self.reranker = self._get_cross_encoder_reranker() if self.reranker is not None: fused_results = cross_encoder_rerank( query, @@ -363,11 +542,18 @@ class HybridSearchEngine: device: str | None = None kwargs: dict[str, Any] = {} + reranker_use_gpu = bool( + getattr( + self._config, + "reranker_use_gpu", + getattr(self._config, "embedding_use_gpu", True), + ) + ) if backend == "onnx": - kwargs["use_gpu"] = bool(getattr(self._config, "embedding_use_gpu", True)) + kwargs["use_gpu"] = reranker_use_gpu elif backend == "legacy": - if not bool(getattr(self._config, "embedding_use_gpu", True)): + if not reranker_use_gpu: device = "cpu" elif backend == "api": # Pass max_input_tokens for adaptive batching @@ -573,60 +759,16 @@ class HybridSearchEngine: List of SearchResult objects ordered by semantic similarity """ try: - import sqlite3 - import json - from codexlens.semantic.factory import get_embedder - from codexlens.semantic.ann_index import ANNIndex - - # Get model config from the first index database we can find - # (all indexes should use the same embedding model) index_root = hnsw_path.parent - model_config = None - - # Try to get model config from the centralized index root first - # (not the sub-directory index_path, which may have outdated config) - try: - from codexlens.semantic.vector_store import VectorStore - central_index_path = index_root / "_index.db" - if central_index_path.exists(): - with VectorStore(central_index_path) as vs: - model_config = vs.get_model_config() - self.logger.debug( - "Loaded model config from centralized index: %s", - model_config - ) - except Exception as e: - self.logger.debug("Failed to load model config from centralized index: %s", e) - - # Detect dimension from 
HNSW file if model config not found + model_config = self._get_centralized_model_config(index_root) if model_config is None: - self.logger.debug("Model config not found, will detect from HNSW index") - # Create a temporary ANNIndex to load and detect dimension - # We need to know the dimension to properly load the index - - # Get embedder based on model config or default - if model_config: - backend = model_config.get("backend", "fastembed") - model_name = model_config["model_name"] - model_profile = model_config["model_profile"] - embedding_dim = model_config["embedding_dim"] - - if backend == "litellm": - embedder = get_embedder(backend="litellm", model=model_name) - else: - embedder = get_embedder(backend="fastembed", profile=model_profile) - else: - # Default to code profile - embedder = get_embedder(backend="fastembed", profile="code") - embedding_dim = embedder.embedding_dim + self.logger.debug("Model config not found, will detect from cached embedder") + embedder, embedding_dim, embedder_key = self._get_centralized_embedder(model_config) # Load centralized ANN index start_load = time.perf_counter() - ann_index = ANNIndex.create_central( - index_root=index_root, - dim=embedding_dim, - ) - if not ann_index.load(): + ann_index = self._get_centralized_ann_index(index_root=index_root, dim=embedding_dim) + if ann_index is None: self.logger.warning("Failed to load centralized vector index from %s", hnsw_path) return [] self.logger.debug( @@ -637,7 +779,7 @@ class HybridSearchEngine: # Generate query embedding start_embed = time.perf_counter() - query_embedding = embedder.embed_single(query) + query_embedding = self._get_cached_query_embedding(query, embedder, embedder_key) self.logger.debug( "[TIMING] query_embedding: %.2fms", (time.perf_counter() - start_embed) * 1000 @@ -658,7 +800,7 @@ class HybridSearchEngine: return [] # Convert distances to similarity scores (for cosine: score = 1 - distance) - scores = [1.0 - d for d in distances] + scores = 
[self._clamp_search_score(1.0 - d) for d in distances] # Fetch chunk metadata from semantic_chunks tables # We need to search across all _index.db files in the project @@ -755,7 +897,7 @@ class HybridSearchEngine: start_line = row.get("start_line") end_line = row.get("end_line") - score = score_map.get(chunk_id, 0.0) + score = self._clamp_search_score(score_map.get(chunk_id, 0.0)) # Build excerpt excerpt = content[:200] + "..." if len(content) > 200 else content @@ -818,7 +960,7 @@ class HybridSearchEngine: import json # Find all _index.db files - index_files = list(index_root.rglob("_index.db")) + index_files = filter_index_paths(index_root.rglob("_index.db"), index_root) results = [] found_ids = set() @@ -870,7 +1012,7 @@ class HybridSearchEngine: metadata_json = row["metadata"] metadata = json.loads(metadata_json) if metadata_json else {} - score = score_map.get(chunk_id, 0.0) + score = self._clamp_search_score(score_map.get(chunk_id, 0.0)) # Build excerpt excerpt = content[:200] + "..." 
if len(content) > 200 else content diff --git a/codex-lens/src/codexlens/search/ranking.py b/codex-lens/src/codexlens/search/ranking.py index a578466b..5c6bf346 100644 --- a/codex-lens/src/codexlens/search/ranking.py +++ b/codex-lens/src/codexlens/search/ranking.py @@ -6,6 +6,7 @@ for combining results from heterogeneous search backends (exact FTS, fuzzy FTS, from __future__ import annotations +import logging import re import math from enum import Enum @@ -14,6 +15,8 @@ from typing import Any, Dict, List, Optional from codexlens.entities import SearchResult, AdditionalLocation +logger = logging.getLogger(__name__) + # Default RRF weights for hybrid search DEFAULT_WEIGHTS = { @@ -32,6 +35,229 @@ class QueryIntent(str, Enum): MIXED = "mixed" +_TEST_QUERY_RE = re.compile( + r"\b(test|tests|spec|specs|fixture|fixtures|benchmark|benchmarks)\b", + flags=re.IGNORECASE, +) +_AUXILIARY_QUERY_RE = re.compile( + r"\b(example|examples|demo|demos|sample|samples|debug|benchmark|benchmarks|profile|profiling)\b", + flags=re.IGNORECASE, +) +_ARTIFACT_QUERY_RE = re.compile( + r"(? 
Dict[str, float | None]: """Normalize weights to sum to 1.0 (best-effort).""" total = sum(float(v) for v in weights.values() if v is not None) @@ -66,6 +292,7 @@ def detect_query_intent(query: str) -> QueryIntent: has_code_signals = bool( re.search(r"(::|->|\.)", trimmed) or re.search(r"[A-Z][a-z]+[A-Z]", trimmed) + or re.search(r"\b[a-z]+[A-Z][A-Za-z0-9_]*\b", trimmed) or re.search(r"\b\w+_\w+\b", trimmed) or re.search( r"\b(def|class|function|const|let|var|import|from|return|async|await|interface|type)\b", @@ -119,6 +346,56 @@ def get_rrf_weights( return adjust_weights_by_intent(detect_query_intent(query), base_weights) +def query_targets_test_files(query: str) -> bool: + """Return True when the query explicitly targets tests/spec fixtures.""" + return bool(_TEST_QUERY_RE.search((query or "").strip())) + + +def query_targets_generated_files(query: str) -> bool: + """Return True when the query explicitly targets generated/build artifacts.""" + return bool(_ARTIFACT_QUERY_RE.search((query or "").strip())) + + +def query_targets_auxiliary_files(query: str) -> bool: + """Return True when the query explicitly targets examples, benchmarks, or debug files.""" + return bool(_AUXILIARY_QUERY_RE.search((query or "").strip())) + + +def query_prefers_lexical_search(query: str) -> bool: + """Return True when config/env/factory style queries are safer with lexical-first search.""" + trimmed = (query or "").strip() + if not trimmed: + return False + + if _ENV_STYLE_QUERY_RE.search(trimmed): + return True + + query_tokens = set(_semantic_query_topic_tokens(trimmed)) + if not query_tokens: + return False + + if query_tokens.intersection({"factory", "factories"}): + return True + + if query_tokens.intersection({"environment", "env"}) and query_tokens.intersection({"variable", "variables"}): + return True + + if "backend" in query_tokens and query_tokens.intersection( + {"embedding", "embeddings", "reranker", "rerankers", "onnx", "api", "litellm", "fastembed", "local", "legacy"} + 
): + return True + + surface_hits = query_tokens.intersection(_LEXICAL_PRIORITY_SURFACE_TOKENS) + focus_hits = query_tokens.intersection(_LEXICAL_PRIORITY_FOCUS_TOKENS) + return bool(surface_hits and focus_hits) + + +def _normalized_path_parts(path: str) -> List[str]: + """Normalize a path string into casefolded components for heuristics.""" + normalized = (path or "").replace("\\", "/") + return [part.casefold() for part in normalized.split("/") if part and part != "."] + + # File extensions to category mapping for fast lookup _EXT_TO_CATEGORY: Dict[str, str] = { # Code extensions @@ -196,6 +473,482 @@ def filter_results_by_category( return filtered +def is_test_file(path: str) -> bool: + """Return True when a path clearly refers to a test/spec file.""" + parts = _normalized_path_parts(path) + if not parts: + return False + basename = parts[-1] + return ( + basename.startswith("test_") + or basename.endswith("_test.py") + or basename.endswith(".test.ts") + or basename.endswith(".test.tsx") + or basename.endswith(".test.js") + or basename.endswith(".test.jsx") + or basename.endswith(".spec.ts") + or basename.endswith(".spec.tsx") + or basename.endswith(".spec.js") + or basename.endswith(".spec.jsx") + or "tests" in parts[:-1] + or "test" in parts[:-1] + or "__fixtures__" in parts[:-1] + or "fixtures" in parts[:-1] + ) + + +def is_generated_artifact_path(path: str) -> bool: + """Return True when a path clearly points at generated/build artifacts.""" + parts = _normalized_path_parts(path) + if not parts: + return False + basename = parts[-1] + return any(part in _GENERATED_DIR_NAMES for part in parts[:-1]) or basename.endswith( + _GENERATED_FILE_SUFFIXES + ) + + +def is_auxiliary_reference_path(path: str) -> bool: + """Return True for examples, benchmarks, demos, and debug helper files.""" + parts = _normalized_path_parts(path) + if not parts: + return False + basename = parts[-1] + if any(part in _AUXILIARY_DIR_NAMES for part in parts[:-1]): + return True + return ( 
+ basename.startswith("debug_") + or basename.startswith("benchmark") + or basename.startswith("profile_") + or "_benchmark" in basename + or "_profile" in basename + ) + + +def _extract_identifier_query(query: str) -> Optional[str]: + """Return a single-token identifier query when definition boosting is safe.""" + trimmed = (query or "").strip() + if not trimmed or " " in trimmed: + return None + if not _IDENTIFIER_QUERY_RE.fullmatch(trimmed): + return None + return trimmed + + +def extract_explicit_path_hints(query: str) -> List[List[str]]: + """Extract explicit path/file hints from separator-style query tokens. + + Natural-language queries often contain one or two high-signal feature/file + hints such as ``smart_search`` or ``smart-search.ts`` alongside broader + platform words like ``CodexLens``. These hints should be treated as more + specific than the surrounding prose. + """ + hints: List[List[str]] = [] + seen: set[tuple[str, ...]] = set() + for raw_part in re.split(r"\s+", query or ""): + candidate = raw_part.strip().strip("\"'`()[]{}<>:,;") + if not candidate or not _EXPLICIT_PATH_HINT_MARKER_RE.search(candidate): + continue + tokens = [ + token + for token in _split_identifier_like_tokens(candidate) + if token not in _PATH_TOPIC_STOPWORDS + ] + if len(tokens) < 2: + continue + key = tuple(tokens) + if key in seen: + continue + seen.add(key) + hints.append(list(key)) + return hints + + +def _is_source_implementation_path(path: str) -> bool: + """Return True when a path looks like an implementation file under a source dir.""" + parts = _normalized_path_parts(path) + if not parts: + return False + return any(part in _SOURCE_DIR_NAMES for part in parts[:-1]) + + +def _result_text_candidates(result: SearchResult) -> List[str]: + """Collect short text snippets that may contain a symbol definition.""" + candidates: List[str] = [] + for text in (result.excerpt, result.content): + if not isinstance(text, str) or not text.strip(): + continue + for line in 
text.splitlines(): + stripped = line.strip() + if stripped: + candidates.append(stripped) + if len(candidates) >= 6: + break + if len(candidates) >= 6: + break + + symbol_name = result.symbol_name + if not symbol_name and result.symbol is not None: + symbol_name = getattr(result.symbol, "name", None) + if isinstance(symbol_name, str) and symbol_name.strip(): + candidates.append(symbol_name.strip()) + return candidates + + +def _result_defines_identifier(result: SearchResult, symbol: str) -> bool: + """Best-effort check for whether a result snippet looks like a symbol definition.""" + escaped_symbol = re.escape(symbol) + definition_patterns = ( + rf"^\s*(?:export\s+)?(?:default\s+)?(?:async\s+)?def\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:default\s+)?(?:async\s+)?function\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:default\s+)?class\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:default\s+)?interface\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:default\s+)?type\s+{escaped_symbol}\b", + rf"^\s*(?:export\s+)?(?:default\s+)?(?:const|let|var)\s+{escaped_symbol}\b", + rf"^\s*{escaped_symbol}\s*=\s*(?:async\s+)?\(", + rf"^\s*{escaped_symbol}\s*=\s*(?:async\s+)?[^=]*=>", + ) + for candidate in _result_text_candidates(result): + if any(re.search(pattern, candidate) for pattern in definition_patterns): + return True + return False + + +def _split_identifier_like_tokens(text: str) -> List[str]: + """Split identifier-like text into normalized word tokens.""" + if not text: + return [] + + tokens: List[str] = [] + for raw_token in _TOPIC_TOKEN_RE.findall(text): + expanded = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", raw_token) + expanded = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", expanded) + for token in expanded.split(): + normalized = _normalize_topic_token(token) + if normalized: + tokens.append(normalized) + return tokens + + +def _normalize_topic_token(token: str) -> Optional[str]: + """Normalize lightweight topic tokens for query/path overlap 
heuristics.""" + normalized = (token or "").casefold() + if len(normalized) < 2 or normalized.isdigit(): + return None + if len(normalized) > 4 and normalized.endswith("ies"): + normalized = f"{normalized[:-3]}y" + elif len(normalized) > 3 and normalized.endswith("s") and not normalized.endswith("ss"): + normalized = normalized[:-1] + return normalized or None + + +def _dedupe_preserve_order(tokens: List[str]) -> List[str]: + """Deduplicate tokens while preserving the first-seen order.""" + deduped: List[str] = [] + seen: set[str] = set() + for token in tokens: + if token in seen: + continue + seen.add(token) + deduped.append(token) + return deduped + + +def _semantic_query_topic_tokens(query: str) -> List[str]: + """Extract salient natural-language tokens for lightweight topic matching.""" + tokens = [ + token + for token in _split_identifier_like_tokens(query) + if token not in _SEMANTIC_QUERY_STOPWORDS + ] + return _dedupe_preserve_order(tokens) + + +def _path_topic_tokens(path: str) -> tuple[List[str], List[str]]: + """Extract normalized topic tokens from a path and its basename.""" + parts = _normalized_path_parts(path) + if not parts: + return [], [] + + path_tokens: List[str] = [] + basename_tokens: List[str] = [] + last_index = len(parts) - 1 + for index, part in enumerate(parts): + target = basename_tokens if index == last_index else path_tokens + for token in _split_identifier_like_tokens(part): + if token in _PATH_TOPIC_STOPWORDS: + continue + target.append(token) + return _dedupe_preserve_order(path_tokens), _dedupe_preserve_order(basename_tokens) + + +def _source_path_topic_boost( + query: str, + path: str, + query_intent: QueryIntent, +) -> tuple[float, List[str]]: + """Return a path/topic boost when a query strongly overlaps a source path.""" + query_tokens = _semantic_query_topic_tokens(query) + if len(query_tokens) < 2: + return 1.0, [] + + path_tokens, basename_tokens = _path_topic_tokens(path) + if not path_tokens and not basename_tokens: + 
return 1.0, [] + + path_token_set = set(path_tokens) | set(basename_tokens) + basename_overlap = [token for token in query_tokens if token in basename_tokens] + all_overlap = [token for token in query_tokens if token in path_token_set] + explicit_hint_tokens = extract_explicit_path_hints(query) + + for hint_tokens in explicit_hint_tokens: + if basename_tokens == hint_tokens: + if query_intent == QueryIntent.KEYWORD: + return 4.5, hint_tokens[:3] + return 2.4, hint_tokens[:3] + if all(token in basename_tokens for token in hint_tokens): + if query_intent == QueryIntent.KEYWORD: + return 4.5, hint_tokens[:3] + return 1.6, hint_tokens[:3] + + if query_prefers_lexical_search(query): + lexical_surface_overlap = [ + token for token in basename_tokens if token in query_tokens and token in _LEXICAL_PRIORITY_SURFACE_TOKENS + ] + if lexical_surface_overlap: + lexical_overlap = lexical_surface_overlap[:3] + if query_intent == QueryIntent.KEYWORD: + return 5.5, lexical_overlap + return 5.0, lexical_overlap + + if query_intent == QueryIntent.KEYWORD: + if len(basename_overlap) >= 2: + # Multi-token identifier-style queries often name the feature/file directly. + # Give basename matches a stronger lift so they can survive workspace fan-out. 
+ multiplier = min(4.5, 2.0 + 1.25 * float(len(basename_overlap))) + return multiplier, basename_overlap[:3] + if len(all_overlap) >= 3: + multiplier = min(2.0, 1.1 + 0.2 * len(all_overlap)) + return multiplier, all_overlap[:3] + return 1.0, [] + + if len(basename_overlap) >= 2: + multiplier = min(1.45, 1.15 + 0.1 * len(basename_overlap)) + return multiplier, basename_overlap[:3] + if len(all_overlap) >= 3: + multiplier = min(1.3, 1.05 + 0.05 * len(all_overlap)) + return multiplier, all_overlap[:3] + return 1.0, [] + + +def apply_path_penalties( + results: List[SearchResult], + query: str, + *, + test_file_penalty: float = 0.15, + generated_file_penalty: float = 0.35, +) -> List[SearchResult]: + """Apply lightweight path-based penalties to reduce noisy rankings.""" + if not results or (test_file_penalty <= 0 and generated_file_penalty <= 0): + return results + + query_intent = detect_query_intent(query) + skip_test_penalty = query_targets_test_files(query) + skip_auxiliary_penalty = query_targets_auxiliary_files(query) + skip_generated_penalty = query_targets_generated_files(query) + query_topic_tokens = _semantic_query_topic_tokens(query) + keyword_path_query = query_intent == QueryIntent.KEYWORD and len(query_topic_tokens) >= 2 + explicit_feature_query = bool(extract_explicit_path_hints(query)) + source_oriented_query = ( + explicit_feature_query + or keyword_path_query + or ( + query_intent in {QueryIntent.SEMANTIC, QueryIntent.MIXED} + and len(query_topic_tokens) >= 2 + ) + ) + identifier_query = None + if query_intent == QueryIntent.KEYWORD: + identifier_query = _extract_identifier_query(query) + effective_test_penalty = float(test_file_penalty) + if effective_test_penalty > 0 and not skip_test_penalty: + if query_intent == QueryIntent.KEYWORD: + # Identifier-style queries should prefer implementation files over test references. 
+ effective_test_penalty = max(effective_test_penalty, 0.35) + elif query_intent in {QueryIntent.SEMANTIC, QueryIntent.MIXED}: + # Natural-language code queries should still prefer implementation files over references. + effective_test_penalty = max(effective_test_penalty, 0.25) + if explicit_feature_query: + # Explicit feature/file hints should be even more biased toward source implementations. + effective_test_penalty = max(effective_test_penalty, 0.45) + effective_auxiliary_penalty = effective_test_penalty + if effective_auxiliary_penalty > 0 and not skip_auxiliary_penalty and explicit_feature_query: + # Examples/benchmarks are usually descriptive noise for feature-targeted implementation queries. + effective_auxiliary_penalty = max(effective_auxiliary_penalty, 0.5) + effective_generated_penalty = float(generated_file_penalty) + if effective_generated_penalty > 0 and not skip_generated_penalty: + if source_oriented_query: + effective_generated_penalty = max(effective_generated_penalty, 0.45) + if explicit_feature_query: + effective_generated_penalty = max(effective_generated_penalty, 0.6) + + penalized: List[SearchResult] = [] + for result in results: + multiplier = 1.0 + penalty_multiplier = 1.0 + boost_multiplier = 1.0 + penalty_reasons: List[str] = [] + boost_reasons: List[str] = [] + + if effective_test_penalty > 0 and not skip_test_penalty and is_test_file(result.path): + penalty_multiplier *= max(0.0, 1.0 - effective_test_penalty) + penalty_reasons.append("test_file") + + if ( + effective_auxiliary_penalty > 0 + and not skip_auxiliary_penalty + and not is_test_file(result.path) + and is_auxiliary_reference_path(result.path) + ): + penalty_multiplier *= max(0.0, 1.0 - effective_auxiliary_penalty) + penalty_reasons.append("auxiliary_file") + + if ( + effective_generated_penalty > 0 + and not skip_generated_penalty + and is_generated_artifact_path(result.path) + ): + penalty_multiplier *= max(0.0, 1.0 - effective_generated_penalty) + 
penalty_reasons.append("generated_artifact") + + if ( + identifier_query + and not is_test_file(result.path) + and not is_generated_artifact_path(result.path) + and _result_defines_identifier(result, identifier_query) + ): + if _is_source_implementation_path(result.path): + boost_multiplier *= 2.0 + boost_reasons.append("source_definition") + else: + boost_multiplier *= 1.35 + boost_reasons.append("symbol_definition") + + if ( + (query_intent in {QueryIntent.SEMANTIC, QueryIntent.MIXED} or keyword_path_query) + and not skip_test_penalty + and not skip_auxiliary_penalty + and not skip_generated_penalty + and not is_test_file(result.path) + and not is_generated_artifact_path(result.path) + and not is_auxiliary_reference_path(result.path) + and _is_source_implementation_path(result.path) + ): + semantic_path_boost, overlap_tokens = _source_path_topic_boost( + query, + result.path, + query_intent, + ) + if semantic_path_boost > 1.0: + boost_multiplier *= semantic_path_boost + boost_reasons.append("source_path_topic_overlap") + + multiplier = penalty_multiplier * boost_multiplier + if penalty_reasons or boost_reasons: + metadata = { + **result.metadata, + "path_rank_multiplier": multiplier, + } + if penalty_reasons: + metadata["path_penalty_reasons"] = penalty_reasons + metadata["path_penalty_multiplier"] = penalty_multiplier + if boost_reasons: + metadata["path_boost_reasons"] = boost_reasons + metadata["path_boost_multiplier"] = boost_multiplier + if "source_path_topic_overlap" in boost_reasons and overlap_tokens: + metadata["path_boost_overlap_tokens"] = overlap_tokens + penalized.append( + result.model_copy( + deep=True, + update={ + "score": max(0.0, float(result.score) * multiplier), + "metadata": metadata, + }, + ) + ) + else: + penalized.append(result) + + penalized.sort(key=lambda r: r.score, reverse=True) + return penalized + + +def rebalance_noisy_results( + results: List[SearchResult], + query: str, +) -> List[SearchResult]: + """Move noisy 
test/generated/auxiliary results behind implementation hits when safe.""" + if not results: + return [] + + query_intent = detect_query_intent(query) + skip_test_penalty = query_targets_test_files(query) + skip_auxiliary_penalty = query_targets_auxiliary_files(query) + skip_generated_penalty = query_targets_generated_files(query) + query_topic_tokens = _semantic_query_topic_tokens(query) + keyword_path_query = query_intent == QueryIntent.KEYWORD and len(query_topic_tokens) >= 2 + explicit_feature_query = bool(extract_explicit_path_hints(query)) + source_oriented_query = ( + explicit_feature_query + or keyword_path_query + or ( + query_intent in {QueryIntent.SEMANTIC, QueryIntent.MIXED} + and len(query_topic_tokens) >= 2 + ) + ) + if not source_oriented_query: + return results + + max_generated_results = len(results) if skip_generated_penalty else 0 + max_test_results = len(results) if skip_test_penalty else (0 if explicit_feature_query else 1) + max_auxiliary_results = len(results) if skip_auxiliary_penalty else (0 if explicit_feature_query else 1) + + selected: List[SearchResult] = [] + deferred: List[SearchResult] = [] + generated_count = 0 + test_count = 0 + auxiliary_count = 0 + + for result in results: + if not skip_generated_penalty and is_generated_artifact_path(result.path): + if generated_count >= max_generated_results: + deferred.append(result) + continue + generated_count += 1 + selected.append(result) + continue + + if not skip_test_penalty and is_test_file(result.path): + if test_count >= max_test_results: + deferred.append(result) + continue + test_count += 1 + selected.append(result) + continue + + if not skip_auxiliary_penalty and is_auxiliary_reference_path(result.path): + if auxiliary_count >= max_auxiliary_results: + deferred.append(result) + continue + auxiliary_count += 1 + selected.append(result) + continue + + selected.append(result) + + return selected + deferred + + def simple_weighted_fusion( results_map: Dict[str, List[SearchResult]], 
     weights: Dict[str, float] = None,
@@ -633,10 +1386,16 @@ def cross_encoder_rerank(
             raw_scores = reranker.predict(pairs, batch_size=int(batch_size))
         else:
             return results
-    except Exception:
+    except Exception as exc:
+        logger.debug("Cross-encoder rerank failed; returning original ranking: %s", exc)
         return results
 
     if not raw_scores or len(raw_scores) != rerank_count:
+        logger.debug(
+            "Cross-encoder rerank returned %d scores for %d candidates; returning original ranking",
+            len(raw_scores) if raw_scores else 0,
+            rerank_count,
+        )
         return results
 
     scores = [float(s) for s in raw_scores]
@@ -653,26 +1412,13 @@ def cross_encoder_rerank(
     else:
         probs = [sigmoid(s) for s in scores]
 
+    query_intent = detect_query_intent(query)
+    skip_test_penalty = query_targets_test_files(query)
+    skip_auxiliary_penalty = query_targets_auxiliary_files(query)
+    skip_generated_penalty = query_targets_generated_files(query)
+    keyword_path_query = query_intent == QueryIntent.KEYWORD and len(_semantic_query_topic_tokens(query)) >= 2
     reranked_results: List[SearchResult] = []
 
-    # Helper to detect test files
-    def is_test_file(path: str) -> bool:
-        if not path:
-            return False
-        basename = path.split("/")[-1].split("\\")[-1]
-        return (
-            basename.startswith("test_") or
-            basename.endswith("_test.py") or
-            basename.endswith(".test.ts") or
-            basename.endswith(".test.js") or
-            basename.endswith(".spec.ts") or
-            basename.endswith(".spec.js") or
-            "/tests/" in path or
-            "\\tests\\" in path or
-            "/test/" in path or
-            "\\test\\" in path
-        )
-
     for idx, result in enumerate(results):
         if idx < rerank_count:
             prev_score = float(result.score)
@@ -699,6 +1445,52 @@ def cross_encoder_rerank(
             if test_file_penalty > 0 and is_test_file(result.path):
                 combined_score = combined_score * (1.0 - test_file_penalty)
 
+            cross_encoder_floor_reason = None
+            cross_encoder_floor_score = None
+            cross_encoder_floor_overlap_tokens: List[str] = []
+            if (
+                (query_intent in {QueryIntent.SEMANTIC, QueryIntent.MIXED} or keyword_path_query)
+                and not skip_test_penalty
+                and not skip_auxiliary_penalty
+                and not skip_generated_penalty
+                and not is_test_file(result.path)
+                and not is_generated_artifact_path(result.path)
+                and not is_auxiliary_reference_path(result.path)
+                and _is_source_implementation_path(result.path)
+            ):
+                semantic_path_boost, overlap_tokens = _source_path_topic_boost(
+                    query,
+                    result.path,
+                    query_intent,
+                )
+                if semantic_path_boost > 1.0:
+                    floor_ratio = 0.8 if semantic_path_boost >= 1.35 else 0.75
+                    candidate_floor = prev_score * floor_ratio
+                    if candidate_floor > combined_score:
+                        combined_score = candidate_floor
+                        cross_encoder_floor_reason = (
+                            "keyword_source_path_overlap"
+                            if query_intent == QueryIntent.KEYWORD
+                            else "semantic_source_path_overlap"
+                        )
+                        cross_encoder_floor_score = candidate_floor
+                        cross_encoder_floor_overlap_tokens = overlap_tokens
+
+            metadata = {
+                **result.metadata,
+                "pre_cross_encoder_score": prev_score,
+                "cross_encoder_score": ce_score,
+                "cross_encoder_prob": ce_prob,
+                "cross_encoder_reranked": True,
+            }
+            if cross_encoder_floor_reason is not None:
+                metadata["cross_encoder_floor_reason"] = cross_encoder_floor_reason
+                metadata["cross_encoder_floor_score"] = cross_encoder_floor_score
+            if cross_encoder_floor_overlap_tokens:
+                metadata["cross_encoder_floor_overlap_tokens"] = (
+                    cross_encoder_floor_overlap_tokens
+                )
+
             reranked_results.append(
                 SearchResult(
                     path=result.path,
@@ -707,13 +1499,7 @@ def cross_encoder_rerank(
                     content=result.content,
                     symbol=result.symbol,
                     chunk=result.chunk,
-                    metadata={
-                        **result.metadata,
-                        "pre_cross_encoder_score": prev_score,
-                        "cross_encoder_score": ce_score,
-                        "cross_encoder_prob": ce_prob,
-                        "cross_encoder_reranked": True,
-                    },
+                    metadata=metadata,
                     start_line=result.start_line,
                     end_line=result.end_line,
                     symbol_name=result.symbol_name,
diff --git a/codex-lens/src/codexlens/semantic/ann_index.py b/codex-lens/src/codexlens/semantic/ann_index.py
index 0d10e742..f5280c0e 100644
--- a/codex-lens/src/codexlens/semantic/ann_index.py
+++ b/codex-lens/src/codexlens/semantic/ann_index.py
@@ -383,8 +383,37 @@ class ANNIndex:
         if self._index is None or self._current_count == 0:
             return [], []  # Empty index
 
-        # Perform kNN search
-        labels, distances = self._index.knn_query(query, k=top_k)
+        effective_k = min(max(int(top_k), 0), self._current_count)
+        if effective_k == 0:
+            return [], []
+
+        try:
+            self._index.set_ef(max(self.ef, effective_k))
+        except Exception:
+            pass
+
+        while True:
+            try:
+                labels, distances = self._index.knn_query(query, k=effective_k)
+                break
+            except Exception as exc:
+                if "contiguous 2D array" in str(exc) and effective_k > 1:
+                    next_k = max(1, effective_k // 2)
+                    logger.debug(
+                        "ANN search knn_query failed for k=%d; retrying with k=%d: %s",
+                        effective_k,
+                        next_k,
+                        exc,
+                    )
+                    if next_k == effective_k:
+                        raise
+                    effective_k = next_k
+                    try:
+                        self._index.set_ef(max(self.ef, effective_k))
+                    except Exception:
+                        pass
+                    continue
+                raise
 
         # Convert to lists and flatten (knn_query returns 2D arrays)
         ids = labels[0].tolist()
diff --git a/codex-lens/src/codexlens/semantic/reranker/factory.py b/codex-lens/src/codexlens/semantic/reranker/factory.py
index 5dccc758..459034b5 100644
--- a/codex-lens/src/codexlens/semantic/reranker/factory.py
+++ b/codex-lens/src/codexlens/semantic/reranker/factory.py
@@ -15,7 +15,7 @@ def check_reranker_available(backend: str) -> tuple[bool, str | None]:
     Notes:
         - "fastembed" uses fastembed TextCrossEncoder (pip install fastembed>=0.4.0). [Recommended]
-        - "onnx" redirects to "fastembed" for backward compatibility.
+        - "onnx" uses Optimum + ONNX Runtime (pip install onnxruntime optimum[onnxruntime] transformers).
         - "legacy" uses sentence-transformers CrossEncoder (pip install codexlens[reranker-legacy]).
         - "api" uses a remote reranking HTTP API (requires httpx).
         - "litellm" uses `ccw-litellm` for unified access to LLM providers.
@@ -33,10 +33,9 @@ def check_reranker_available(backend: str) -> tuple[bool, str | None]:
         return check_fastembed_reranker_available()
 
     if backend == "onnx":
-        # Redirect to fastembed for backward compatibility
-        from .fastembed_reranker import check_fastembed_reranker_available
+        from .onnx_reranker import check_onnx_reranker_available
 
-        return check_fastembed_reranker_available()
+        return check_onnx_reranker_available()
 
     if backend == "litellm":
         try:
@@ -66,7 +65,7 @@ def check_reranker_available(backend: str) -> tuple[bool, str | None]:
 
 
 def get_reranker(
-    backend: str = "fastembed",
+    backend: str = "onnx",
     model_name: str | None = None,
     *,
     device: str | None = None,
@@ -76,18 +75,18 @@ def get_reranker(
 
     Args:
         backend: Reranker backend to use. Options:
-            - "fastembed": FastEmbed TextCrossEncoder backend (default, recommended)
-            - "onnx": Redirects to fastembed for backward compatibility
+            - "onnx": Optimum + ONNX Runtime backend (default)
+            - "fastembed": FastEmbed TextCrossEncoder backend
             - "api": HTTP API backend (remote providers)
             - "litellm": LiteLLM backend (LLM-based, for API mode)
             - "legacy": sentence-transformers CrossEncoder backend (optional)
         model_name: Model identifier for model-based backends.
             Defaults depend on backend:
+            - onnx: Xenova/ms-marco-MiniLM-L-6-v2
             - fastembed: Xenova/ms-marco-MiniLM-L-6-v2
-            - onnx: (redirects to fastembed)
             - api: BAAI/bge-reranker-v2-m3 (SiliconFlow)
             - legacy: cross-encoder/ms-marco-MiniLM-L-6-v2
             - litellm: default
-        device: Optional device string for backends that support it (legacy only).
+        device: Optional device string for backends that support it (legacy and onnx).
         **kwargs: Additional backend-specific arguments.
 
     Returns:
@@ -111,16 +110,17 @@ def get_reranker(
         return FastEmbedReranker(model_name=resolved_model_name, **kwargs)
 
     if backend == "onnx":
-        # Redirect to fastembed for backward compatibility
-        ok, err = check_reranker_available("fastembed")
+        ok, err = check_reranker_available("onnx")
         if not ok:
             raise ImportError(err)
 
-        from .fastembed_reranker import FastEmbedReranker
+        from .onnx_reranker import ONNXReranker
 
-        resolved_model_name = (model_name or "").strip() or FastEmbedReranker.DEFAULT_MODEL
-        _ = device  # Device selection is managed via fastembed providers.
-        return FastEmbedReranker(model_name=resolved_model_name, **kwargs)
+        resolved_model_name = (model_name or "").strip() or ONNXReranker.DEFAULT_MODEL
+        effective_kwargs = dict(kwargs)
+        if "use_gpu" not in effective_kwargs and device is not None:
+            effective_kwargs["use_gpu"] = str(device).strip().lower() not in {"cpu", "none"}
+        return ONNXReranker(model_name=resolved_model_name, **effective_kwargs)
 
     if backend == "legacy":
         ok, err = check_reranker_available("legacy")
diff --git a/codex-lens/src/codexlens/semantic/reranker/onnx_reranker.py b/codex-lens/src/codexlens/semantic/reranker/onnx_reranker.py
index 0b22f45e..a56fb953 100644
--- a/codex-lens/src/codexlens/semantic/reranker/onnx_reranker.py
+++ b/codex-lens/src/codexlens/semantic/reranker/onnx_reranker.py
@@ -58,6 +58,38 @@ def _iter_batches(items: Sequence[Any], batch_size: int) -> Iterable[Sequence[An
         yield items[i : i + batch_size]
 
 
+def _normalize_provider_specs(
+    providers: Sequence[Any] | None,
+) -> tuple[list[str], list[dict[str, Any]]]:
+    """Split execution-provider specs into Optimum-compatible names and options."""
+    normalized_providers: list[str] = []
+    normalized_options: list[dict[str, Any]] = []
+
+    for provider in providers or ():
+        provider_name: str | None = None
+        provider_options: dict[str, Any] = {}
+
+        if isinstance(provider, tuple):
+            if provider:
+                provider_name = str(provider[0]).strip()
+            if len(provider) > 1 and isinstance(provider[1], dict):
+                provider_options = dict(provider[1])
+        elif provider is not None:
+            provider_name = str(provider).strip()
+
+        if not provider_name:
+            continue
+
+        normalized_providers.append(provider_name)
+        normalized_options.append(provider_options)
+
+    if not normalized_providers:
+        normalized_providers.append("CPUExecutionProvider")
+        normalized_options.append({})
+
+    return normalized_providers, normalized_options
+
+
 class ONNXReranker(BaseReranker):
     """Cross-encoder reranker using Optimum + ONNX Runtime with lazy loading."""
 
@@ -110,19 +142,21 @@ class ONNXReranker(BaseReranker):
             use_gpu=self.use_gpu, with_device_options=True
         )
 
+        provider_names, provider_options = _normalize_provider_specs(self.providers)
+
         # Some Optimum versions accept `providers`, others accept a single `provider`.
         # Prefer passing the full providers list, with a conservative fallback.
         model_kwargs: dict[str, Any] = {}
         try:
             params = signature(ORTModelForSequenceClassification.from_pretrained).parameters
             if "providers" in params:
-                model_kwargs["providers"] = self.providers
+                model_kwargs["providers"] = provider_names
+                if "provider_options" in params:
+                    model_kwargs["provider_options"] = provider_options
             elif "provider" in params:
-                provider_name = "CPUExecutionProvider"
-                if self.providers:
-                    first = self.providers[0]
-                    provider_name = first[0] if isinstance(first, tuple) else str(first)
-                model_kwargs["provider"] = provider_name
+                model_kwargs["provider"] = provider_names[0]
+                if "provider_options" in params and provider_options[0]:
+                    model_kwargs["provider_options"] = provider_options[0]
         except Exception:
             model_kwargs = {}
diff --git a/codex-lens/src/codexlens/storage/index_filters.py b/codex-lens/src/codexlens/storage/index_filters.py
new file mode 100644
index 00000000..4f4a163f
--- /dev/null
+++ b/codex-lens/src/codexlens/storage/index_filters.py
@@ -0,0 +1,47 @@
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Iterable, List, Optional, Set
+
+from codexlens.storage.index_tree import DEFAULT_IGNORE_DIRS
+
+
+EXTRA_IGNORED_INDEX_DIRS = frozenset({".workflow"})
+IGNORED_INDEX_DIRS = frozenset({name.casefold() for name in DEFAULT_IGNORE_DIRS | set(EXTRA_IGNORED_INDEX_DIRS)})
+
+
+def is_ignored_index_path(
+    index_path: Path,
+    scan_root: Path,
+    *,
+    ignored_dir_names: Optional[Set[str]] = None,
+) -> bool:
+    """Return True when an index lives under an ignored/generated subtree."""
+
+    ignored = (
+        {name.casefold() for name in ignored_dir_names}
+        if ignored_dir_names is not None
+        else IGNORED_INDEX_DIRS
+    )
+
+    try:
+        relative_parts = index_path.resolve().relative_to(scan_root.resolve()).parts[:-1]
+    except ValueError:
+        return False
+
+    return any(part.casefold() in ignored for part in relative_parts)
+
+
+def filter_index_paths(
+    index_paths: Iterable[Path],
+    scan_root: Path,
+    *,
+    ignored_dir_names: Optional[Set[str]] = None,
+) -> List[Path]:
+    """Filter out discovered indexes that belong to ignored/generated subtrees."""
+
+    return [
+        path
+        for path in index_paths
+        if not is_ignored_index_path(path, scan_root, ignored_dir_names=ignored_dir_names)
+    ]
diff --git a/codex-lens/src/codexlens/storage/index_tree.py b/codex-lens/src/codexlens/storage/index_tree.py
index 8b82f3b2..0a7f7894 100644
--- a/codex-lens/src/codexlens/storage/index_tree.py
+++ b/codex-lens/src/codexlens/storage/index_tree.py
@@ -252,6 +252,18 @@
         # Collect directories by depth
         dirs_by_depth = self._collect_dirs_by_depth(source_root, languages)
 
+        if force_full:
+            pruned_dirs = self._prune_stale_project_dirs(
+                project_id=project_info.id,
+                source_root=source_root,
+                dirs_by_depth=dirs_by_depth,
+            )
+            if pruned_dirs:
+                self.logger.info(
+                    "Pruned %d stale directory mappings before full rebuild",
+                    len(pruned_dirs),
+                )
+
         if not dirs_by_depth:
             self.logger.warning("No indexable directories found in %s", source_root)
             if global_index is not None:
@@ -450,6 +462,52 @@
 
     # === Internal Methods ===
 
+    def _prune_stale_project_dirs(
+        self,
+        *,
+        project_id: int,
+        source_root: Path,
+        dirs_by_depth: Dict[int, List[Path]],
+    ) -> List[Path]:
+        """Remove registry mappings for directories no longer included in the index tree."""
+        source_root = source_root.resolve()
+        valid_dirs: Set[Path] = {
+            path.resolve()
+            for paths in dirs_by_depth.values()
+            for path in paths
+        }
+        valid_dirs.add(source_root)
+
+        stale_mappings = []
+        for mapping in self.registry.get_project_dirs(project_id):
+            mapping_path = mapping.source_path.resolve()
+            if mapping_path in valid_dirs:
+                continue
+            try:
+                mapping_path.relative_to(source_root)
+            except ValueError:
+                continue
+            stale_mappings.append(mapping)
+
+        stale_mappings.sort(
+            key=lambda mapping: len(mapping.source_path.resolve().relative_to(source_root).parts),
+            reverse=True,
+        )
+
+        pruned_paths: List[Path] = []
+        for mapping in stale_mappings:
+            try:
+                if self.registry.unregister_dir(mapping.source_path):
+                    pruned_paths.append(mapping.source_path.resolve())
+            except Exception as exc:
+                self.logger.warning(
+                    "Failed to prune stale mapping for %s: %s",
+                    mapping.source_path,
+                    exc,
+                )
+
+        return pruned_paths
+
     def _collect_dirs_by_depth(
         self, source_root: Path, languages: List[str] = None
     ) -> Dict[int, List[Path]]:
@@ -620,8 +678,9 @@
             "static_graph_enabled": self.config.static_graph_enabled,
             "static_graph_relationship_types": self.config.static_graph_relationship_types,
             "use_astgrep": getattr(self.config, "use_astgrep", False),
-            "ignore_patterns": list(getattr(self.config, "ignore_patterns", [])),
-            "extension_filters": list(getattr(self.config, "extension_filters", [])),
+            "ignore_patterns": list(self.ignore_patterns),
+            "extension_filters": list(self.extension_filters),
+            "incremental": bool(self.incremental),
         }
 
         worker_args = [
@@ -693,6 +752,9 @@
         # Ensure index directory exists
         index_db_path.parent.mkdir(parents=True, exist_ok=True)
 
+        if not self.incremental:
+            _reset_index_db_files(index_db_path)
+
         # Create directory index
         if self.config.global_symbol_index_enabled:
             global_index = GlobalSymbolIndex(global_index_db_path, project_id=project_id)
@@ -1100,6 +1162,18 @@ def _matches_extension_filters(path: Path, patterns: List[str], source_root: Opt
     return _matches_path_patterns(path, patterns, source_root)
 
 
+def _reset_index_db_files(index_db_path: Path) -> None:
+    """Best-effort removal of a directory index DB and common SQLite sidecars."""
+    for suffix in ("", "-wal", "-shm", "-journal"):
+        target = Path(f"{index_db_path}{suffix}") if suffix else index_db_path
+        try:
+            target.unlink()
+        except FileNotFoundError:
+            continue
+        except OSError:
+            continue
+
+
 def _build_dir_worker(args: tuple) -> DirBuildResult:
     """Worker function for parallel directory building.
 
@@ -1140,6 +1214,9 @@ def _build_dir_worker(args: tuple) -> DirBuildResult:
         global_index = GlobalSymbolIndex(Path(global_index_db_path), project_id=int(project_id))
         global_index.initialize()
 
+    if not bool(config_dict.get("incremental", True)):
+        _reset_index_db_files(index_db_path)
+
     store = DirIndexStore(index_db_path, config=config, global_index=global_index)
     store.initialize()
diff --git a/codex-lens/src/codexlens/storage/registry.py b/codex-lens/src/codexlens/storage/registry.py
index 6a4469ab..af667a90 100644
--- a/codex-lens/src/codexlens/storage/registry.py
+++ b/codex-lens/src/codexlens/storage/registry.py
@@ -591,6 +591,56 @@ class RegistryStore:
 
         return [self._row_to_dir_mapping(row) for row in rows]
 
+    def find_descendant_project_roots(self, source_root: Path) -> List[DirMapping]:
+        """Return root directory mappings for nested projects under ``source_root``."""
+        with self._lock:
+            conn = self._get_connection()
+            source_root_resolved = source_root.resolve()
+            source_root_str = self._normalize_path_for_comparison(source_root_resolved)
+
+            rows = conn.execute(
+                """
+                SELECT dm.*
+                FROM dir_mapping dm
+                INNER JOIN projects p ON p.id = dm.project_id
+                WHERE dm.source_path = p.source_root
+                  AND p.source_root LIKE ?
+                ORDER BY p.source_root ASC
+                """,
+                (f"{source_root_str}%",),
+            ).fetchall()
+
+            descendant_roots: List[DirMapping] = []
+            normalized_root_path = Path(source_root_str)
+
+            for row in rows:
+                mapping = self._row_to_dir_mapping(row)
+                normalized_mapping_path = Path(
+                    self._normalize_path_for_comparison(mapping.source_path.resolve())
+                )
+
+                if normalized_mapping_path == normalized_root_path:
+                    continue
+
+                try:
+                    normalized_mapping_path.relative_to(normalized_root_path)
+                except ValueError:
+                    continue
+
+                descendant_roots.append(mapping)
+
+            descendant_roots.sort(
+                key=lambda mapping: (
+                    len(
+                        mapping.source_path.resolve().relative_to(
+                            source_root_resolved
+                        ).parts
+                    ),
+                    self._normalize_path_for_comparison(mapping.source_path.resolve()),
+                )
+            )
+            return descendant_roots
+
     def update_dir_stats(self, source_path: Path, files_count: int) -> None:
         """Update directory statistics.
 
diff --git a/codex-lens/tests/conftest.py b/codex-lens/tests/conftest.py
index 90691d35..40915fff 100644
--- a/codex-lens/tests/conftest.py
+++ b/codex-lens/tests/conftest.py
@@ -11,12 +11,25 @@ Common Fixtures:
     - sample_code_files: Factory for creating sample code files
 """
 
-import pytest
-import tempfile
-import shutil
-from pathlib import Path
-from typing import Dict, Any
 import sqlite3
+import shutil
+import tempfile
+import warnings
+from pathlib import Path
+from typing import Any, Dict
+
+import pytest
+
+warnings.filterwarnings(
+    "ignore",
+    message=r"'BaseCommand' is deprecated and will be removed in Click 9\.0\..*",
+    category=DeprecationWarning,
+)
+warnings.filterwarnings(
+    "ignore",
+    message=r"The '__version__' attribute is deprecated and will be removed in Click 9\.1\..*",
+    category=DeprecationWarning,
+)
 
 
 @pytest.fixture
diff --git a/codex-lens/tests/test_ann_index.py b/codex-lens/tests/test_ann_index.py
index 6c8ce17d..964f7a1a 100644
--- a/codex-lens/tests/test_ann_index.py
+++ b/codex-lens/tests/test_ann_index.py
@@ -98,6 +98,23 @@ class TestANNIndex:
         assert ids[0] == 1  # ID of first vector
         assert distances[0] < 0.01  # Very small distance (almost identical)
 
+    @pytest.mark.skipif(
+        not _hnswlib_available(),
+        reason="hnswlib not installed"
+    )
+    def test_search_clamps_top_k_to_available_vectors(self, temp_db, sample_vectors, sample_ids):
+        """Search should clamp top_k to the loaded vector count."""
+        from codexlens.semantic.ann_index import ANNIndex
+
+        index = ANNIndex(temp_db, dim=384)
+        index.add_vectors(sample_ids[:3], sample_vectors[:3])
+
+        ids, distances = index.search(sample_vectors[0], top_k=10)
+
+        assert len(ids) == 3
+        assert len(distances) == 3
+        assert ids[0] == 1
+
     @pytest.mark.skipif(
         not _hnswlib_available(),
         reason="hnswlib not installed"
diff --git a/codex-lens/tests/test_chain_search.py b/codex-lens/tests/test_chain_search.py
index de46fe41..3e498e43 100644
--- a/codex-lens/tests/test_chain_search.py
+++ b/codex-lens/tests/test_chain_search.py
@@ -1,14 +1,26 @@
 import logging
 import os
+import sqlite3
 import tempfile
 from pathlib import Path
 from unittest.mock import MagicMock
 
 import pytest
 
-from codexlens.config import Config
-from codexlens.entities import Symbol
-from codexlens.search.chain_search import ChainSearchEngine, SearchOptions
+from codexlens.config import (
+    BINARY_VECTORS_MMAP_NAME,
+    Config,
+    VECTORS_HNSW_NAME,
+    VECTORS_META_DB_NAME,
+)
+from codexlens.entities import SearchResult, Symbol
+import codexlens.search.chain_search as chain_search_module
+from codexlens.search.chain_search import (
+    ChainSearchEngine,
+    ChainSearchResult,
+    SearchOptions,
+    SearchStats,
+)
 from codexlens.storage.global_index import GlobalSymbolIndex
 from codexlens.storage.path_mapper import PathMapper
 from codexlens.storage.registry import RegistryStore
@@ -189,3 +201,1434 @@ def test_vector_warmup_uses_embedding_config(monkeypatch: pytest.MonkeyPatch, te
             "use_gpu": False,
         }
     ]
+
+
+def test_search_single_index_passes_config_to_hybrid_engine(
+    monkeypatch: pytest.MonkeyPatch, temp_paths: Path
+) -> None:
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data", embedding_backend="fastembed", embedding_model="code")
+
+    engine = ChainSearchEngine(registry, mapper, config=config)
+    index_path = temp_paths / "indexes" / "project" / "_index.db"
+    index_path.parent.mkdir(parents=True, exist_ok=True)
+    index_path.write_bytes(b"\x00" * 128)
+
+    captured: dict[str, object] = {}
+
+    class FakeHybridSearchEngine:
+        def __init__(self, *, weights=None, config=None):
+            captured["weights"] = weights
+            captured["config"] = config
+
+        def search(self, *_args, **_kwargs):
+            return [SearchResult(path="src/app.py", score=0.9, excerpt="hit")]
+
+    monkeypatch.setattr(chain_search_module, "HybridSearchEngine", FakeHybridSearchEngine)
+
+    results = engine._search_single_index(
+        index_path,
+        "auth flow",
+        limit=5,
+        hybrid_mode=True,
+        enable_vector=True,
+        hybrid_weights={"vector": 1.0},
+    )
+
+    assert captured["config"] is config
+    assert captured["weights"] == {"vector": 1.0}
+    assert len(results) == 1
+    assert results[0].path == "src/app.py"
+
+
+def test_search_parallel_reuses_shared_hybrid_engine(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    from concurrent.futures import Future
+
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data")
+
+    engine = ChainSearchEngine(registry, mapper, config=config)
+    index_root = temp_paths / "indexes" / "project"
+    index_a = index_root / "src" / "_index.db"
+    index_b = index_root / "tests" / "_index.db"
+    index_a.parent.mkdir(parents=True, exist_ok=True)
+    index_b.parent.mkdir(parents=True, exist_ok=True)
+    index_a.write_bytes(b"\x00" * 128)
+    index_b.write_bytes(b"\x00" * 128)
+
+    created_engines: list[object] = []
+    search_calls: list[tuple[object, Path]] = []
+
+    class FakeHybridSearchEngine:
+        def __init__(self, *, weights=None, config=None):
+            self.weights = weights
+            self.config = config
+            created_engines.append(self)
+
+        def search(self, index_path, *_args, **_kwargs):
+            search_calls.append((self, index_path))
+            return [SearchResult(path=str(index_path), score=0.9, excerpt="hit")]
+
+    class ImmediateExecutor:
+        def submit(self, fn, *args):
+            future: Future = Future()
+            try:
+                future.set_result(fn(*args))
+            except Exception as exc:
+                future.set_exception(exc)
+            return future
+
+    monkeypatch.setattr(chain_search_module, "HybridSearchEngine", FakeHybridSearchEngine)
+    monkeypatch.setattr(engine, "_get_executor", lambda _workers: ImmediateExecutor())
+
+    results, stats = engine._search_parallel(
+        [index_a, index_b],
+        "auth flow",
+        SearchOptions(
+            hybrid_mode=True,
+            enable_vector=True,
+            limit_per_dir=5,
+            hybrid_weights={"vector": 1.0},
+        ),
+    )
+
+    assert stats.errors == []
+    assert len(created_engines) == 1
+    assert [path for _, path in search_calls] == [index_a, index_b]
+    assert all(shared is created_engines[0] for shared, _ in search_calls)
+    assert len(results) == 2
+
+
+def test_search_injects_feature_query_anchors_into_merge(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data")
+    engine = ChainSearchEngine(registry, mapper, config=config)
+
+    source_path = temp_paths / "project"
+    start_index = temp_paths / "indexes" / "project" / "_index.db"
+    start_index.parent.mkdir(parents=True, exist_ok=True)
+    start_index.write_text("", encoding="utf-8")
+
+    feature_path = str(source_path / "src" / "tools" / "smart-search.ts")
+    platform_path = str(source_path / "src" / "utils" / "path-resolver.ts")
+    anchor_result = SearchResult(
+        path=feature_path,
+        score=8.0,
+        excerpt="smart search anchor",
+        metadata={"feature_query_hint": "smart search"},
+    )
+
+    monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: start_index)
+    monkeypatch.setattr(
+        engine,
+        "_collect_index_paths",
+        lambda _start_index, _options: [start_index],
+    )
+    monkeypatch.setattr(
+        engine,
+        "_search_parallel",
+        lambda *_args, **_kwargs: (
+            [
+                SearchResult(
+                    path=platform_path,
+                    score=0.9,
+                    excerpt="platform hit",
+                )
+            ],
+            SearchStats(),
+        ),
+    )
+    monkeypatch.setattr(engine, "_search_symbols_parallel", lambda *_args, **_kwargs: [])
+    collected_queries: list[str] = []
+    monkeypatch.setattr(
+        engine,
+        "_collect_query_feature_anchor_results",
+        lambda query, *_args, **_kwargs: (
+            collected_queries.append(query),
+            [anchor_result],
+        )[1],
+    )
+
+    result = engine.search(
+        "parse CodexLens JSON output strip ANSI smart_search",
+        source_path,
+        options=SearchOptions(
+            total_limit=5,
+            hybrid_mode=True,
+            enable_fuzzy=False,
+            enable_vector=True,
+        ),
+    )
+
+    assert collected_queries == ["parse CodexLens JSON output strip ANSI smart_search"]
+    result_by_path = {item.path: item for item in result.results}
+    assert feature_path in result_by_path
+    assert platform_path in result_by_path
+    assert result_by_path[feature_path].metadata["feature_query_anchor"] is True
+    assert result_by_path[feature_path].metadata["feature_query_hint"] == "smart search"
+
+
+def test_group_index_paths_by_dense_root(temp_paths: Path) -> None:
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    engine = ChainSearchEngine(registry, mapper, config=Config(data_dir=temp_paths / "data"))
+
+    dense_root_a = temp_paths / "indexes" / "project-a"
+    dense_root_b = temp_paths / "indexes" / "project-b"
+    orphan_root = temp_paths / "indexes" / "orphan" / "pkg"
+
+    dense_root_a.mkdir(parents=True, exist_ok=True)
+    dense_root_b.mkdir(parents=True, exist_ok=True)
+    orphan_root.mkdir(parents=True, exist_ok=True)
+    (dense_root_a / VECTORS_HNSW_NAME).write_bytes(b"a")
+    (dense_root_b / VECTORS_HNSW_NAME).write_bytes(b"b")
+
+    index_a = dense_root_a / "src" / "_index.db"
+    index_b = dense_root_b / "tests" / "_index.db"
+    orphan_index = orphan_root / "_index.db"
+    index_a.parent.mkdir(parents=True, exist_ok=True)
+    index_b.parent.mkdir(parents=True, exist_ok=True)
+    index_a.write_text("", encoding="utf-8")
+    index_b.write_text("", encoding="utf-8")
+    orphan_index.write_text("", encoding="utf-8")
+
+    roots, ungrouped = engine._group_index_paths_by_dense_root(
+        [index_a, orphan_index, index_b]
+    )
+
+    assert roots == [dense_root_a, dense_root_b]
+    assert ungrouped == [orphan_index]
+    assert engine._find_nearest_dense_hnsw_root(index_a.parent) == dense_root_a
+    assert engine._find_nearest_dense_hnsw_root(orphan_index.parent) is None
+
+
+def test_stage1_binary_search_merges_multiple_centralized_roots(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    import numpy as np
+
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False)
+    engine = ChainSearchEngine(registry, mapper, config=config)
+
+    root_a = temp_paths / "indexes" / "project-a"
+    root_b = temp_paths / "indexes" / "project-b"
+    for root in (root_a, root_b):
+        root.mkdir(parents=True, exist_ok=True)
+        (root / BINARY_VECTORS_MMAP_NAME).write_bytes(b"binary")
+        (root / VECTORS_META_DB_NAME).write_bytes(b"meta")
+
+    index_a = root_a / "src" / "_index.db"
+    index_b = root_b / "src" / "_index.db"
+    index_a.parent.mkdir(parents=True, exist_ok=True)
+    index_b.parent.mkdir(parents=True, exist_ok=True)
+    index_a.write_text("", encoding="utf-8")
+    index_b.write_text("", encoding="utf-8")
+
+    class FakeBinarySearcher:
+        def __init__(self, root: Path) -> None:
+            self.root = root
+            self.backend = "fastembed"
+            self.model = None
+            self.model_profile = "code"
+
+        def search(self, _query_dense, top_k: int):
+            return [(1, 8)] if self.root == root_a else [(2, 16)]
+
+    class FakeEmbedder:
+        def embed_to_numpy(self, _queries):
+            return np.ones((1, 4), dtype=np.float32)
+
+    class FakeVectorMetadataStore:
+        def __init__(self, path: Path) -> None:
+            self.path = Path(path)
+
+        def get_chunks_by_ids(self, chunk_ids):
+            return [
+                {
+                    "id": chunk_id,
+                    "file_path": str(self.path.parent / f"file{chunk_id}.py"),
+                    "content": f"chunk {chunk_id}",
+                    "metadata": "{\"start_line\": 1, \"end_line\": 2}",
+                    "category": "code",
+                }
+                for chunk_id in chunk_ids
+            ]
+
+    import codexlens.semantic.embedder as embedder_module
+    from codexlens.search.chain_search import SearchStats
+
+    monkeypatch.setattr(
+        engine,
+        "_get_centralized_binary_searcher",
+        lambda root: FakeBinarySearcher(root),
+    )
+    monkeypatch.setattr(embedder_module, "get_embedder", lambda **_kwargs: FakeEmbedder())
+    monkeypatch.setattr(chain_search_module, "VectorMetadataStore", FakeVectorMetadataStore)
+
+    coarse_results, stage2_root = engine._stage1_binary_search(
+        "binary query",
+        [index_a, index_b],
+        coarse_k=5,
+        stats=SearchStats(),
+        index_root=index_a.parent,
+    )
+
+    assert stage2_root is None
+    assert len(coarse_results) == 2
+    assert {Path(result.path).name for result in coarse_results} == {"file1.py", "file2.py"}
+
+
+def test_stage1_binary_search_keeps_duplicate_chunk_ids_isolated_per_root(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    import numpy as np
+
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False)
+    engine = ChainSearchEngine(registry, mapper, config=config)
+
+    root_a = temp_paths / "indexes" / "project-a"
+    root_b = temp_paths / "indexes" / "project-b"
+    for root in (root_a, root_b):
+        root.mkdir(parents=True, exist_ok=True)
+        (root / BINARY_VECTORS_MMAP_NAME).write_bytes(b"binary")
+        (root / VECTORS_META_DB_NAME).write_bytes(b"meta")
+
+    index_a = root_a / "src" / "_index.db"
+    index_b = root_b / "src" / "_index.db"
+    index_a.parent.mkdir(parents=True, exist_ok=True)
+    index_b.parent.mkdir(parents=True, exist_ok=True)
+    index_a.write_text("", encoding="utf-8")
+    index_b.write_text("", encoding="utf-8")
+
+    class FakeBinarySearcher:
+        def __init__(self, root: Path) -> None:
+            self.root = root
+            self.backend = "fastembed"
+            self.model = None
+            self.model_profile = "code"
+
+        def search(self, _query_dense, top_k: int):
+            return [(1, 8)] if self.root == root_a else [(1, 16)]
+
+    class FakeEmbedder:
+        def embed_to_numpy(self, _queries):
+            return np.ones((1, 4), dtype=np.float32)
+
+    class FakeVectorMetadataStore:
+        def __init__(self, path: Path) -> None:
+            self.path = Path(path)
+
+        def get_chunks_by_ids(self, chunk_ids):
+            return [
+                {
+                    "id": chunk_id,
+                    "file_path": str(self.path.parent / f"{self.path.parent.name}-file{chunk_id}.py"),
+                    "content": f"chunk {self.path.parent.name}-{chunk_id}",
+                    "metadata": "{\"start_line\": 1, \"end_line\": 2}",
+                    "category": "code",
+                }
+                for chunk_id in chunk_ids
+            ]
+
+    import codexlens.semantic.embedder as embedder_module
+    from codexlens.search.chain_search import SearchStats
+
+    monkeypatch.setattr(
+        engine,
+        "_get_centralized_binary_searcher",
+        lambda root: FakeBinarySearcher(root),
+    )
+    monkeypatch.setattr(embedder_module, "get_embedder", lambda **_kwargs: FakeEmbedder())
+    monkeypatch.setattr(chain_search_module, "VectorMetadataStore", FakeVectorMetadataStore)
+
+    coarse_results, stage2_root = engine._stage1_binary_search(
+        "binary query",
+        [index_a, index_b],
+        coarse_k=5,
+        stats=SearchStats(),
+        index_root=index_a.parent,
+    )
+
+    assert stage2_root is None
+    scores_by_name = {Path(result.path).name: result.score for result in coarse_results}
+    assert scores_by_name["project-a-file1.py"] == pytest.approx(1.0 - (8.0 / 256.0))
+    assert scores_by_name["project-b-file1.py"] == pytest.approx(1.0 - (16.0 / 256.0))
+
+
+
+def test_collect_index_paths_includes_nested_registered_project_roots(
+    temp_paths: Path,
+) -> None:
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    engine = ChainSearchEngine(registry, mapper, config=Config(data_dir=temp_paths / "data"))
+
+    workspace_root = temp_paths / "workspace"
+    child_root = workspace_root / "packages" / "child"
+    ignored_root = workspace_root / "dist" / "generated"
+
+    workspace_index = mapper.source_to_index_db(workspace_root)
+    child_index = mapper.source_to_index_db(child_root)
+    ignored_index = mapper.source_to_index_db(ignored_root)
+
+    for index_path in (workspace_index, child_index, ignored_index):
+        index_path.parent.mkdir(parents=True, exist_ok=True)
+        index_path.write_text("", encoding="utf-8")
+
+    workspace_project = registry.register_project(
+        workspace_root,
+        mapper.source_to_index_dir(workspace_root),
+    )
+    child_project = registry.register_project(
+        child_root,
+        mapper.source_to_index_dir(child_root),
+    )
+    ignored_project = registry.register_project(
+        ignored_root,
+        mapper.source_to_index_dir(ignored_root),
+    )
+
+    registry.register_dir(
+        workspace_project.id,
+        workspace_root,
+        workspace_index,
+        depth=0,
+    )
+    registry.register_dir(
+        child_project.id,
+        child_root,
+        child_index,
+        depth=0,
+    )
+    registry.register_dir(
+        ignored_project.id,
+        ignored_root,
+        ignored_index,
+        depth=0,
+    )
+
+    collected = engine._collect_index_paths(workspace_index, depth=-1)
+
+    assert collected == [workspace_index, child_index]
+
+
+def test_collect_index_paths_respects_depth_for_nested_registered_project_roots(
+    temp_paths: Path,
+) -> None:
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    engine = ChainSearchEngine(registry, mapper, config=Config(data_dir=temp_paths / "data"))
+
+    workspace_root = temp_paths / "workspace"
+    direct_child_root = workspace_root / "apps"
+    deep_child_root = workspace_root / "packages" / "deep" / "child"
+
+    workspace_index = mapper.source_to_index_db(workspace_root)
+    direct_child_index = mapper.source_to_index_db(direct_child_root)
+    deep_child_index = mapper.source_to_index_db(deep_child_root)
+
+    for index_path in (workspace_index, direct_child_index, deep_child_index):
+        index_path.parent.mkdir(parents=True, exist_ok=True)
+        index_path.write_text("", encoding="utf-8")
+
+    workspace_project = registry.register_project(
+        workspace_root,
+        mapper.source_to_index_dir(workspace_root),
+    )
+    direct_child_project = registry.register_project(
+        direct_child_root,
+        mapper.source_to_index_dir(direct_child_root),
+    )
+    deep_child_project = registry.register_project(
+        deep_child_root,
+        mapper.source_to_index_dir(deep_child_root),
+    )
+
+    registry.register_dir(workspace_project.id, workspace_root, workspace_index, depth=0)
+    registry.register_dir(
+        direct_child_project.id,
+        direct_child_root,
+        direct_child_index,
+        depth=0,
+    )
+    registry.register_dir(
+        deep_child_project.id,
+        deep_child_root,
+        deep_child_index,
+        depth=0,
+    )
+
+    collected = engine._collect_index_paths(workspace_index, depth=1)
+
+    assert collected == [workspace_index, direct_child_index]
+
+
+def test_binary_rerank_cascade_search_merges_multiple_centralized_roots(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    import numpy as np
+
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False)
+    engine = ChainSearchEngine(registry, mapper, config=config)
+
+    root_a = temp_paths / "indexes" / "project-a"
+    root_b = temp_paths / "indexes" / "project-b"
+    for root in (root_a, root_b):
+        root.mkdir(parents=True, exist_ok=True)
+        (root / BINARY_VECTORS_MMAP_NAME).write_bytes(b"binary")
+        (root / VECTORS_META_DB_NAME).write_bytes(b"meta")
+
+    index_a = root_a / "src" / "_index.db"
+    index_b = root_b / "src" / "_index.db"
+    index_a.parent.mkdir(parents=True, exist_ok=True)
+    index_b.parent.mkdir(parents=True, exist_ok=True)
+    index_a.write_text("", encoding="utf-8")
+    index_b.write_text("", encoding="utf-8")
+
+    class FakeBinarySearcher:
+        def __init__(self, root: Path) -> None:
+            self.root = root
+            self.backend = "fastembed"
+            self.model = None
+            self.model_profile = "code"
+
+        def search(self, _query_dense, top_k: int):
+            return [(1, 8)] if self.root == root_a else [(2, 16)]
+
+    class FakeEmbedder:
+        def embed_to_numpy(self, _queries):
+            return np.ones((1, 4), dtype=np.float32)
+
+    class FakeVectorMetadataStore:
+        def __init__(self, path: Path) -> None:
+            self.path = Path(path)
+
+        def get_chunks_by_ids(self, chunk_ids):
+            return [
+                {
+                    "chunk_id": chunk_id,
+                    "file_path": str(self.path.parent / f"file{chunk_id}.py"),
+                    "content": f"chunk {chunk_id}",
+                    "metadata": "{}",
+                    "category": "code",
+                }
+                for chunk_id in chunk_ids
+            ]
+
+    import codexlens.semantic.embedder as embedder_module
+
+    monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: index_a)
+    monkeypatch.setattr(engine, "_collect_index_paths", lambda _start_index, _depth: [index_a, index_b])
+    monkeypatch.setattr(
+        engine,
+        "_get_centralized_binary_searcher",
+        lambda root: FakeBinarySearcher(root),
+    )
+    monkeypatch.setattr(embedder_module, "get_embedder", lambda **_kwargs: FakeEmbedder())
+    monkeypatch.setattr(chain_search_module, "VectorMetadataStore", FakeVectorMetadataStore)
+    monkeypatch.setattr(engine, "_cross_encoder_rerank", lambda _query, results, top_k: results[:top_k])
+    monkeypatch.setattr(engine, "search", lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("unexpected fallback")))
+
+    result = engine.binary_rerank_cascade_search(
+        "binary query",
+        index_a.parent,
+        k=5,
+        coarse_k=5,
+    )
+
+    assert len(result.results) == 2
+    assert {Path(item.path).name for item in result.results} == {"file1.py", "file2.py"}
+
+
+def test_dense_rerank_cascade_search_overfetches_and_applies_path_penalties(
+    monkeypatch: pytest.MonkeyPatch,
+    temp_paths: Path,
+) -> None:
+    import numpy as np
+    import codexlens.semantic.ann_index as ann_index_module
+
+    registry = RegistryStore(db_path=temp_paths / "registry.db")
+    registry.initialize()
+    mapper = PathMapper(index_root=temp_paths / "indexes")
+    config = Config(
+        data_dir=temp_paths / "data",
+        embedding_use_gpu=False,
+        reranker_top_k=3,
+        test_file_penalty=0.35,
+        generated_file_penalty=0.35,
+    )
+    engine = ChainSearchEngine(registry, mapper, config=config)
+
+    dense_root = temp_paths / "indexes" / "project"
+    dense_root.mkdir(parents=True, exist_ok=True)
+    (dense_root / VECTORS_HNSW_NAME).write_bytes(b"hnsw")
+
+    meta_db_path = dense_root / VECTORS_META_DB_NAME
+    conn = sqlite3.connect(meta_db_path)
+    conn.execute(
+        """
+        CREATE TABLE chunk_metadata (
+            chunk_id INTEGER PRIMARY KEY,
+            file_path TEXT NOT NULL,
+            content TEXT NOT NULL,
+            start_line INTEGER,
+            end_line INTEGER
+        )
+        """
+    )
+    conn.executemany(
+        """
+        INSERT INTO chunk_metadata (chunk_id, file_path, content, start_line, end_line)
+        VALUES (?, ?, ?, ?, ?)
+ """, + [ + ( + 1, + "project/tests/test_auth.py", + "def test_auth_flow():\n pass", + 1, + 2, + ), + ( + 2, + "project/src/auth.py", + "def auth_flow():\n return True", + 1, + 2, + ), + ( + 3, + "project/dist/bundle.js", + "function authFlow(){return true;}", + 1, + 1, + ), + ], + ) + conn.commit() + conn.close() + + index_path = dense_root / "src" / "_index.db" + index_path.parent.mkdir(parents=True, exist_ok=True) + index_path.write_text("", encoding="utf-8") + + class FakeANNIndex: + def __init__(self, root: Path, dim: int) -> None: + self.root = root + self.dim = dim + + @classmethod + def create_central(cls, *, index_root: Path, dim: int): + return cls(index_root, dim) + + def load(self) -> bool: + return True + + def count(self) -> int: + return 3 + + def search(self, _query_dense, top_k: int): + ids = [1, 2, 3][:top_k] + distances = [0.01, 0.02, 0.03][:top_k] + return ids, distances + + rerank_calls: list[int] = [] + + def fake_cross_encoder(_query: str, results: list[SearchResult], top_k: int): + rerank_calls.append(top_k) + return results[:top_k] + + monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: index_path) + monkeypatch.setattr(engine, "_collect_index_paths", lambda _start_index, _depth: [index_path]) + monkeypatch.setattr(engine, "_embed_dense_query", lambda *_args, **_kwargs: np.ones(4, dtype=np.float32)) + monkeypatch.setattr(engine, "_cross_encoder_rerank", fake_cross_encoder) + monkeypatch.setattr( + engine, + "search", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("unexpected fallback")), + ) + monkeypatch.setattr(ann_index_module, "ANNIndex", FakeANNIndex) + + result = engine.dense_rerank_cascade_search( + "auth", + index_path.parent, + k=1, + coarse_k=3, + ) + + assert rerank_calls == [3] + assert len(result.results) == 1 + assert result.results[0].path.endswith("src\\auth.py") or result.results[0].path.endswith("src/auth.py") + assert result.results[0].metadata == {} + + +def 
test_collect_query_feature_anchor_results_uses_explicit_file_hints( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + recorded_queries: list[str] = [] + + def fake_search(query: str, _source_path: Path, options: SearchOptions | None = None): + recorded_queries.append(query) + return ChainSearchResult( + query=query, + results=[ + SearchResult( + path="/repo/src/tools/smart-search.ts", + score=8.7, + excerpt="smart search path anchor", + ), + SearchResult( + path="/repo/src/tools/codex-lens-lsp.ts", + score=7.4, + excerpt="platform term overlap", + ), + ], + symbols=[], + stats=SearchStats(), + ) + + monkeypatch.setattr(engine, "search", fake_search) + + anchors = engine._collect_query_feature_anchor_results( + "parse CodexLens JSON output strip ANSI smart_search", + temp_paths, + SearchOptions(), + limit=4, + ) + + assert recorded_queries == ["smart search"] + assert [Path(result.path).name for result in anchors] == ["smart-search.ts"] + assert anchors[0].metadata["feature_query_anchor"] is True + assert anchors[0].metadata["feature_query_hint_tokens"] == ["smart", "search"] + + +def test_collect_query_feature_anchor_results_falls_back_to_full_lexical_query( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + recorded_calls: list[tuple[str, bool]] = [] + full_query = "EMBEDDING_BACKEND and RERANKER_BACKEND environment variables" + + def fake_search(query: str, _source_path: 
Path, options: SearchOptions | None = None): + recorded_calls.append((query, bool(options.inject_feature_anchors) if options else True)) + if query == full_query: + return ChainSearchResult( + query=query, + results=[ + SearchResult( + path="/repo/src/codexlens/env_config.py", + score=8.5, + excerpt="ENV vars", + ), + SearchResult( + path="/repo/src/codexlens/config.py", + score=8.1, + excerpt="backend config", + ), + ], + symbols=[], + stats=SearchStats(), + ) + + return ChainSearchResult( + query=query, + results=[ + SearchResult( + path="/repo/src/codexlens/env_config.py", + score=7.0, + excerpt="hint candidate", + ) + ], + symbols=[], + stats=SearchStats(), + ) + + monkeypatch.setattr(engine, "search", fake_search) + + anchors = engine._collect_query_feature_anchor_results( + full_query, + temp_paths, + SearchOptions(), + limit=2, + ) + + assert recorded_calls == [ + ("embedding backend", False), + ("reranker backend", False), + (full_query, False), + ] + assert [Path(result.path).name for result in anchors] == ["env_config.py", "config.py"] + assert anchors[0].metadata["feature_query_seed_kind"] == "lexical_query" + assert anchors[0].metadata["feature_query_hint"] == full_query + + +def test_stage3_cluster_prune_preserves_feature_query_anchors(temp_paths: Path) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + config.staged_clustering_strategy = "score" + engine = ChainSearchEngine(registry, mapper, config=config) + + anchor = SearchResult( + path="/repo/src/tools/smart-search.ts", + score=0.02, + excerpt="parse JSON output and strip ANSI", + metadata={ + "feature_query_anchor": True, + "feature_query_hint": "smart search", + "feature_query_hint_tokens": ["smart", "search"], + }, + ) + others = [ + SearchResult( + path=f"/repo/src/feature-{index}.ts", + score=0.9 - (0.05 * index), + 
excerpt="generic feature implementation", + ) + for index in range(6) + ] + + clustered = engine._stage3_cluster_prune( + [anchor, *others], + target_count=4, + query="parse CodexLens JSON output strip ANSI smart_search", + ) + + assert len(clustered) == 4 + assert any(Path(result.path).name == "smart-search.ts" for result in clustered) + + +def test_dense_rerank_cascade_search_interleaves_mixed_embedding_groups( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + import numpy as np + import codexlens.semantic.ann_index as ann_index_module + + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + root_a = temp_paths / "indexes" / "project-a" + root_b = temp_paths / "indexes" / "project-b" + for root in (root_a, root_b): + root.mkdir(parents=True, exist_ok=True) + (root / VECTORS_HNSW_NAME).write_bytes(b"hnsw") + + for meta_db_path, rows in ( + ( + root_a / VECTORS_META_DB_NAME, + [ + (1, str(root_a / "src" / "a.py"), "def a():\n return 1", 1, 2), + (3, str(root_a / "src" / "a2.py"), "def a2():\n return 2", 1, 2), + ], + ), + ( + root_b / VECTORS_META_DB_NAME, + [ + (2, str(root_b / "src" / "b.py"), "def b():\n return 3", 1, 2), + ], + ), + ): + conn = sqlite3.connect(meta_db_path) + conn.execute( + """ + CREATE TABLE chunk_metadata ( + chunk_id INTEGER PRIMARY KEY, + file_path TEXT NOT NULL, + content TEXT NOT NULL, + start_line INTEGER, + end_line INTEGER + ) + """ + ) + conn.executemany( + """ + INSERT INTO chunk_metadata (chunk_id, file_path, content, start_line, end_line) + VALUES (?, ?, ?, ?, ?) 
+ """, + rows, + ) + conn.commit() + conn.close() + + index_a = root_a / "src" / "_index.db" + index_b = root_b / "src" / "_index.db" + index_a.parent.mkdir(parents=True, exist_ok=True) + index_b.parent.mkdir(parents=True, exist_ok=True) + index_a.write_text("", encoding="utf-8") + index_b.write_text("", encoding="utf-8") + + class FakeANNIndex: + def __init__(self, index_path: Path, dim: int) -> None: + source = Path(index_path) + self.root = source if source.name != "_index.db" else source.parent + self.dim = dim + + @classmethod + def create_central(cls, *, index_root: Path, dim: int): + return cls(index_root, dim) + + def load(self) -> bool: + return True + + def count(self) -> int: + return 2 if self.root == root_a else 1 + + def search(self, _query_dense, top_k: int): + if self.root == root_a: + return [1, 3][:top_k], [0.01, 0.011][:top_k] + return [2][:top_k], [0.02][:top_k] + + monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: index_a) + monkeypatch.setattr(engine, "_collect_index_paths", lambda _start_index, _depth: [index_a, index_b]) + monkeypatch.setattr( + engine, + "_resolve_dense_embedding_settings", + lambda *, index_root: ( + ("fastembed", "code", False) + if Path(index_root) == root_a + else ("litellm", "qwen3-embedding-sf", False) + ), + ) + monkeypatch.setattr( + engine, + "_embed_dense_query", + lambda _query, *, index_root=None, query_cache=None: ( + np.ones(4, dtype=np.float32) + if Path(index_root) == root_a + else np.ones(8, dtype=np.float32) + ), + ) + monkeypatch.setattr(engine, "_cross_encoder_rerank", lambda _query, results, top_k: results[:top_k]) + monkeypatch.setattr( + engine, + "search", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("unexpected fallback")), + ) + monkeypatch.setattr(ann_index_module, "ANNIndex", FakeANNIndex) + + result = engine.dense_rerank_cascade_search( + "route query", + index_a.parent, + k=2, + coarse_k=2, + ) + + assert [Path(item.path).name for item in result.results] 
== ["a.py", "b.py"] + + +def test_dense_rerank_cascade_search_reuses_cached_dense_indexes( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + import numpy as np + import codexlens.semantic.ann_index as ann_index_module + + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + dense_root = temp_paths / "indexes" / "project" + dense_root.mkdir(parents=True, exist_ok=True) + (dense_root / VECTORS_HNSW_NAME).write_bytes(b"hnsw") + + meta_db_path = dense_root / VECTORS_META_DB_NAME + conn = sqlite3.connect(meta_db_path) + conn.execute( + """ + CREATE TABLE chunk_metadata ( + chunk_id INTEGER PRIMARY KEY, + file_path TEXT NOT NULL, + content TEXT NOT NULL, + start_line INTEGER, + end_line INTEGER + ) + """ + ) + conn.execute( + "INSERT INTO chunk_metadata (chunk_id, file_path, content, start_line, end_line) VALUES (?, ?, ?, ?, ?)", + (1, str((temp_paths / "src" / "impl.py").resolve()), "def impl():\n return 1", 1, 2), + ) + conn.commit() + conn.close() + + index_path = dense_root / "src" / "_index.db" + index_path.parent.mkdir(parents=True, exist_ok=True) + index_path.write_text("", encoding="utf-8") + + create_calls: list[tuple[Path, int]] = [] + + class FakeANNIndex: + def __init__(self, root: Path, dim: int) -> None: + self.root = root + self.dim = dim + + @classmethod + def create_central(cls, *, index_root: Path, dim: int): + create_calls.append((Path(index_root), int(dim))) + return cls(index_root, dim) + + def load(self) -> bool: + return True + + def count(self) -> int: + return 1 + + def search(self, _query_dense, top_k: int): + return [1][:top_k], [0.01][:top_k] + + monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: index_path) + monkeypatch.setattr(engine, "_collect_index_paths", lambda _start_index, 
_depth: [index_path]) + monkeypatch.setattr(engine, "_embed_dense_query", lambda *_args, **_kwargs: np.ones(4, dtype=np.float32)) + monkeypatch.setattr(engine, "_cross_encoder_rerank", lambda _query, results, top_k: results[:top_k]) + monkeypatch.setattr( + engine, + "search", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("unexpected fallback")), + ) + monkeypatch.setattr(ann_index_module, "ANNIndex", FakeANNIndex) + + first = engine.dense_rerank_cascade_search("route query", index_path.parent, k=1, coarse_k=1) + second = engine.dense_rerank_cascade_search("route query", index_path.parent, k=1, coarse_k=1) + + assert len(first.results) == 1 + assert len(second.results) == 1 + assert create_calls == [(dense_root, 4)] + + +def test_dense_rerank_cascade_search_short_circuits_lexical_priority_queries( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data") + engine = ChainSearchEngine(registry, mapper, config=config) + + expected = ChainSearchResult( + query="embedding backend fastembed local litellm api config", + results=[SearchResult(path="src/config.py", score=0.9, excerpt="embedding_backend = ...")], + symbols=[], + stats=SearchStats(dirs_searched=3, files_matched=1, time_ms=12.5), + ) + search_calls: list[tuple[str, Path, SearchOptions | None]] = [] + + def fake_search(query: str, source_path: Path, options: SearchOptions | None = None): + search_calls.append((query, source_path, options)) + return expected + + monkeypatch.setattr(engine, "search", fake_search) + monkeypatch.setattr( + engine, + "_find_start_index", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("dense path should not run")), + ) + monkeypatch.setattr( + engine, + "_embed_dense_query", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("dense 
query should not run")), + ) + monkeypatch.setattr( + engine, + "_cross_encoder_rerank", + lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("rerank should not run")), + ) + + options = SearchOptions( + depth=2, + max_workers=3, + limit_per_dir=4, + total_limit=7, + include_symbols=True, + files_only=False, + code_only=True, + exclude_extensions=["md"], + inject_feature_anchors=False, + ) + + result = engine.dense_rerank_cascade_search( + "embedding backend fastembed local litellm api config", + temp_paths / "workspace", + k=5, + coarse_k=50, + options=options, + ) + + assert result is not expected + assert result.results == expected.results + assert result.related_results == expected.related_results + assert result.symbols == [] + assert result.stats == expected.stats + assert len(search_calls) == 1 + called_query, called_source_path, lexical_options = search_calls[0] + assert called_query == "embedding backend fastembed local litellm api config" + assert called_source_path == temp_paths / "workspace" + assert lexical_options is not None + assert lexical_options.depth == 2 + assert lexical_options.max_workers == 3 + assert lexical_options.limit_per_dir == 10 + assert lexical_options.total_limit == 20 + assert lexical_options.include_symbols is False + assert lexical_options.enable_vector is False + assert lexical_options.hybrid_mode is False + assert lexical_options.enable_cascade is False + assert lexical_options.code_only is True + assert lexical_options.exclude_extensions == ["md"] + assert lexical_options.inject_feature_anchors is False + + +def test_cross_encoder_rerank_reuses_cached_reranker_instance( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config( + data_dir=temp_paths / "data", + enable_cross_encoder_rerank=True, + reranker_backend="onnx", + reranker_use_gpu=False, 
+ ) + engine = ChainSearchEngine(registry, mapper, config=config) + + calls: dict[str, object] = {"check": [], "get": []} + + class DummyReranker: + def score_pairs(self, pairs, batch_size=32): + _ = batch_size + return [1.0 for _ in pairs] + + def fake_check_reranker_available(backend: str): + calls["check"].append(backend) + return True, None + + def fake_get_reranker(*, backend: str, model_name=None, device=None, **kwargs): + calls["get"].append( + { + "backend": backend, + "model_name": model_name, + "device": device, + "kwargs": kwargs, + } + ) + return DummyReranker() + + monkeypatch.setattr( + "codexlens.semantic.reranker.check_reranker_available", + fake_check_reranker_available, + ) + monkeypatch.setattr( + "codexlens.semantic.reranker.get_reranker", + fake_get_reranker, + ) + + results = [ + SearchResult(path=str((temp_paths / f"file_{idx}.py").resolve()), score=1.0 / (idx + 1), excerpt=f"def fn_{idx}(): pass") + for idx in range(3) + ] + + first = engine._cross_encoder_rerank("find function", results, top_k=2) + second = engine._cross_encoder_rerank("find function", results, top_k=2) + + assert len(first) == len(second) == len(results) + assert calls["check"] == ["onnx"] + assert len(calls["get"]) == 1 + get_call = calls["get"][0] + assert isinstance(get_call, dict) + assert get_call["backend"] == "onnx" + assert get_call["kwargs"]["use_gpu"] is False + + +def test_collect_binary_coarse_candidates_interleaves_mixed_dense_fallback_groups( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + import numpy as np + import codexlens.semantic.ann_index as ann_index_module + + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + root_a = temp_paths / "indexes" / "project-a" + root_b = temp_paths / "indexes" / "project-b" + 
for root in (root_a, root_b): + root.mkdir(parents=True, exist_ok=True) + (root / VECTORS_HNSW_NAME).write_bytes(b"hnsw") + + index_a = root_a / "src" / "_index.db" + index_b = root_b / "src" / "_index.db" + index_a.parent.mkdir(parents=True, exist_ok=True) + index_b.parent.mkdir(parents=True, exist_ok=True) + index_a.write_text("", encoding="utf-8") + index_b.write_text("", encoding="utf-8") + + class FakeANNIndex: + def __init__(self, index_path: Path, dim: int) -> None: + source = Path(index_path) + self.root = source if source.name != "_index.db" else source.parent + self.dim = dim + + @classmethod + def create_central(cls, *, index_root: Path, dim: int): + return cls(index_root, dim) + + def load(self) -> bool: + return True + + def count(self) -> int: + return 2 if self.root == root_a else 1 + + def search(self, _query_dense, top_k: int): + if self.root == root_a: + return [1, 3][:top_k], [0.01, 0.011][:top_k] + return [2][:top_k], [0.02][:top_k] + + monkeypatch.setattr( + engine, + "_resolve_dense_embedding_settings", + lambda *, index_root: ( + ("fastembed", "code", False) + if Path(index_root) == root_a + else ("litellm", "qwen3-embedding-sf", False) + ), + ) + monkeypatch.setattr( + engine, + "_embed_dense_query", + lambda _query, *, index_root=None, query_cache=None: ( + np.ones(4, dtype=np.float32) + if Path(index_root) == root_a + else np.ones(8, dtype=np.float32) + ), + ) + monkeypatch.setattr(ann_index_module, "ANNIndex", FakeANNIndex) + + coarse_candidates, used_centralized, using_dense_fallback, stage2_index_root = ( + engine._collect_binary_coarse_candidates( + "route query", + [index_a, index_b], + coarse_k=2, + stats=SearchStats(), + index_root=index_a.parent, + allow_dense_fallback=True, + ) + ) + + assert used_centralized is False + assert using_dense_fallback is True + assert stage2_index_root is None + assert coarse_candidates == [ + (1, 0.01, root_a), + (2, 0.02, root_b), + ] + + +def 
test_cross_encoder_rerank_deduplicates_duplicate_paths_before_reranking( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + captured: dict[str, object] = {} + + monkeypatch.setattr( + "codexlens.semantic.reranker.check_reranker_available", + lambda _backend: (True, None), + ) + monkeypatch.setattr( + "codexlens.semantic.reranker.get_reranker", + lambda **_kwargs: object(), + ) + + def fake_cross_encoder_rerank( + *, + query: str, + results: list[SearchResult], + reranker, + top_k: int = 50, + batch_size: int = 32, + chunk_type_weights=None, + test_file_penalty: float = 0.0, + ) -> list[SearchResult]: + captured["query"] = query + captured["paths"] = [item.path for item in results] + captured["scores"] = [float(item.score) for item in results] + captured["top_k"] = top_k + captured["batch_size"] = batch_size + captured["chunk_type_weights"] = chunk_type_weights + captured["test_file_penalty"] = test_file_penalty + _ = reranker + return results[:top_k] + + monkeypatch.setattr( + "codexlens.search.ranking.cross_encoder_rerank", + fake_cross_encoder_rerank, + ) + + reranked = engine._cross_encoder_rerank( + "semantic auth query", + [ + SearchResult(path="/repo/src/router.py", score=0.91, excerpt="chunk 1"), + SearchResult(path="/repo/src/router.py", score=0.42, excerpt="chunk 2"), + SearchResult(path="/repo/src/config.py", score=0.73, excerpt="chunk 3"), + ], + top_k=5, + ) + + assert captured["query"] == "semantic auth query" + assert captured["paths"] == ["/repo/src/router.py", "/repo/src/config.py"] + assert captured["scores"] == pytest.approx([0.91, 0.73]) + assert captured["top_k"] == 5 + assert len(reranked) == 2 + + +def 
test_binary_cascade_search_merges_multiple_centralized_roots( + monkeypatch: pytest.MonkeyPatch, + temp_paths: Path, +) -> None: + import sqlite3 + import numpy as np + + registry = RegistryStore(db_path=temp_paths / "registry.db") + registry.initialize() + mapper = PathMapper(index_root=temp_paths / "indexes") + config = Config(data_dir=temp_paths / "data", embedding_use_gpu=False) + engine = ChainSearchEngine(registry, mapper, config=config) + + root_a = temp_paths / "indexes" / "project-a" + root_b = temp_paths / "indexes" / "project-b" + source_db_a = root_a / "source-a.db" + source_db_b = root_b / "source-b.db" + + for root, source_db, chunk_id in ((root_a, source_db_a, 1), (root_b, source_db_b, 2)): + root.mkdir(parents=True, exist_ok=True) + (root / BINARY_VECTORS_MMAP_NAME).write_bytes(b"binary") + (root / VECTORS_META_DB_NAME).write_bytes(b"meta") + conn = sqlite3.connect(source_db) + conn.execute("CREATE TABLE semantic_chunks (id INTEGER PRIMARY KEY, embedding_dense BLOB)") + conn.execute( + "INSERT INTO semantic_chunks (id, embedding_dense) VALUES (?, ?)", + (chunk_id, np.ones(4, dtype=np.float32).tobytes()), + ) + conn.commit() + conn.close() + + index_a = root_a / "src" / "_index.db" + index_b = root_b / "src" / "_index.db" + index_a.parent.mkdir(parents=True, exist_ok=True) + index_b.parent.mkdir(parents=True, exist_ok=True) + index_a.write_text("", encoding="utf-8") + index_b.write_text("", encoding="utf-8") + + class FakeBinarySearcher: + def __init__(self, root: Path) -> None: + self.root = root + self.backend = "fastembed" + self.model = None + self.model_profile = "code" + + def search(self, _query_dense, top_k: int): + return [(1, 8)] if self.root == root_a else [(2, 16)] + + class FakeEmbedder: + def embed_to_numpy(self, _queries): + return np.ones((1, 4), dtype=np.float32) + + class FakeVectorMetadataStore: + def __init__(self, path: Path) -> None: + self.path = Path(path) + + def get_chunks_by_ids(self, chunk_ids): + source_db = source_db_a 
if self.path.parent == root_a else source_db_b + return [ + { + "chunk_id": chunk_id, + "file_path": str(self.path.parent / f"file{chunk_id}.py"), + "content": f"chunk {chunk_id}", + "source_index_db": str(source_db), + } + for chunk_id in chunk_ids + ] + + import codexlens.semantic.embedder as embedder_module + + monkeypatch.setattr(engine, "_find_start_index", lambda _source_path: index_a) + monkeypatch.setattr(engine, "_collect_index_paths", lambda _start_index, _depth: [index_a, index_b]) + monkeypatch.setattr( + engine, + "_get_centralized_binary_searcher", + lambda root: FakeBinarySearcher(root), + ) + monkeypatch.setattr(embedder_module, "get_embedder", lambda **_kwargs: FakeEmbedder()) + monkeypatch.setattr(chain_search_module, "VectorMetadataStore", FakeVectorMetadataStore) + monkeypatch.setattr( + engine, + "_embed_dense_query", + lambda _query, *, index_root=None, query_cache=None: np.ones(4, dtype=np.float32), + ) + monkeypatch.setattr(engine, "search", lambda *_args, **_kwargs: (_ for _ in ()).throw(AssertionError("unexpected fallback"))) + + result = engine.binary_cascade_search( + "binary query", + index_a.parent, + k=5, + coarse_k=5, + ) + + assert len(result.results) == 2 + assert {Path(item.path).name for item in result.results} == {"file1.py", "file2.py"} diff --git a/codex-lens/tests/test_compare_ccw_smart_search_stage2.py b/codex-lens/tests/test_compare_ccw_smart_search_stage2.py new file mode 100644 index 00000000..901d1cd9 --- /dev/null +++ b/codex-lens/tests/test_compare_ccw_smart_search_stage2.py @@ -0,0 +1,350 @@ +from __future__ import annotations + +import importlib.util +import json +import sys +from pathlib import Path +from types import SimpleNamespace + + +MODULE_PATH = Path(__file__).resolve().parents[1] / "benchmarks" / "compare_ccw_smart_search_stage2.py" +MODULE_NAME = "compare_ccw_smart_search_stage2_test_module" +MODULE_SPEC = importlib.util.spec_from_file_location(MODULE_NAME, MODULE_PATH) +assert MODULE_SPEC is not None and 
MODULE_SPEC.loader is not None +benchmark = importlib.util.module_from_spec(MODULE_SPEC) +sys.modules[MODULE_NAME] = benchmark +MODULE_SPEC.loader.exec_module(benchmark) + + +class _FakeChainResult: + def __init__(self, paths: list[str]) -> None: + self.results = [SimpleNamespace(path=path) for path in paths] + + +class _FakeEngine: + def __init__( + self, + *, + search_paths: list[str] | None = None, + cascade_paths: list[str] | None = None, + ) -> None: + self.search_paths = search_paths or [] + self.cascade_paths = cascade_paths or [] + self.search_calls: list[dict[str, object]] = [] + self.cascade_calls: list[dict[str, object]] = [] + + def search(self, query: str, source_path: Path, options: object) -> _FakeChainResult: + self.search_calls.append( + { + "query": query, + "source_path": source_path, + "options": options, + } + ) + return _FakeChainResult(self.search_paths) + + def cascade_search( + self, + query: str, + source_path: Path, + *, + k: int, + coarse_k: int, + options: object, + strategy: str, + ) -> _FakeChainResult: + self.cascade_calls.append( + { + "query": query, + "source_path": source_path, + "k": k, + "coarse_k": coarse_k, + "options": options, + "strategy": strategy, + } + ) + return _FakeChainResult(self.cascade_paths) + + +def test_strategy_specs_include_baselines_before_stage2_modes() -> None: + specs = benchmark._strategy_specs( + ["realtime", "static_global_graph"], + include_dense_baseline=True, + baseline_methods=["auto", "fts", "hybrid"], + ) + + assert [spec.strategy_key for spec in specs] == [ + "auto", + "fts", + "hybrid", + "dense_rerank", + "staged:realtime", + "staged:static_global_graph", + ] + + +def test_select_effective_method_matches_cli_auto_routing() -> None: + assert benchmark._select_effective_method("find_descendant_project_roots", "auto") == "fts" + assert benchmark._select_effective_method("build dist artifact output", "auto") == "fts" + assert benchmark._select_effective_method("embedding backend fastembed local 
litellm api config", "auto") == "fts" + assert benchmark._select_effective_method("get_reranker factory onnx backend selection", "auto") == "fts" + assert benchmark._select_effective_method("how does the authentication flow work", "auto") == "dense_rerank" + assert benchmark._select_effective_method("how smart_search keyword routing works", "auto") == "hybrid" + + +def test_filter_dataset_by_query_match_uses_case_insensitive_substring() -> None: + dataset = [ + {"query": "embedding backend fastembed local litellm api config", "relevant_paths": ["a"]}, + {"query": "get_reranker factory onnx backend selection", "relevant_paths": ["b"]}, + {"query": "how does smart search route keyword queries", "relevant_paths": ["c"]}, + ] + + filtered = benchmark._filter_dataset_by_query_match(dataset, "BACKEND") + assert [item["query"] for item in filtered] == [ + "embedding backend fastembed local litellm api config", + "get_reranker factory onnx backend selection", + ] + + narrow_filtered = benchmark._filter_dataset_by_query_match(dataset, "FASTEMBED") + assert [item["query"] for item in narrow_filtered] == [ + "embedding backend fastembed local litellm api config", + ] + + unfiltered = benchmark._filter_dataset_by_query_match(dataset, None) + assert [item["query"] for item in unfiltered] == [item["query"] for item in dataset] + + +def test_apply_query_limit_runs_after_filtering() -> None: + dataset = [ + {"query": "executeHybridMode dense_rerank semantic smart_search", "relevant_paths": ["a"]}, + {"query": "embedding backend fastembed local litellm api config", "relevant_paths": ["b"]}, + {"query": "reranker backend onnx api legacy configuration", "relevant_paths": ["c"]}, + ] + + filtered = benchmark._filter_dataset_by_query_match(dataset, "backend") + limited = benchmark._apply_query_limit(filtered, 1) + + assert [item["query"] for item in limited] == [ + "embedding backend fastembed local litellm api config", + ] + + +def 
test_make_progress_payload_reports_partial_completion() -> None: + args = SimpleNamespace( + queries_file=Path("queries.jsonl"), + k=10, + coarse_k=100, + ) + strategy_specs = [ + benchmark.StrategySpec(strategy_key="auto", strategy="auto", stage2_mode=None), + benchmark.StrategySpec(strategy_key="dense_rerank", strategy="dense_rerank", stage2_mode=None), + ] + evaluations = [ + benchmark.QueryEvaluation( + query="embedding backend fastembed local litellm api config", + intent="config", + notes=None, + relevant_paths=["codex-lens/src/codexlens/config.py"], + runs={ + "auto": benchmark.StrategyRun( + strategy_key="auto", + strategy="auto", + stage2_mode=None, + effective_method="fts", + execution_method="fts", + latency_ms=123.0, + topk_paths=["config.py"], + first_hit_rank=1, + hit_at_k=True, + recall_at_k=1.0, + generated_artifact_count=0, + test_file_count=0, + error=None, + ) + }, + ) + ] + + payload = benchmark._make_progress_payload( + args=args, + source_root=Path("D:/repo"), + strategy_specs=strategy_specs, + evaluations=evaluations, + query_index=1, + total_queries=3, + run_index=2, + total_runs=6, + current_query="embedding backend fastembed local litellm api config", + current_strategy_key="complete", + ) + + assert payload["status"] == "running" + assert payload["progress"]["completed_queries"] == 1 + assert payload["progress"]["completed_runs"] == 2 + assert payload["progress"]["total_runs"] == 6 + assert payload["strategy_keys"] == ["auto", "dense_rerank"] + assert payload["evaluations"][0]["runs"]["auto"]["effective_method"] == "fts" + + +def test_write_final_outputs_updates_progress_snapshot(tmp_path: Path) -> None: + output_path = tmp_path / "results.json" + progress_path = tmp_path / "progress.json" + payload = { + "status": "completed", + "query_count": 1, + "strategies": {"auto": {"effective_methods": {"fts": 1}}}, + } + + benchmark._write_final_outputs( + output_path=output_path, + progress_output=progress_path, + payload=payload, + ) + + assert 
json.loads(output_path.read_text(encoding="utf-8")) == payload + assert json.loads(progress_path.read_text(encoding="utf-8")) == payload + + +def test_build_parser_defaults_reranker_gpu_to_disabled() -> None: + parser = benchmark.build_parser() + args = parser.parse_args([]) + + assert args.embedding_use_gpu is False + assert args.reranker_use_gpu is False + assert args.reranker_model == benchmark.DEFAULT_LOCAL_ONNX_RERANKER_MODEL + + +def test_build_strategy_runtime_clones_config(monkeypatch, tmp_path: Path) -> None: + class _FakeRegistry: + def __init__(self) -> None: + self.initialized = False + + def initialize(self) -> None: + self.initialized = True + + class _FakeMapper: + pass + + class _FakeEngine: + def __init__(self, *, registry, mapper, config) -> None: + self.registry = registry + self.mapper = mapper + self.config = config + + monkeypatch.setattr(benchmark, "RegistryStore", _FakeRegistry) + monkeypatch.setattr(benchmark, "PathMapper", _FakeMapper) + monkeypatch.setattr(benchmark, "ChainSearchEngine", _FakeEngine) + + base_config = benchmark.Config(data_dir=tmp_path, reranker_use_gpu=False) + strategy_spec = benchmark.StrategySpec(strategy_key="dense_rerank", strategy="dense_rerank", stage2_mode=None) + + runtime = benchmark._build_strategy_runtime(base_config, strategy_spec) + + assert runtime.strategy_spec == strategy_spec + assert runtime.config is not base_config + assert runtime.config.reranker_use_gpu is False + assert runtime.registry.initialized is True + assert runtime.engine.config is runtime.config + + +def test_run_strategy_routes_auto_keyword_queries_to_fts_search() -> None: + engine = _FakeEngine( + search_paths=[ + "D:/repo/src/codexlens/storage/registry.py", + "D:/repo/build/lib/codexlens/storage/registry.py", + ] + ) + config = SimpleNamespace(cascade_strategy="staged", staged_stage2_mode="realtime") + relevant = {benchmark._normalize_path_key("D:/repo/src/codexlens/storage/registry.py")} + + run = benchmark._run_strategy( + engine, + 
config, + strategy_spec=benchmark.StrategySpec(strategy_key="auto", strategy="auto", stage2_mode=None), + query="find_descendant_project_roots", + source_path=Path("D:/repo"), + k=5, + coarse_k=20, + relevant=relevant, + ) + + assert len(engine.search_calls) == 1 + assert len(engine.cascade_calls) == 0 + assert run.effective_method == "fts" + assert run.execution_method == "fts" + assert run.hit_at_k is True + assert run.generated_artifact_count == 1 + assert run.test_file_count == 0 + + +def test_run_strategy_uses_cascade_for_dense_rerank_and_restores_config() -> None: + engine = _FakeEngine(cascade_paths=["D:/repo/src/tools/smart-search.ts"]) + config = SimpleNamespace(cascade_strategy="staged", staged_stage2_mode="static_global_graph") + relevant = {benchmark._normalize_path_key("D:/repo/src/tools/smart-search.ts")} + + run = benchmark._run_strategy( + engine, + config, + strategy_spec=benchmark.StrategySpec( + strategy_key="dense_rerank", + strategy="dense_rerank", + stage2_mode=None, + ), + query="how does smart search route keyword queries", + source_path=Path("D:/repo"), + k=5, + coarse_k=20, + relevant=relevant, + ) + + assert len(engine.search_calls) == 0 + assert len(engine.cascade_calls) == 1 + assert engine.cascade_calls[0]["strategy"] == "dense_rerank" + assert run.effective_method == "dense_rerank" + assert run.execution_method == "cascade" + assert run.hit_at_k is True + assert config.cascade_strategy == "staged" + assert config.staged_stage2_mode == "static_global_graph" + + +def test_summarize_runs_tracks_effective_method_and_artifact_pressure() -> None: + summary = benchmark._summarize_runs( + [ + benchmark.StrategyRun( + strategy_key="auto", + strategy="auto", + stage2_mode=None, + effective_method="fts", + execution_method="fts", + latency_ms=10.0, + topk_paths=["a"], + first_hit_rank=1, + hit_at_k=True, + recall_at_k=1.0, + generated_artifact_count=1, + test_file_count=0, + error=None, + ), + benchmark.StrategyRun( + strategy_key="auto", + 
strategy="auto", + stage2_mode=None, + effective_method="hybrid", + execution_method="hybrid", + latency_ms=30.0, + topk_paths=["b"], + first_hit_rank=None, + hit_at_k=False, + recall_at_k=0.0, + generated_artifact_count=0, + test_file_count=2, + error=None, + ), + ] + ) + + assert summary["effective_methods"] == {"fts": 1, "hybrid": 1} + assert summary["runs_with_generated_artifacts"] == 1 + assert summary["runs_with_test_files"] == 1 + assert summary["avg_generated_artifact_count"] == 0.5 + assert summary["avg_test_file_count"] == 1.0 diff --git a/codex-lens/tests/test_config_search_env_overrides.py b/codex-lens/tests/test_config_search_env_overrides.py new file mode 100644 index 00000000..f49d2880 --- /dev/null +++ b/codex-lens/tests/test_config_search_env_overrides.py @@ -0,0 +1,83 @@ +"""Unit tests for Config .env overrides for final search ranking penalties.""" + +from __future__ import annotations + +import tempfile +from pathlib import Path + +import pytest + +from codexlens.config import Config + + +@pytest.fixture +def temp_config_dir() -> Path: + """Create temporary directory for config data_dir.""" + tmpdir = tempfile.TemporaryDirectory(ignore_cleanup_errors=True) + yield Path(tmpdir.name) + try: + tmpdir.cleanup() + except (PermissionError, OSError): + pass + + +def test_search_penalty_env_overrides_apply(temp_config_dir: Path) -> None: + config = Config(data_dir=temp_config_dir) + + env_path = temp_config_dir / ".env" + env_path.write_text( + "\n".join( + [ + "TEST_FILE_PENALTY=0.25", + "GENERATED_FILE_PENALTY=0.4", + "", + ] + ), + encoding="utf-8", + ) + + config.load_settings() + + assert config.test_file_penalty == 0.25 + assert config.generated_file_penalty == 0.4 + + +def test_reranker_gpu_env_override_apply(temp_config_dir: Path) -> None: + config = Config(data_dir=temp_config_dir) + + env_path = temp_config_dir / ".env" + env_path.write_text( + "\n".join( + [ + "RERANKER_USE_GPU=false", + "", + ] + ), + encoding="utf-8", + ) + + 
config.load_settings() + + assert config.reranker_use_gpu is False + + +def test_search_penalty_env_overrides_invalid_ignored(temp_config_dir: Path) -> None: + config = Config(data_dir=temp_config_dir) + + env_path = temp_config_dir / ".env" + env_path.write_text( + "\n".join( + [ + "TEST_FILE_PENALTY=oops", + "GENERATED_FILE_PENALTY=nope", + "", + ] + ), + encoding="utf-8", + ) + + config.load_settings() + + assert config.test_file_penalty == 0.15 + assert config.generated_file_penalty == 0.35 + assert config.reranker_use_gpu is True diff --git a/codex-lens/tests/test_embedding_status_root_model.py b/codex-lens/tests/test_embedding_status_root_model.py new file mode 100644 index 00000000..7314d205 --- /dev/null +++ b/codex-lens/tests/test_embedding_status_root_model.py @@ -0,0 +1,203 @@ +import gc +import shutil +import sqlite3 +import tempfile +import time +from pathlib import Path + +import pytest + +import codexlens.cli.embedding_manager as embedding_manager +from codexlens.cli.embedding_manager import get_embedding_stats_summary, get_embeddings_status + + +@pytest.fixture +def status_temp_dir() -> Path: + temp_path = Path(tempfile.mkdtemp()) + try: + yield temp_path + finally: + gc.collect() + for _ in range(5): + try: + if temp_path.exists(): + shutil.rmtree(temp_path) + break + except PermissionError: + time.sleep(0.1) + + +def _create_index_db(index_path: Path, files: list[str], embedded_files: list[str] | None = None) -> None: + index_path.parent.mkdir(parents=True, exist_ok=True) + with sqlite3.connect(index_path) as conn: + cursor = conn.cursor() + cursor.execute( + """ + CREATE TABLE files ( + id INTEGER PRIMARY KEY, + path TEXT NOT NULL UNIQUE, + content TEXT, + language TEXT, + hash TEXT + ) + """ + ) + cursor.executemany( + "INSERT INTO files (path, content, language, hash) VALUES (?, ?, ?, ?)", + [(file_path, "", "python", f"hash-{idx}") for idx, file_path in enumerate(files)], + ) + + if embedded_files is not None: + cursor.execute( + 
""" + CREATE TABLE semantic_chunks ( + id INTEGER PRIMARY KEY, + file_path TEXT NOT NULL, + content TEXT, + embedding BLOB, + metadata TEXT, + category TEXT + ) + """ + ) + cursor.executemany( + "INSERT INTO semantic_chunks (file_path, content, embedding, metadata, category) VALUES (?, ?, ?, ?, ?)", + [(file_path, "chunk", b"vec", "{}", "code") for file_path in embedded_files], + ) + conn.commit() + + +def _create_vectors_meta_db(meta_path: Path, embedded_files: list[str], binary_vector_count: int = 0) -> None: + meta_path.parent.mkdir(parents=True, exist_ok=True) + with sqlite3.connect(meta_path) as conn: + cursor = conn.cursor() + cursor.execute( + """ + CREATE TABLE chunk_metadata ( + chunk_id INTEGER PRIMARY KEY, + file_path TEXT NOT NULL, + content TEXT, + start_line INTEGER, + end_line INTEGER, + category TEXT, + metadata TEXT, + source_index_db TEXT + ) + """ + ) + cursor.execute( + """ + CREATE TABLE binary_vectors ( + chunk_id INTEGER PRIMARY KEY, + vector BLOB NOT NULL + ) + """ + ) + cursor.executemany( + """ + INSERT INTO chunk_metadata ( + chunk_id, file_path, content, start_line, end_line, category, metadata, source_index_db + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?) 
+ """, + [ + (idx, file_path, "chunk", 1, 1, "code", "{}", str(meta_path.parent / "_index.db")) + for idx, file_path in enumerate(embedded_files, start=1) + ], + ) + cursor.executemany( + "INSERT INTO binary_vectors (chunk_id, vector) VALUES (?, ?)", + [(idx, b"\x01") for idx in range(1, binary_vector_count + 1)], + ) + conn.commit() + + +def test_root_status_does_not_inherit_child_embeddings( + monkeypatch: pytest.MonkeyPatch, status_temp_dir: Path +) -> None: + workspace = status_temp_dir / "workspace" + workspace.mkdir() + _create_index_db(workspace / "_index.db", ["a.py", "b.py"]) + _create_index_db(workspace / "child" / "_index.db", ["child.py"], embedded_files=["child.py"]) + + monkeypatch.setattr( + embedding_manager, + "_get_model_info_from_index", + lambda index_path: { + "model_profile": "fast", + "model_name": "unit-test-model", + "embedding_dim": 384, + "backend": "fastembed", + "created_at": "2026-03-13T00:00:00Z", + "updated_at": "2026-03-13T00:00:00Z", + } if index_path.parent.name == "child" else None, + ) + + status = get_embeddings_status(workspace) + assert status["success"] is True + + result = status["result"] + assert result["coverage_percent"] == 0.0 + assert result["files_with_embeddings"] == 0 + assert result["root"]["has_embeddings"] is False + assert result["model_info"] is None + assert result["subtree"]["indexes_with_embeddings"] == 1 + assert result["subtree"]["coverage_percent"] > 0 + + +def test_root_status_uses_validated_centralized_metadata(status_temp_dir: Path) -> None: + workspace = status_temp_dir / "workspace" + workspace.mkdir() + _create_index_db(workspace / "_index.db", ["a.py", "b.py"]) + _create_vectors_meta_db(workspace / "_vectors_meta.db", ["a.py"]) + (workspace / "_vectors.hnsw").write_bytes(b"hnsw") + + status = get_embeddings_status(workspace) + assert status["success"] is True + + result = status["result"] + assert result["coverage_percent"] == 50.0 + assert result["files_with_embeddings"] == 1 + assert 
result["total_chunks"] == 1 + assert result["root"]["has_embeddings"] is True + assert result["root"]["storage_mode"] == "centralized" + assert result["centralized"]["dense_ready"] is True + assert result["centralized"]["usable"] is True + + +def test_embedding_stats_summary_skips_ignored_artifact_indexes(status_temp_dir: Path) -> None: + workspace = status_temp_dir / "workspace" + workspace.mkdir() + _create_index_db(workspace / "_index.db", ["root.py"]) + _create_index_db(workspace / "src" / "_index.db", ["src.py"]) + _create_index_db(workspace / "dist" / "_index.db", ["bundle.py"], embedded_files=["bundle.py"]) + _create_index_db(workspace / ".workflow" / "_index.db", ["trace.py"], embedded_files=["trace.py"]) + + summary = get_embedding_stats_summary(workspace) + + assert summary["success"] is True + result = summary["result"] + assert result["total_indexes"] == 2 + assert {Path(item["path"]).relative_to(workspace).as_posix() for item in result["indexes"]} == { + "_index.db", + "src/_index.db", + } + + +def test_root_status_ignores_empty_centralized_artifacts(status_temp_dir: Path) -> None: + workspace = status_temp_dir / "workspace" + workspace.mkdir() + _create_index_db(workspace / "_index.db", ["a.py", "b.py"]) + _create_vectors_meta_db(workspace / "_vectors_meta.db", []) + (workspace / "_vectors.hnsw").write_bytes(b"hnsw") + (workspace / "_binary_vectors.mmap").write_bytes(b"mmap") + + status = get_embeddings_status(workspace) + assert status["success"] is True + + result = status["result"] + assert result["coverage_percent"] == 0.0 + assert result["files_with_embeddings"] == 0 + assert result["root"]["has_embeddings"] is False + assert result["centralized"]["chunk_metadata_rows"] == 0 + assert result["centralized"]["binary_vector_rows"] == 0 + assert result["centralized"]["usable"] is False diff --git a/codex-lens/tests/test_hybrid_search_e2e.py b/codex-lens/tests/test_hybrid_search_e2e.py index 66f513ea..131aad14 100644 --- 
a/codex-lens/tests/test_hybrid_search_e2e.py +++ b/codex-lens/tests/test_hybrid_search_e2e.py @@ -833,6 +833,36 @@ class TestHybridSearchAdaptiveWeights: assert captured["weights"]["vector"] > 0.6 + def test_default_engine_weights_keep_lsp_graph_backend_available(self): + """Legacy public defaults should not discard LSP graph fusion weights internally.""" + from unittest.mock import patch + + engine = HybridSearchEngine() + + results_map = { + "exact": [SearchResult(path="a.py", score=10.0, excerpt="a")], + "fuzzy": [SearchResult(path="b.py", score=9.0, excerpt="b")], + "vector": [SearchResult(path="c.py", score=0.9, excerpt="c")], + "lsp_graph": [SearchResult(path="d.py", score=0.8, excerpt="d")], + } + + captured = {} + from codexlens.search import ranking as ranking_module + + def capture_rrf(map_in, weights_in, k=60): + captured["weights"] = dict(weights_in) + return ranking_module.reciprocal_rank_fusion(map_in, weights_in, k=k) + + with patch.object(HybridSearchEngine, "_search_parallel", return_value=results_map), patch( + "codexlens.search.hybrid_search.reciprocal_rank_fusion", + side_effect=capture_rrf, + ): + engine.search(Path("dummy.db"), "auth flow", enable_vector=True, enable_lsp_graph=True) + + assert engine.weights == HybridSearchEngine.DEFAULT_WEIGHTS + assert "lsp_graph" in captured["weights"] + assert captured["weights"]["lsp_graph"] > 0.0 + def test_reranking_enabled(self, tmp_path): """Reranking runs only when explicitly enabled via config.""" from unittest.mock import patch diff --git a/codex-lens/tests/test_hybrid_search_reranker_backend.py b/codex-lens/tests/test_hybrid_search_reranker_backend.py index 85a8564f..1e832640 100644 --- a/codex-lens/tests/test_hybrid_search_reranker_backend.py +++ b/codex-lens/tests/test_hybrid_search_reranker_backend.py @@ -93,7 +93,8 @@ def test_get_cross_encoder_reranker_uses_factory_backend_onnx_gpu_flag( enable_reranking=True, enable_cross_encoder_rerank=True, reranker_backend="onnx", - 
embedding_use_gpu=False, + embedding_use_gpu=True, + reranker_use_gpu=False, ) engine = HybridSearchEngine(config=config) @@ -109,6 +110,58 @@ def test_get_cross_encoder_reranker_uses_factory_backend_onnx_gpu_flag( assert get_args["kwargs"]["use_gpu"] is False +def test_get_cross_encoder_reranker_uses_cpu_device_for_legacy_when_reranker_gpu_disabled( + monkeypatch: pytest.MonkeyPatch, + tmp_path, +) -> None: + calls: dict[str, object] = {} + + def fake_check_reranker_available(backend: str): + calls["check_backend"] = backend + return True, None + + sentinel = object() + + def fake_get_reranker(*, backend: str, model_name=None, device=None, **kwargs): + calls["get_args"] = { + "backend": backend, + "model_name": model_name, + "device": device, + "kwargs": kwargs, + } + return sentinel + + monkeypatch.setattr( + "codexlens.semantic.reranker.check_reranker_available", + fake_check_reranker_available, + ) + monkeypatch.setattr( + "codexlens.semantic.reranker.get_reranker", + fake_get_reranker, + ) + + config = Config( + data_dir=tmp_path / "legacy-cpu", + enable_reranking=True, + enable_cross_encoder_rerank=True, + reranker_backend="legacy", + reranker_model="dummy-model", + embedding_use_gpu=True, + reranker_use_gpu=False, + ) + engine = HybridSearchEngine(config=config) + + reranker = engine._get_cross_encoder_reranker() + assert reranker is sentinel + assert calls["check_backend"] == "legacy" + + get_args = calls["get_args"] + assert isinstance(get_args, dict) + assert get_args["backend"] == "legacy" + assert get_args["model_name"] == "dummy-model" + assert get_args["device"] == "cpu" + + def test_get_cross_encoder_reranker_returns_none_when_backend_unavailable( monkeypatch: pytest.MonkeyPatch, tmp_path, diff --git a/codex-lens/tests/test_hybrid_search_unit.py b/codex-lens/tests/test_hybrid_search_unit.py index cc27a6f8..5c485291 100644 --- a/codex-lens/tests/test_hybrid_search_unit.py +++ b/codex-lens/tests/test_hybrid_search_unit.py @@ -150,6 +150,30 @@ class 
TestHybridSearchBackends: assert "exact" in backends assert "vector" in backends + def test_search_lexical_priority_query_skips_vector_backend(self, temp_paths, mock_config): + """Config/env/factory queries should stay lexical-first in hybrid mode.""" + engine = HybridSearchEngine(config=mock_config) + index_path = temp_paths / "_index.db" + + with patch.object(engine, "_search_parallel") as mock_parallel: + mock_parallel.return_value = { + "exact": [SearchResult(path="config.py", score=10.0, excerpt="exact")], + "fuzzy": [SearchResult(path="env_config.py", score=8.0, excerpt="fuzzy")], + } + + results = engine.search( + index_path, + "embedding backend fastembed local litellm api config", + enable_fuzzy=True, + enable_vector=True, + ) + + assert len(results) >= 1 + backends = mock_parallel.call_args[0][2] + assert "exact" in backends + assert "fuzzy" in backends + assert "vector" not in backends + def test_search_pure_vector(self, temp_paths, mock_config): """Pure vector mode should only use vector backend.""" engine = HybridSearchEngine(config=mock_config) @@ -257,6 +281,39 @@ class TestHybridSearchFusion: mock_rerank.assert_called_once() + def test_search_lexical_priority_query_skips_expensive_reranking(self, temp_paths, mock_config): + """Lexical-priority queries should bypass embedder and cross-encoder reranking.""" + mock_config.enable_reranking = True + mock_config.enable_cross_encoder_rerank = True + mock_config.reranking_top_k = 50 + mock_config.reranker_top_k = 20 + engine = HybridSearchEngine(config=mock_config) + index_path = temp_paths / "_index.db" + + with patch.object(engine, "_search_parallel") as mock_parallel: + mock_parallel.return_value = { + "exact": [SearchResult(path="config.py", score=10.0, excerpt="code")], + "fuzzy": [SearchResult(path="env_config.py", score=9.0, excerpt="env vars")], + } + + with patch("codexlens.search.hybrid_search.rerank_results") as mock_rerank, patch( + "codexlens.search.hybrid_search.cross_encoder_rerank" + ) as 
mock_cross_encoder, patch.object( + engine, + "_get_cross_encoder_reranker", + ) as mock_get_reranker: + results = engine.search( + index_path, + "get_reranker factory onnx backend selection", + enable_fuzzy=True, + enable_vector=True, + ) + + assert len(results) >= 1 + mock_rerank.assert_not_called() + mock_cross_encoder.assert_not_called() + mock_get_reranker.assert_not_called() + def test_search_category_filtering(self, temp_paths, mock_config): """Category filtering should separate code/doc results by intent.""" mock_config.enable_category_filter = True @@ -316,6 +373,217 @@ class TestSearchParallel: mock_fuzzy.assert_called_once() +class TestCentralizedMetadataFetch: + """Tests for centralized metadata retrieval helpers.""" + + def test_fetch_from_vector_meta_store_clamps_negative_scores(self, temp_paths, mock_config, monkeypatch): + engine = HybridSearchEngine(config=mock_config) + + class FakeMetaStore: + def __init__(self, _path): + pass + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc, tb): + return False + + def get_chunks_by_ids(self, _chunk_ids, category=None): + assert category is None + return [ + { + "chunk_id": 7, + "file_path": "src/app.py", + "content": "def app(): pass", + "metadata": {}, + "start_line": 1, + "end_line": 1, + } + ] + + import codexlens.storage.vector_meta_store as vector_meta_store + + monkeypatch.setattr(vector_meta_store, "VectorMetadataStore", FakeMetaStore) + + results = engine._fetch_from_vector_meta_store( + temp_paths / "_vectors_meta.db", + [7], + {7: -0.01}, + ) + + assert len(results) == 1 + assert results[0].path == "src/app.py" + assert results[0].score == 0.0 + + +class TestCentralizedVectorCaching: + """Tests for centralized vector search runtime caches.""" + + def test_search_vector_centralized_reuses_cached_resources( + self, + temp_paths, + mock_config, + ): + engine = HybridSearchEngine(config=mock_config) + hnsw_path = temp_paths / "_vectors.hnsw" + hnsw_path.write_bytes(b"hnsw") + + 
vector_store_opened: List[Path] = [] + + class FakeVectorStore: + def __init__(self, path): + vector_store_opened.append(Path(path)) + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc, tb): + return False + + def get_model_config(self): + return { + "backend": "fastembed", + "model_name": "BAAI/bge-small-en-v1.5", + "model_profile": "fast", + "embedding_dim": 384, + } + + class FakeEmbedder: + embedding_dim = 384 + + def __init__(self): + self.embed_calls: List[str] = [] + + def embed_single(self, query): + self.embed_calls.append(query) + return [0.1, 0.2, 0.3] + + class FakeAnnIndex: + def __init__(self): + self.load_calls = 0 + self.search_calls = 0 + + def load(self): + self.load_calls += 1 + return True + + def count(self): + return 3 + + def search(self, _query_vec, top_k): + self.search_calls += 1 + assert top_k == 10 + return [7], [0.2] + + fake_embedder = FakeEmbedder() + fake_ann_index = FakeAnnIndex() + + with patch("codexlens.semantic.vector_store.VectorStore", FakeVectorStore), patch( + "codexlens.semantic.factory.get_embedder", + return_value=fake_embedder, + ) as mock_get_embedder, patch( + "codexlens.semantic.ann_index.ANNIndex.create_central", + return_value=fake_ann_index, + ) as mock_create_central, patch.object( + engine, + "_fetch_chunks_by_ids_centralized", + return_value=[SearchResult(path="src/app.py", score=0.8, excerpt="hit")], + ) as mock_fetch: + first = engine._search_vector_centralized( + temp_paths / "child-a" / "_index.db", + hnsw_path, + "smart search routing", + limit=5, + ) + second = engine._search_vector_centralized( + temp_paths / "child-b" / "_index.db", + hnsw_path, + "smart search routing", + limit=5, + ) + + assert [result.path for result in first] == ["src/app.py"] + assert [result.path for result in second] == ["src/app.py"] + assert vector_store_opened == [temp_paths / "_index.db"] + assert mock_get_embedder.call_count == 1 + assert mock_create_central.call_count == 1 + assert 
fake_ann_index.load_calls == 1 + assert fake_embedder.embed_calls == ["smart search routing"] + assert fake_ann_index.search_calls == 2 + assert mock_fetch.call_count == 2 + + def test_search_vector_centralized_respects_embedding_use_gpu( + self, + temp_paths, + mock_config, + ): + engine = HybridSearchEngine(config=mock_config) + hnsw_path = temp_paths / "_vectors.hnsw" + hnsw_path.write_bytes(b"hnsw") + + class FakeVectorStore: + def __init__(self, _path): + pass + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc, tb): + return False + + def get_model_config(self): + return { + "backend": "fastembed", + "model_name": "BAAI/bge-small-en-v1.5", + "model_profile": "code", + "embedding_dim": 384, + } + + class FakeEmbedder: + embedding_dim = 384 + + def embed_single(self, _query): + return [0.1, 0.2] + + class FakeAnnIndex: + def load(self): + return True + + def count(self): + return 1 + + def search(self, _query_vec, top_k): + assert top_k == 6 + return [9], [0.1] + + with patch("codexlens.semantic.vector_store.VectorStore", FakeVectorStore), patch( + "codexlens.semantic.factory.get_embedder", + return_value=FakeEmbedder(), + ) as mock_get_embedder, patch( + "codexlens.semantic.ann_index.ANNIndex.create_central", + return_value=FakeAnnIndex(), + ), patch.object( + engine, + "_fetch_chunks_by_ids_centralized", + return_value=[SearchResult(path="src/app.py", score=0.9, excerpt="hit")], + ): + results = engine._search_vector_centralized( + temp_paths / "_index.db", + hnsw_path, + "semantic query", + limit=3, + ) + + assert len(results) == 1 + assert mock_get_embedder.call_count == 1 + assert mock_get_embedder.call_args.kwargs == { + "backend": "fastembed", + "profile": "code", + "use_gpu": False, + } + + # ============================================================================= # Tests: _search_lsp_graph # ============================================================================= diff --git 
a/codex-lens/tests/test_index_status_cli_contract.py b/codex-lens/tests/test_index_status_cli_contract.py new file mode 100644 index 00000000..cac0549c --- /dev/null +++ b/codex-lens/tests/test_index_status_cli_contract.py @@ -0,0 +1,674 @@ +import json + +from typer.testing import CliRunner + +import codexlens.cli.commands as commands +from codexlens.cli.commands import app +import codexlens.cli.embedding_manager as embedding_manager +from codexlens.config import Config +from codexlens.entities import SearchResult +from codexlens.search.chain_search import ChainSearchResult, SearchStats + + +def test_index_status_json_preserves_legacy_embeddings_contract( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + (workspace / "_index.db").touch() + + legacy_summary = { + "total_indexes": 3, + "indexes_with_embeddings": 1, + "total_chunks": 42, + "indexes": [ + { + "project": "child", + "path": str(workspace / "child" / "_index.db"), + "has_embeddings": True, + "total_chunks": 42, + "total_files": 1, + "coverage_percent": 100.0, + } + ], + } + root_status = { + "total_indexes": 3, + "total_files": 2, + "files_with_embeddings": 0, + "files_without_embeddings": 2, + "total_chunks": 0, + "coverage_percent": 0.0, + "indexes_with_embeddings": 1, + "indexes_without_embeddings": 2, + "model_info": None, + "root": { + "index_path": str(workspace / "_index.db"), + "exists": False, + "total_files": 2, + "files_with_embeddings": 0, + "files_without_embeddings": 2, + "total_chunks": 0, + "coverage_percent": 0.0, + "has_embeddings": False, + "storage_mode": "none", + }, + "subtree": { + "total_indexes": 3, + "total_files": 3, + "files_with_embeddings": 1, + "files_without_embeddings": 2, + "total_chunks": 42, + "coverage_percent": 33.3, + "indexes_with_embeddings": 1, + "indexes_without_embeddings": 2, + }, + "centralized": { + "dense_index_exists": False, + "binary_index_exists": False, + "dense_ready": False, + "binary_ready": False, + 
"usable": False, + "chunk_metadata_rows": 0, + "binary_vector_rows": 0, + "files_with_embeddings": 0, + }, + } + + monkeypatch.setattr( + embedding_manager, + "get_embeddings_status", + lambda _index_root: {"success": True, "result": root_status}, + ) + monkeypatch.setattr( + embedding_manager, + "get_embedding_stats_summary", + lambda _index_root: {"success": True, "result": legacy_summary}, + ) + monkeypatch.setattr( + commands, + "RegistryStore", + type( + "FakeRegistryStore", + (), + { + "initialize": lambda self: None, + "close": lambda self: None, + }, + ), + ) + monkeypatch.setattr( + commands, + "PathMapper", + type( + "FakePathMapper", + (), + { + "source_to_index_db": lambda self, _target_path: workspace / "_index.db", + }, + ), + ) + + runner = CliRunner() + result = runner.invoke(app, ["index", "status", str(workspace), "--json"]) + + assert result.exit_code == 0, result.output + payload = json.loads(result.stdout) + body = payload["result"] + assert body["embeddings"] == legacy_summary + assert body["embeddings_error"] is None + assert body["embeddings_status"] == root_status + assert body["embeddings_status_error"] is None + assert body["embeddings_summary"] == legacy_summary + + +def test_search_json_preserves_dense_rerank_method_label( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + search_result = ChainSearchResult( + query="greet function", + results=[ + SearchResult( + path=str(workspace / "src" / "app.py"), + score=0.97, + excerpt="def greet(name):", + content="def greet(name):\n return f'hello {name}'\n", + ) + ], + symbols=[], + stats=SearchStats(dirs_searched=2, files_matched=1, time_ms=12.5), + ) + captured: dict[str, object] = {} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type( + "FakeRegistryStore", + (), + { + "initialize": lambda self: None, + "close": lambda self: 
None, + }, + ), + ) + monkeypatch.setattr( + commands, + "PathMapper", + type( + "FakePathMapper", + (), + {}, + ), + ) + + class FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["registry"] = registry + captured["mapper"] = mapper + captured["config"] = config + + def search(self, *_args, **_kwargs): + raise AssertionError("dense_rerank should dispatch via cascade_search") + + def cascade_search(self, query, source_path, k=10, options=None, strategy=None): + captured["query"] = query + captured["source_path"] = source_path + captured["limit"] = k + captured["options"] = options + captured["strategy"] = strategy + return search_result + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "greet function", "--path", str(workspace), "--method", "dense_rerank", "--json"], + ) + + assert result.exit_code == 0, result.output + payload = json.loads(result.stdout) + body = payload["result"] + assert body["method"] == "dense_rerank" + assert body["count"] == 1 + assert body["results"][0]["path"] == str(workspace / "src" / "app.py") + assert captured["strategy"] == "dense_rerank" + assert captured["limit"] == 20 + + +def test_search_json_auto_routes_keyword_queries_to_fts( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + search_result = ChainSearchResult( + query="windowsHide", + results=[ + SearchResult( + path=str(workspace / "src" / "spawn.ts"), + score=0.91, + excerpt="windowsHide: true", + content="spawn('node', [], { windowsHide: true })", + ) + ], + symbols=[], + stats=SearchStats(dirs_searched=2, files_matched=1, time_ms=8.0), + ) + captured: dict[str, object] = {} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type("FakeRegistryStore", (), {"initialize": lambda self: 
None, "close": lambda self: None}), + ) + monkeypatch.setattr(commands, "PathMapper", type("FakePathMapper", (), {})) + + class FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["config"] = config + + def search(self, query, source_path, options=None): + captured["query"] = query + captured["source_path"] = source_path + captured["options"] = options + return search_result + + def cascade_search(self, *_args, **_kwargs): + raise AssertionError("auto keyword queries should not dispatch to cascade_search") + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "windowsHide", "--path", str(workspace), "--json"], + ) + + assert result.exit_code == 0, result.output + body = json.loads(result.stdout)["result"] + assert body["method"] == "fts" + assert captured["options"].enable_vector is False + assert captured["options"].hybrid_mode is False + + +def test_search_json_auto_routes_mixed_queries_to_hybrid( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + search_result = ChainSearchResult( + query="how does my_function work", + results=[ + SearchResult( + path=str(workspace / "src" / "app.py"), + score=0.81, + excerpt="def my_function():", + content="def my_function():\n return 1\n", + ) + ], + symbols=[], + stats=SearchStats(dirs_searched=2, files_matched=1, time_ms=10.0), + ) + captured: dict[str, object] = {} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type("FakeRegistryStore", (), {"initialize": lambda self: None, "close": lambda self: None}), + ) + monkeypatch.setattr(commands, "PathMapper", type("FakePathMapper", (), {})) + + class FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["config"] = config + + def search(self, query, 
source_path, options=None): + captured["query"] = query + captured["source_path"] = source_path + captured["options"] = options + return search_result + + def cascade_search(self, *_args, **_kwargs): + raise AssertionError("mixed auto queries should not dispatch to cascade_search") + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "how does my_function work", "--path", str(workspace), "--json"], + ) + + assert result.exit_code == 0, result.output + body = json.loads(result.stdout)["result"] + assert body["method"] == "hybrid" + assert captured["options"].enable_vector is True + assert captured["options"].hybrid_mode is True + assert captured["options"].enable_cascade is False + + +def test_search_json_auto_routes_generated_artifact_queries_to_fts( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + search_result = ChainSearchResult( + query="dist bundle output", + results=[ + SearchResult( + path=str(workspace / "dist" / "bundle.js"), + score=0.77, + excerpt="bundle output", + content="console.log('bundle')", + ) + ], + symbols=[], + stats=SearchStats(dirs_searched=2, files_matched=1, time_ms=9.0), + ) + captured: dict[str, object] = {} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type("FakeRegistryStore", (), {"initialize": lambda self: None, "close": lambda self: None}), + ) + monkeypatch.setattr(commands, "PathMapper", type("FakePathMapper", (), {})) + + class FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["config"] = config + + def search(self, query, source_path, options=None): + captured["query"] = query + captured["source_path"] = source_path + captured["options"] = options + return search_result + + def cascade_search(self, *_args, **_kwargs): + raise 
AssertionError("generated artifact auto queries should not dispatch to cascade_search") + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "dist bundle output", "--path", str(workspace), "--json"], + ) + + assert result.exit_code == 0, result.output + body = json.loads(result.stdout)["result"] + assert body["method"] == "fts" + assert captured["options"].enable_vector is False + assert captured["options"].hybrid_mode is False + + +def test_auto_select_search_method_prefers_fts_for_lexical_config_queries() -> None: + assert commands._auto_select_search_method("embedding backend fastembed local litellm api config") == "fts" + assert commands._auto_select_search_method("get_reranker factory onnx backend selection") == "fts" + assert commands._auto_select_search_method("how to authenticate users safely?") == "dense_rerank" + + +def test_search_json_fts_zero_results_uses_filesystem_fallback( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + indexed_result = ChainSearchResult( + query="find_descendant_project_roots", + results=[], + symbols=[], + stats=SearchStats(dirs_searched=3, files_matched=0, time_ms=7.5), + ) + fallback_result = SearchResult( + path=str(workspace / "src" / "registry.py"), + score=1.0, + excerpt="def find_descendant_project_roots(...):", + content=None, + metadata={ + "filesystem_fallback": True, + "backend": "ripgrep-fallback", + "stale_index_suspected": True, + }, + start_line=12, + end_line=12, + ) + captured: dict[str, object] = {"fallback_calls": 0} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type("FakeRegistryStore", (), {"initialize": lambda self: None, "close": lambda self: None}), + ) + monkeypatch.setattr(commands, "PathMapper", type("FakePathMapper", (), {})) + + class 
FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["config"] = config + + def search(self, query, source_path, options=None): + captured["query"] = query + captured["source_path"] = source_path + captured["options"] = options + return indexed_result + + def cascade_search(self, *_args, **_kwargs): + raise AssertionError("fts zero-result queries should not dispatch to cascade_search") + + def fake_fallback(query, source_path, *, limit, config, code_only=False, exclude_extensions=None): + captured["fallback_calls"] = int(captured["fallback_calls"]) + 1 + captured["fallback_query"] = query + captured["fallback_path"] = source_path + captured["fallback_limit"] = limit + captured["fallback_code_only"] = code_only + captured["fallback_exclude_extensions"] = exclude_extensions + return { + "results": [fallback_result], + "time_ms": 2.5, + "fallback": { + "backend": "ripgrep-fallback", + "stale_index_suspected": True, + "reason": "Indexed FTS search returned no results; filesystem fallback used.", + }, + } + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + monkeypatch.setattr(commands, "_filesystem_fallback_search", fake_fallback) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "find_descendant_project_roots", "--method", "fts", "--path", str(workspace), "--json"], + ) + + assert result.exit_code == 0, result.output + body = json.loads(result.stdout)["result"] + assert body["method"] == "fts" + assert body["count"] == 1 + assert body["results"][0]["path"] == str(workspace / "src" / "registry.py") + assert body["results"][0]["excerpt"] == "def find_descendant_project_roots(...):" + assert body["stats"]["files_matched"] == 1 + assert body["stats"]["time_ms"] == 10.0 + assert body["fallback"] == { + "backend": "ripgrep-fallback", + "stale_index_suspected": True, + "reason": "Indexed FTS search returned no results; filesystem fallback used.", + } + assert captured["fallback_calls"] == 1 + 
assert captured["fallback_query"] == "find_descendant_project_roots" + assert captured["fallback_path"] == workspace + assert captured["fallback_limit"] == 20 + assert captured["options"].enable_vector is False + assert captured["options"].hybrid_mode is False + + +def test_search_json_hybrid_zero_results_does_not_use_filesystem_fallback( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + indexed_result = ChainSearchResult( + query="how does my_function work", + results=[], + symbols=[], + stats=SearchStats(dirs_searched=4, files_matched=0, time_ms=11.0), + ) + captured: dict[str, object] = {"fallback_calls": 0} + + monkeypatch.setattr(commands.Config, "load", staticmethod(lambda: Config(data_dir=tmp_path / "data"))) + monkeypatch.setattr( + commands, + "RegistryStore", + type("FakeRegistryStore", (), {"initialize": lambda self: None, "close": lambda self: None}), + ) + monkeypatch.setattr(commands, "PathMapper", type("FakePathMapper", (), {})) + + class FakeChainSearchEngine: + def __init__(self, registry, mapper, config=None): + captured["config"] = config + + def search(self, query, source_path, options=None): + captured["query"] = query + captured["source_path"] = source_path + captured["options"] = options + return indexed_result + + def cascade_search(self, *_args, **_kwargs): + raise AssertionError("hybrid queries should not dispatch to cascade_search") + + def fake_fallback(*_args, **_kwargs): + captured["fallback_calls"] = int(captured["fallback_calls"]) + 1 + return None + + monkeypatch.setattr(commands, "ChainSearchEngine", FakeChainSearchEngine) + monkeypatch.setattr(commands, "_filesystem_fallback_search", fake_fallback) + + runner = CliRunner() + result = runner.invoke( + app, + ["search", "how does my_function work", "--path", str(workspace), "--json"], + ) + + assert result.exit_code == 0, result.output + body = json.loads(result.stdout)["result"] + assert body["method"] == "hybrid" + assert 
body["count"] == 0 + assert "fallback" not in body + assert body["stats"]["files_matched"] == 0 + assert body["stats"]["time_ms"] == 11.0 + assert captured["fallback_calls"] == 0 + assert captured["options"].enable_vector is True + assert captured["options"].hybrid_mode is True + + +def test_filesystem_fallback_search_prefers_source_definitions_for_keyword_queries( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + workspace.mkdir() + + source_path = workspace / "src" / "registry.py" + test_path = workspace / "tests" / "test_registry.py" + ref_path = workspace / "src" / "chain_search.py" + + match_lines = [ + { + "type": "match", + "data": { + "path": {"text": str(test_path)}, + "lines": {"text": "def test_find_descendant_project_roots_returns_nested_project_roots():\n"}, + "line_number": 12, + }, + }, + { + "type": "match", + "data": { + "path": {"text": str(source_path)}, + "lines": {"text": "def find_descendant_project_roots(self, source_root: Path) -> List[DirMapping]:\n"}, + "line_number": 48, + }, + }, + { + "type": "match", + "data": { + "path": {"text": str(ref_path)}, + "lines": {"text": "descendant_roots = self.registry.find_descendant_project_roots(source_root)\n"}, + "line_number": 91, + }, + }, + ] + + monkeypatch.setattr(commands.shutil, "which", lambda _name: "rg") + monkeypatch.setattr( + commands.subprocess, + "run", + lambda *_args, **_kwargs: type( + "FakeCompletedProcess", + (), + { + "returncode": 0, + "stdout": "\n".join(json.dumps(line) for line in match_lines), + "stderr": "", + }, + )(), + ) + + fallback = commands._filesystem_fallback_search( + "find_descendant_project_roots", + workspace, + limit=5, + config=Config(data_dir=tmp_path / "data"), + ) + + assert fallback is not None + assert fallback["fallback"]["backend"] == "ripgrep-fallback" + assert fallback["results"][0].path == str(source_path) + assert fallback["results"][1].path == str(ref_path) + assert fallback["results"][2].path == str(test_path) + assert 
fallback["results"][0].score > fallback["results"][1].score > fallback["results"][2].score + + +def test_clean_json_reports_partial_success_when_locked_files_remain( + monkeypatch, + tmp_path, +) -> None: + workspace = tmp_path / "workspace" + project_index = tmp_path / "indexes" / "workspace" + project_index.mkdir(parents=True) + (project_index / "_index.db").write_text("db", encoding="utf-8") + locked_path = project_index / "nested" / "_index.db" + locked_path.parent.mkdir(parents=True) + locked_path.write_text("locked", encoding="utf-8") + + captured: dict[str, object] = {} + + class FakePathMapper: + def __init__(self): + self.index_root = tmp_path / "indexes" + + def source_to_index_dir(self, source_path): + captured["mapped_source"] = source_path + return project_index + + class FakeRegistryStore: + def initialize(self): + captured["registry_initialized"] = True + + def unregister_project(self, source_path): + captured["unregistered_project"] = source_path + return True + + def close(self): + captured["registry_closed"] = True + + def fake_remove_tree(target): + captured["removed_target"] = target + return { + "removed": False, + "partial": True, + "locked_paths": [str(locked_path)], + "remaining_path": str(project_index), + "errors": [], + } + + monkeypatch.setattr(commands, "PathMapper", FakePathMapper) + monkeypatch.setattr(commands, "RegistryStore", FakeRegistryStore) + monkeypatch.setattr(commands, "_remove_tree_best_effort", fake_remove_tree) + + runner = CliRunner() + result = runner.invoke(app, ["clean", str(workspace), "--json"]) + + assert result.exit_code == 0, result.output + payload = json.loads(result.stdout) + body = payload["result"] + assert payload["success"] is True + assert body["cleaned"] == str(workspace.resolve()) + assert body["index_path"] == str(project_index) + assert body["partial"] is True + assert body["locked_paths"] == [str(locked_path)] + assert body["remaining_path"] == str(project_index) + assert 
captured["registry_initialized"] is True + assert captured["registry_closed"] is True + assert captured["unregistered_project"] == workspace.resolve() + assert captured["removed_target"] == project_index diff --git a/codex-lens/tests/test_index_tree_ignore_dirs.py b/codex-lens/tests/test_index_tree_ignore_dirs.py index c1a1fdbd..f9c51773 100644 --- a/codex-lens/tests/test_index_tree_ignore_dirs.py +++ b/codex-lens/tests/test_index_tree_ignore_dirs.py @@ -5,7 +5,10 @@ from pathlib import Path from unittest.mock import MagicMock from codexlens.config import Config -from codexlens.storage.index_tree import IndexTreeBuilder +from codexlens.storage.dir_index import DirIndexStore +from codexlens.storage.index_tree import DirBuildResult, IndexTreeBuilder +from codexlens.storage.path_mapper import PathMapper +from codexlens.storage.registry import RegistryStore def _relative_dirs(source_root: Path, dirs_by_depth: dict[int, list[Path]]) -> set[str]: @@ -145,3 +148,148 @@ def test_builder_loads_saved_ignore_and_extension_filters_by_default(tmp_path: P assert [path.name for path in source_files] == ["app.ts"] assert "frontend/dist" not in discovered_dirs + + +def test_prune_stale_project_dirs_removes_ignored_artifact_mappings(tmp_path: Path) -> None: + workspace = tmp_path / "workspace" + src_dir = workspace / "src" + dist_dir = workspace / "dist" + src_dir.mkdir(parents=True) + dist_dir.mkdir(parents=True) + (src_dir / "app.py").write_text("print('ok')\n", encoding="utf-8") + (dist_dir / "bundle.py").write_text("print('artifact')\n", encoding="utf-8") + + mapper = PathMapper(index_root=tmp_path / "indexes") + registry = RegistryStore(db_path=tmp_path / "registry.db") + registry.initialize() + project = registry.register_project(workspace, mapper.source_to_index_dir(workspace)) + registry.register_dir(project.id, workspace, mapper.source_to_index_db(workspace), depth=0) + registry.register_dir(project.id, src_dir, mapper.source_to_index_db(src_dir), depth=1) + 
registry.register_dir(project.id, dist_dir, mapper.source_to_index_db(dist_dir), depth=1) + + builder = IndexTreeBuilder( + registry=registry, + mapper=mapper, + config=Config(data_dir=tmp_path / "data"), + incremental=False, + ) + + dirs_by_depth = builder._collect_dirs_by_depth(workspace) + pruned = builder._prune_stale_project_dirs( + project_id=project.id, + source_root=workspace, + dirs_by_depth=dirs_by_depth, + ) + + remaining = {mapping.source_path.resolve() for mapping in registry.get_project_dirs(project.id)} + registry.close() + + assert dist_dir.resolve() in pruned + assert workspace.resolve() in remaining + assert src_dir.resolve() in remaining + assert dist_dir.resolve() not in remaining + + +def test_force_full_build_prunes_stale_ignored_mappings(tmp_path: Path) -> None: + workspace = tmp_path / "workspace" + src_dir = workspace / "src" + dist_dir = workspace / "dist" + src_dir.mkdir(parents=True) + dist_dir.mkdir(parents=True) + (src_dir / "app.py").write_text("print('ok')\n", encoding="utf-8") + (dist_dir / "bundle.py").write_text("print('artifact')\n", encoding="utf-8") + + mapper = PathMapper(index_root=tmp_path / "indexes") + registry = RegistryStore(db_path=tmp_path / "registry.db") + registry.initialize() + project = registry.register_project(workspace, mapper.source_to_index_dir(workspace)) + registry.register_dir(project.id, workspace, mapper.source_to_index_db(workspace), depth=0) + registry.register_dir(project.id, dist_dir, mapper.source_to_index_db(dist_dir), depth=1) + + builder = IndexTreeBuilder( + registry=registry, + mapper=mapper, + config=Config( + data_dir=tmp_path / "data", + global_symbol_index_enabled=False, + ), + incremental=False, + ) + + def fake_build_level_parallel( + dirs: list[Path], + languages, + workers, + *, + source_root: Path, + project_id: int, + global_index_db_path: Path, + ) -> list[DirBuildResult]: + return [ + DirBuildResult( + source_path=dir_path, + index_path=mapper.source_to_index_db(dir_path), + 
files_count=1 if dir_path == src_dir else 0, + symbols_count=0, + subdirs=[], + ) + for dir_path in dirs + ] + + builder._build_level_parallel = fake_build_level_parallel # type: ignore[method-assign] + builder._link_children_to_parent = MagicMock() + + build_result = builder.build(workspace, force_full=True, workers=1) + + remaining = {mapping.source_path.resolve() for mapping in registry.get_project_dirs(project.id)} + registry.close() + + assert build_result.total_dirs == 2 + assert workspace.resolve() in remaining + assert src_dir.resolve() in remaining + assert dist_dir.resolve() not in remaining + + +def test_force_full_build_rewrites_directory_db_and_drops_stale_ignored_subdirs( + tmp_path: Path, +) -> None: + project_root = tmp_path / "project" + src_dir = project_root / "src" + build_dir = project_root / "build" + src_dir.mkdir(parents=True) + build_dir.mkdir(parents=True) + (src_dir / "app.py").write_text("print('ok')\n", encoding="utf-8") + (build_dir / "generated.py").write_text("print('artifact')\n", encoding="utf-8") + + mapper = PathMapper(index_root=tmp_path / "indexes") + registry = RegistryStore(db_path=tmp_path / "registry.db") + registry.initialize() + config = Config( + data_dir=tmp_path / "data", + global_symbol_index_enabled=False, + ) + + root_index_db = mapper.source_to_index_db(project_root) + with DirIndexStore(root_index_db, config=config) as store: + store.register_subdir( + name="build", + index_path=mapper.source_to_index_db(build_dir), + files_count=1, + ) + + builder = IndexTreeBuilder( + registry=registry, + mapper=mapper, + config=config, + incremental=False, + ) + + build_result = builder.build(project_root, force_full=True, workers=1) + + with DirIndexStore(root_index_db, config=config) as store: + subdir_names = [link.name for link in store.get_subdirs()] + + registry.close() + + assert build_result.total_dirs == 2 + assert subdir_names == ["src"] diff --git a/codex-lens/tests/test_ranking.py b/codex-lens/tests/test_ranking.py 
index a5e8b36a..a082d22e 100644 --- a/codex-lens/tests/test_ranking.py +++ b/codex-lens/tests/test_ranking.py @@ -24,13 +24,24 @@ from codexlens.entities import SearchResult from codexlens.search.ranking import ( DEFAULT_WEIGHTS, QueryIntent, + apply_path_penalties, + extract_explicit_path_hints, + cross_encoder_rerank, adjust_weights_by_intent, apply_symbol_boost, detect_query_intent, filter_results_by_category, get_rrf_weights, group_similar_results, + is_auxiliary_reference_path, + is_generated_artifact_path, + is_test_file, normalize_weights, + query_prefers_lexical_search, + query_targets_auxiliary_files, + query_targets_generated_files, + query_targets_test_files, + rebalance_noisy_results, reciprocal_rank_fusion, simple_weighted_fusion, ) @@ -73,6 +84,7 @@ class TestDetectQueryIntent: def test_detect_keyword_intent(self): """CamelCase/underscore queries should be detected as KEYWORD.""" assert detect_query_intent("MyClassName") == QueryIntent.KEYWORD + assert detect_query_intent("windowsHide") == QueryIntent.KEYWORD assert detect_query_intent("my_function_name") == QueryIntent.KEYWORD assert detect_query_intent("foo::bar") == QueryIntent.KEYWORD @@ -91,6 +103,25 @@ class TestDetectQueryIntent: assert detect_query_intent("") == QueryIntent.MIXED assert detect_query_intent(" ") == QueryIntent.MIXED + def test_query_targets_test_files(self): + """Queries explicitly mentioning tests should skip test penalties.""" + assert query_targets_test_files("how do tests cover auth flow?") + assert query_targets_test_files("spec fixtures for parser") + assert not query_targets_test_files("windowsHide") + + def test_query_targets_generated_files(self): + """Queries explicitly mentioning build artifacts should skip that penalty.""" + assert query_targets_generated_files("inspect dist bundle output") + assert query_targets_generated_files("generated artifacts under build") + assert not query_targets_generated_files("cache invalidation strategy") + + def 
test_query_prefers_lexical_search(self): + """Config/env/factory queries should prefer lexical-first routing.""" + assert query_prefers_lexical_search("embedding backend fastembed local litellm api config") + assert query_prefers_lexical_search("get_reranker factory onnx backend selection") + assert query_prefers_lexical_search("EMBEDDING_BACKEND and RERANKER_BACKEND environment variables") + assert not query_prefers_lexical_search("how does smart search route keyword queries") + # ============================================================================= # Tests: adjust_weights_by_intent @@ -129,6 +160,427 @@ class TestAdjustWeightsByIntent: assert adjusted["exact"] == pytest.approx(0.3, abs=0.01) +class TestPathPenalties: + """Tests for lightweight path-based ranking penalties.""" + + def test_is_test_file(self): + assert is_test_file("/repo/tests/test_auth.py") + assert is_test_file("D:\\repo\\src\\auth.spec.ts") + assert is_test_file("/repo/frontend/src/pages/discoverypage.test.tsx") + assert is_test_file("/repo/frontend/src/pages/discoverypage.spec.jsx") + assert not is_test_file("/repo/src/auth.py") + + def test_is_generated_artifact_path(self): + assert is_generated_artifact_path("/repo/dist/app.js") + assert is_generated_artifact_path("/repo/src/generated/client.ts") + assert is_generated_artifact_path("D:\\repo\\frontend\\.next\\server.js") + assert not is_generated_artifact_path("/repo/src/auth.py") + + def test_is_auxiliary_reference_path(self): + assert is_auxiliary_reference_path("/repo/examples/auth_demo.py") + assert is_auxiliary_reference_path("/repo/benchmarks/search_eval.py") + assert is_auxiliary_reference_path("/repo/tools/debug_search.py") + assert not is_auxiliary_reference_path("/repo/src/auth.py") + + def test_query_targets_auxiliary_files(self): + assert query_targets_auxiliary_files("show smart search examples") + assert query_targets_auxiliary_files("benchmark smart search") + assert not query_targets_auxiliary_files("smart search 
routing") + + def test_apply_path_penalties_demotes_test_files(self): + results = [ + _make_result(path="/repo/tests/test_auth.py", score=10.0), + _make_result(path="/repo/src/auth.py", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "authenticate user", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/auth.py" + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + + def test_apply_path_penalties_more_aggressively_demotes_tests_for_keyword_queries(self): + results = [ + _make_result(path="/repo/tests/test_auth.py", score=5.0), + _make_result(path="/repo/src/auth.py", score=4.0), + ] + + penalized = apply_path_penalties( + results, + "find_descendant_project_roots", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/auth.py" + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + assert penalized[1].metadata["path_penalty_multiplier"] == pytest.approx(0.55) + assert penalized[1].metadata["path_rank_multiplier"] == pytest.approx(0.55) + + def test_apply_path_penalties_more_aggressively_demotes_tests_for_semantic_queries(self): + results = [ + _make_result(path="/repo/tests/test_auth.py", score=5.0), + _make_result(path="/repo/src/auth.py", score=4.1), + ] + + penalized = apply_path_penalties( + results, + "how does auth routing work", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/auth.py" + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + assert penalized[1].metadata["path_penalty_multiplier"] == pytest.approx(0.75) + + def test_apply_path_penalties_boosts_source_definitions_for_identifier_queries(self): + results = [ + _make_result( + path="/repo/tests/test_registry.py", + score=4.2, + excerpt='query="find_descendant_project_roots"', + ), + _make_result( + path="/repo/src/registry.py", + score=3.0, + excerpt="def find_descendant_project_roots(self, source_root: Path) -> list[str]:", + ), + ] + + penalized = 
apply_path_penalties( + results, + "find_descendant_project_roots", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/registry.py" + assert penalized[0].metadata["path_boost_reasons"] == ["source_definition"] + assert penalized[0].metadata["path_boost_multiplier"] == pytest.approx(2.0) + assert penalized[0].metadata["path_rank_multiplier"] == pytest.approx(2.0) + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + + def test_apply_path_penalties_boosts_source_paths_for_semantic_feature_queries(self): + results = [ + _make_result( + path="/repo/tests/smart-search-intent.test.js", + score=0.832, + excerpt="describes how smart search routes keyword queries", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.555, + excerpt="smart search keyword routing logic", + ), + ] + + penalized = apply_path_penalties( + results, + "how does smart search route keyword queries", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[0].metadata["path_boost_reasons"] == ["source_path_topic_overlap"] + assert penalized[0].metadata["path_boost_multiplier"] == pytest.approx(1.35) + assert penalized[0].metadata["path_boost_overlap_tokens"] == ["smart", "search"] + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + + def test_apply_path_penalties_strongly_boosts_keyword_basename_overlap(self): + results = [ + _make_result( + path="/repo/src/tools/core-memory.ts", + score=0.04032417772512223, + excerpt="memory listing helpers", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.009836065573770493, + excerpt="smart search keyword routing logic", + ), + ] + + penalized = apply_path_penalties( + results, + "executeHybridMode dense_rerank semantic smart_search", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[0].metadata["path_boost_reasons"] == 
["source_path_topic_overlap"] + assert penalized[0].metadata["path_boost_multiplier"] == pytest.approx(4.5) + assert penalized[0].metadata["path_boost_overlap_tokens"] == ["smart", "search"] + + def test_extract_explicit_path_hints_ignores_generic_platform_terms(self): + assert extract_explicit_path_hints( + "parse CodexLens JSON output strip ANSI smart_search", + ) == [["smart", "search"]] + + def test_apply_path_penalties_prefers_explicit_feature_hint_over_platform_terms(self): + results = [ + _make_result( + path="/repo/src/tools/codex-lens-lsp.ts", + score=0.045, + excerpt="CodexLens LSP bridge", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.03, + excerpt="parse JSON output and strip ANSI for plain-text fallback", + ), + ] + + penalized = apply_path_penalties( + results, + "parse CodexLens JSON output strip ANSI smart_search", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[0].metadata["path_boost_reasons"] == ["source_path_topic_overlap"] + assert penalized[0].metadata["path_boost_overlap_tokens"] == ["smart", "search"] + + def test_apply_path_penalties_strongly_boosts_lexical_config_modules(self): + results = [ + _make_result( + path="/repo/src/tools/smart-search.ts", + score=22.07, + excerpt="embedding backend local api config routing", + ), + _make_result( + path="/repo/src/codexlens/config.py", + score=4.88, + excerpt="embedding_backend = 'fastembed'", + ), + ] + + penalized = apply_path_penalties( + results, + "embedding backend fastembed local litellm api config", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/codexlens/config.py" + assert penalized[0].metadata["path_boost_reasons"] == ["source_path_topic_overlap"] + assert penalized[0].metadata["path_boost_multiplier"] == pytest.approx(5.0) + assert penalized[0].metadata["path_boost_overlap_tokens"] == ["config"] + + def 
test_apply_path_penalties_more_aggressively_demotes_tests_for_explicit_feature_queries(self): + results = [ + _make_result( + path="/repo/tests/smart-search-intent.test.js", + score=1.0, + excerpt="smart search intent coverage", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.58, + excerpt="plain-text JSON fallback for smart search", + ), + ] + + penalized = apply_path_penalties( + results, + "parse CodexLens JSON output strip ANSI smart_search", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[1].metadata["path_penalty_reasons"] == ["test_file"] + assert penalized[1].metadata["path_penalty_multiplier"] == pytest.approx(0.55) + + def test_apply_path_penalties_demotes_generated_artifacts(self): + results = [ + _make_result(path="/repo/dist/auth.js", score=10.0), + _make_result(path="/repo/src/auth.ts", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "authenticate user", + generated_file_penalty=0.35, + ) + + assert penalized[0].path == "/repo/src/auth.ts" + assert penalized[1].metadata["path_penalty_reasons"] == ["generated_artifact"] + + def test_apply_path_penalties_more_aggressively_demotes_generated_artifacts_for_explicit_feature_queries(self): + results = [ + _make_result( + path="/repo/dist/tools/smart-search.js", + score=1.0, + excerpt="built smart search output", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.45, + excerpt="plain-text JSON fallback for smart search", + ), + ] + + penalized = apply_path_penalties( + results, + "parse CodexLens JSON output strip ANSI smart_search", + generated_file_penalty=0.35, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[1].metadata["path_penalty_reasons"] == ["generated_artifact"] + assert penalized[1].metadata["path_penalty_multiplier"] == pytest.approx(0.4) + + def test_apply_path_penalties_demotes_auxiliary_reference_files(self): + results = [ 
+ _make_result(path="/repo/examples/simple_search_comparison.py", score=10.0), + _make_result(path="/repo/src/search/router.py", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "how does smart search route keyword queries", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/search/router.py" + assert penalized[1].metadata["path_penalty_reasons"] == ["auxiliary_file"] + + def test_apply_path_penalties_more_aggressively_demotes_auxiliary_files_for_explicit_feature_queries(self): + results = [ + _make_result( + path="/repo/benchmarks/smart_search_demo.py", + score=1.0, + excerpt="demo for smart search fallback", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.52, + excerpt="plain-text JSON fallback for smart search", + ), + ] + + penalized = apply_path_penalties( + results, + "parse CodexLens JSON output strip ANSI smart_search", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/src/tools/smart-search.ts" + assert penalized[1].metadata["path_penalty_reasons"] == ["auxiliary_file"] + assert penalized[1].metadata["path_penalty_multiplier"] == pytest.approx(0.5) + + def test_apply_path_penalties_skips_when_query_targets_tests(self): + results = [ + _make_result(path="/repo/tests/test_auth.py", score=10.0), + _make_result(path="/repo/src/auth.py", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "auth tests", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/tests/test_auth.py" + + def test_apply_path_penalties_skips_generated_penalty_when_query_targets_artifacts(self): + results = [ + _make_result(path="/repo/dist/auth.js", score=10.0), + _make_result(path="/repo/src/auth.ts", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "dist auth bundle", + generated_file_penalty=0.35, + ) + + assert penalized[0].path == "/repo/dist/auth.js" + + def test_rebalance_noisy_results_pushes_explicit_feature_query_noise_behind_source_files(self): + 
results = [ + _make_result(path="/repo/src/tools/smart-search.ts", score=0.9), + _make_result(path="/repo/tests/smart-search-intent.test.tsx", score=0.8), + _make_result(path="/repo/src/core/cli-routes.ts", score=0.7), + _make_result(path="/repo/dist/tools/smart-search.js", score=0.6), + _make_result(path="/repo/benchmarks/smart_search_demo.py", score=0.5), + ] + + rebalanced = rebalance_noisy_results( + results, + "parse CodexLens JSON output strip ANSI smart_search", + ) + + assert [item.path for item in rebalanced[:2]] == [ + "/repo/src/tools/smart-search.ts", + "/repo/src/core/cli-routes.ts", + ] + + def test_rebalance_noisy_results_preserves_tests_when_query_targets_them(self): + results = [ + _make_result(path="/repo/tests/smart-search-intent.test.tsx", score=0.9), + _make_result(path="/repo/src/tools/smart-search.ts", score=0.8), + ] + + rebalanced = rebalance_noisy_results(results, "smart search tests") + + assert [item.path for item in rebalanced] == [ + "/repo/tests/smart-search-intent.test.tsx", + "/repo/src/tools/smart-search.ts", + ] + + def test_apply_path_penalties_skips_auxiliary_penalty_when_query_targets_examples(self): + results = [ + _make_result(path="/repo/examples/simple_search_comparison.py", score=10.0), + _make_result(path="/repo/src/search/router.py", score=9.0), + ] + + penalized = apply_path_penalties( + results, + "smart search examples", + test_file_penalty=0.15, + ) + + assert penalized[0].path == "/repo/examples/simple_search_comparison.py" + + +class TestCrossEncoderRerank: + """Tests for cross-encoder reranking edge cases.""" + + def test_cross_encoder_rerank_preserves_strong_source_candidates_for_semantic_feature_queries(self): + class DummyReranker: + def score_pairs(self, pairs, batch_size=32): + _ = (pairs, batch_size) + return [0.8323705792427063, 1.2463066923373844e-05] + + reranked = cross_encoder_rerank( + "how does smart search route keyword queries", + [ + _make_result( + path="/repo/tests/smart-search-intent.test.js", + 
score=0.5989155769348145, + excerpt="describes how smart search routes keyword queries", + ), + _make_result( + path="/repo/src/tools/smart-search.ts", + score=0.554444432258606, + excerpt="smart search keyword routing logic", + ), + ], + DummyReranker(), + top_k=2, + ) + reranked = apply_path_penalties( + reranked, + "how does smart search route keyword queries", + test_file_penalty=0.15, + ) + + assert reranked[0].path == "/repo/src/tools/smart-search.ts" + assert reranked[0].metadata["cross_encoder_floor_reason"] == "semantic_source_path_overlap" + assert reranked[0].metadata["cross_encoder_floor_overlap_tokens"] == ["smart", "search"] + assert reranked[0].metadata["path_boost_reasons"] == ["source_path_topic_overlap"] + assert reranked[1].metadata["path_penalty_reasons"] == ["test_file"] + # ============================================================================= # Tests: get_rrf_weights # ============================================================================= diff --git a/codex-lens/tests/test_registry.py b/codex-lens/tests/test_registry.py index e8f54d0d..b610140a 100644 --- a/codex-lens/tests/test_registry.py +++ b/codex-lens/tests/test_registry.py @@ -67,3 +67,60 @@ def test_find_nearest_index(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> assert found is not None assert found.id == mapping.id + +def test_find_descendant_project_roots_returns_nested_project_roots(tmp_path: Path) -> None: + db_path = tmp_path / "registry.db" + workspace_root = tmp_path / "workspace" + child_a = workspace_root / "packages" / "app-a" + child_b = workspace_root / "tools" / "app-b" + outside_root = tmp_path / "external" + + with RegistryStore(db_path=db_path) as store: + workspace_project = store.register_project( + workspace_root, + tmp_path / "indexes" / "workspace", + ) + child_a_project = store.register_project( + child_a, + tmp_path / "indexes" / "workspace" / "packages" / "app-a", + ) + child_b_project = store.register_project( + child_b, + tmp_path / 
"indexes" / "workspace" / "tools" / "app-b", + ) + outside_project = store.register_project( + outside_root, + tmp_path / "indexes" / "external", + ) + + store.register_dir( + workspace_project.id, + workspace_root, + tmp_path / "indexes" / "workspace" / "_index.db", + depth=0, + ) + child_a_mapping = store.register_dir( + child_a_project.id, + child_a, + tmp_path / "indexes" / "workspace" / "packages" / "app-a" / "_index.db", + depth=0, + ) + child_b_mapping = store.register_dir( + child_b_project.id, + child_b, + tmp_path / "indexes" / "workspace" / "tools" / "app-b" / "_index.db", + depth=0, + ) + store.register_dir( + outside_project.id, + outside_root, + tmp_path / "indexes" / "external" / "_index.db", + depth=0, + ) + + descendants = store.find_descendant_project_roots(workspace_root) + + assert [mapping.index_path for mapping in descendants] == [ + child_a_mapping.index_path, + child_b_mapping.index_path, + ] diff --git a/codex-lens/tests/test_reranker_factory.py b/codex-lens/tests/test_reranker_factory.py index 682c410f..62647d1d 100644 --- a/codex-lens/tests/test_reranker_factory.py +++ b/codex-lens/tests/test_reranker_factory.py @@ -313,3 +313,89 @@ def test_onnx_reranker_scores_pairs_with_sigmoid_normalization( expected = [1.0 / (1.0 + math.exp(-float(i))) for i in range(len(pairs))] assert scores == pytest.approx(expected, rel=1e-6, abs=1e-6) + + +def test_onnx_reranker_splits_tuple_providers_into_provider_options( + monkeypatch: pytest.MonkeyPatch, +) -> None: + import numpy as np + + captured: dict[str, object] = {} + + dummy_onnxruntime = types.ModuleType("onnxruntime") + + dummy_optimum = types.ModuleType("optimum") + dummy_optimum.__path__ = [] + dummy_optimum_ort = types.ModuleType("optimum.onnxruntime") + + class DummyModelOutput: + def __init__(self, logits: np.ndarray) -> None: + self.logits = logits + + class DummyModel: + input_names = ["input_ids", "attention_mask"] + + def __call__(self, **inputs): + batch = 
int(inputs["input_ids"].shape[0]) + return DummyModelOutput(logits=np.zeros((batch, 1), dtype=np.float32)) + + class DummyORTModelForSequenceClassification: + @classmethod + def from_pretrained( + cls, + model_name: str, + providers=None, + provider_options=None, + **kwargs, + ): + captured["model_name"] = model_name + captured["providers"] = providers + captured["provider_options"] = provider_options + captured["kwargs"] = kwargs + return DummyModel() + + dummy_optimum_ort.ORTModelForSequenceClassification = DummyORTModelForSequenceClassification + + dummy_transformers = types.ModuleType("transformers") + + class DummyAutoTokenizer: + model_max_length = 512 + + @classmethod + def from_pretrained(cls, model_name: str, **kwargs): + _ = model_name, kwargs + return cls() + + def __call__(self, *, text, text_pair, return_tensors, **kwargs): + _ = text_pair, kwargs + assert return_tensors == "np" + batch = len(text) + return { + "input_ids": np.zeros((batch, 4), dtype=np.int64), + "attention_mask": np.ones((batch, 4), dtype=np.int64), + } + + dummy_transformers.AutoTokenizer = DummyAutoTokenizer + + monkeypatch.setitem(sys.modules, "onnxruntime", dummy_onnxruntime) + monkeypatch.setitem(sys.modules, "optimum", dummy_optimum) + monkeypatch.setitem(sys.modules, "optimum.onnxruntime", dummy_optimum_ort) + monkeypatch.setitem(sys.modules, "transformers", dummy_transformers) + + reranker = get_reranker( + backend="onnx", + model_name="dummy-model", + use_gpu=True, + providers=[ + ("DmlExecutionProvider", {"device_id": 1}), + "CPUExecutionProvider", + ], + ) + assert isinstance(reranker, ONNXReranker) + + scores = reranker.score_pairs([("q", "d")], batch_size=1) + + assert scores == pytest.approx([0.5]) + assert captured["model_name"] == "dummy-model" + assert captured["providers"] == ["DmlExecutionProvider", "CPUExecutionProvider"] + assert captured["provider_options"] == [{"device_id": 1}, {}] diff --git a/codex-lens/tests/test_search_full_coverage.py 
b/codex-lens/tests/test_search_full_coverage.py index 1de3c350..fa90ef82 100644 --- a/codex-lens/tests/test_search_full_coverage.py +++ b/codex-lens/tests/test_search_full_coverage.py @@ -428,6 +428,51 @@ class TestIndexPathCollection: assert len(paths) == 1 engine.close() + def test_collect_skips_ignored_artifact_indexes(self, mock_registry, mock_mapper, temp_dir): + """Test collection skips dist/build-style artifact subtrees.""" + root_dir = temp_dir / "project" + root_dir.mkdir() + + root_db = root_dir / "_index.db" + root_store = DirIndexStore(root_db) + root_store.initialize() + + src_dir = root_dir / "src" + src_dir.mkdir() + src_db = src_dir / "_index.db" + src_store = DirIndexStore(src_db) + src_store.initialize() + + dist_dir = root_dir / "dist" + dist_dir.mkdir() + dist_db = dist_dir / "_index.db" + dist_store = DirIndexStore(dist_db) + dist_store.initialize() + + workflow_dir = root_dir / ".workflow" + workflow_dir.mkdir() + workflow_db = workflow_dir / "_index.db" + workflow_store = DirIndexStore(workflow_db) + workflow_store.initialize() + + root_store.register_subdir(name="src", index_path=src_db) + root_store.register_subdir(name="dist", index_path=dist_db) + root_store.register_subdir(name=".workflow", index_path=workflow_db) + + root_store.close() + src_store.close() + dist_store.close() + workflow_store.close() + + engine = ChainSearchEngine(mock_registry, mock_mapper) + paths = engine._collect_index_paths(root_db, depth=-1) + + assert {path.relative_to(root_dir).as_posix() for path in paths} == { + "_index.db", + "src/_index.db", + } + engine.close() + class TestResultMergeAndRank: """Tests for _merge_and_rank method.""" @@ -490,6 +535,36 @@ class TestResultMergeAndRank: assert merged == [] engine.close() + def test_merge_applies_test_file_penalty_for_non_test_query(self, mock_registry, mock_mapper): + """Non-test queries should lightly demote test files during merge.""" + engine = ChainSearchEngine(mock_registry, mock_mapper) + + results = [ + 
SearchResult(path="/repo/tests/test_auth.py", score=10.0, excerpt="match 1"), + SearchResult(path="/repo/src/auth.py", score=9.0, excerpt="match 2"), + ] + + merged = engine._merge_and_rank(results, limit=10, query="authenticate users") + + assert merged[0].path == "/repo/src/auth.py" + assert merged[1].metadata["path_penalty_reasons"] == ["test_file"] + engine.close() + + def test_merge_applies_generated_file_penalty_for_non_artifact_query(self, mock_registry, mock_mapper): + """Non-artifact queries should lightly demote generated/build results during merge.""" + engine = ChainSearchEngine(mock_registry, mock_mapper) + + results = [ + SearchResult(path="/repo/dist/auth.js", score=10.0, excerpt="match 1"), + SearchResult(path="/repo/src/auth.ts", score=9.0, excerpt="match 2"), + ] + + merged = engine._merge_and_rank(results, limit=10, query="authenticate users") + + assert merged[0].path == "/repo/src/auth.ts" + assert merged[1].metadata["path_penalty_reasons"] == ["generated_artifact"] + engine.close() + # === Hierarchical Chain Search Tests === diff --git a/codex-lens/tests/test_staged_cascade.py b/codex-lens/tests/test_staged_cascade.py index 11c8b829..2a5f44b4 100644 --- a/codex-lens/tests/test_staged_cascade.py +++ b/codex-lens/tests/test_staged_cascade.py @@ -400,15 +400,20 @@ class TestStage4OptionalRerank: """Tests for Stage 4: Optional cross-encoder reranking.""" def test_stage4_reranks_with_reranker( - self, mock_registry, mock_mapper, mock_config + self, mock_registry, mock_mapper, temp_paths ): - """Test _stage4_optional_rerank uses _cross_encoder_rerank.""" - engine = ChainSearchEngine(mock_registry, mock_mapper, config=mock_config) + """Test _stage4_optional_rerank overfetches before final trim.""" + config = Config(data_dir=temp_paths / "data") + config.reranker_top_k = 4 + config.reranking_top_k = 4 + engine = ChainSearchEngine(mock_registry, mock_mapper, config=config) results = [ SearchResult(path="a.py", score=0.9, excerpt="a"), 
SearchResult(path="b.py", score=0.8, excerpt="b"), SearchResult(path="c.py", score=0.7, excerpt="c"), + SearchResult(path="d.py", score=0.6, excerpt="d"), + SearchResult(path="e.py", score=0.5, excerpt="e"), ] # Mock the _cross_encoder_rerank method that _stage4 calls @@ -416,12 +421,14 @@ class TestStage4OptionalRerank: mock_rerank.return_value = [ SearchResult(path="c.py", score=0.95, excerpt="c"), SearchResult(path="a.py", score=0.85, excerpt="a"), + SearchResult(path="d.py", score=0.83, excerpt="d"), + SearchResult(path="e.py", score=0.81, excerpt="e"), ] reranked = engine._stage4_optional_rerank("query", results, k=2) - mock_rerank.assert_called_once_with("query", results, 2) - assert len(reranked) <= 2 + mock_rerank.assert_called_once_with("query", results, 4) + assert len(reranked) == 4 # First result should be reranked winner assert reranked[0].path == "c.py" @@ -633,6 +640,113 @@ class TestStagedCascadeIntegration: a_result = next(r for r in result.results if r.path == "a.py") assert a_result.score == 0.9 + def test_staged_cascade_expands_stage3_target_for_rerank_budget( + self, mock_registry, mock_mapper, temp_paths + ): + """Test staged cascade preserves enough Stage 3 reps for rerank budget.""" + config = Config(data_dir=temp_paths / "data") + config.enable_staged_rerank = True + config.reranker_top_k = 6 + config.reranking_top_k = 6 + + engine = ChainSearchEngine(mock_registry, mock_mapper, config=config) + expanded_results = [ + SearchResult(path=f"src/file-{index}.ts", score=1.0 - (index * 0.01), excerpt="x") + for index in range(8) + ] + + with patch.object(engine, "_find_start_index") as mock_find: + mock_find.return_value = temp_paths / "index" / "_index.db" + + with patch.object(engine, "_collect_index_paths") as mock_collect: + mock_collect.return_value = [temp_paths / "index" / "_index.db"] + + with patch.object(engine, "_stage1_binary_search") as mock_stage1: + mock_stage1.return_value = ( + [SearchResult(path="seed.ts", score=0.9, 
excerpt="seed")], + temp_paths / "index", + ) + + with patch.object(engine, "_stage2_lsp_expand") as mock_stage2: + mock_stage2.return_value = expanded_results + + with patch.object(engine, "_stage3_cluster_prune") as mock_stage3: + mock_stage3.return_value = expanded_results[:6] + + with patch.object(engine, "_stage4_optional_rerank") as mock_stage4: + mock_stage4.return_value = expanded_results[:2] + + engine.staged_cascade_search( + "query", + temp_paths / "src", + k=2, + coarse_k=20, + ) + + mock_stage3.assert_called_once_with( + expanded_results, + 6, + query="query", + ) + + def test_staged_cascade_overfetches_rerank_before_final_trim( + self, mock_registry, mock_mapper, temp_paths + ): + """Test staged rerank keeps enough candidates for path penalties to work.""" + config = Config(data_dir=temp_paths / "data") + config.enable_staged_rerank = True + config.reranker_top_k = 4 + config.reranking_top_k = 4 + config.test_file_penalty = 0.15 + config.generated_file_penalty = 0.35 + + engine = ChainSearchEngine(mock_registry, mock_mapper, config=config) + + src_primary = str(temp_paths / "src" / "tools" / "smart-search.ts") + src_secondary = str(temp_paths / "src" / "tools" / "codex-lens.ts") + test_primary = str(temp_paths / "tests" / "integration" / "cli-routes.test.ts") + test_secondary = str( + temp_paths / "frontend" / "tests" / "e2e" / "prompt-memory.spec.ts" + ) + query = "parse CodexLens JSON output strip ANSI smart_search" + clustered_results = [ + SearchResult(path=test_primary, score=0.98, excerpt="test"), + SearchResult(path=test_secondary, score=0.97, excerpt="test"), + SearchResult(path=src_primary, score=0.96, excerpt="source"), + SearchResult(path=src_secondary, score=0.95, excerpt="source"), + ] + + with patch.object(engine, "_find_start_index") as mock_find: + mock_find.return_value = temp_paths / "index" / "_index.db" + + with patch.object(engine, "_collect_index_paths") as mock_collect: + mock_collect.return_value = [temp_paths / "index" / 
"_index.db"] + + with patch.object(engine, "_stage1_binary_search") as mock_stage1: + mock_stage1.return_value = ( + [SearchResult(path=src_primary, score=0.9, excerpt="seed")], + temp_paths / "index", + ) + + with patch.object(engine, "_stage2_lsp_expand") as mock_stage2: + mock_stage2.return_value = clustered_results + + with patch.object(engine, "_stage3_cluster_prune") as mock_stage3: + mock_stage3.return_value = clustered_results + + with patch.object(engine, "_cross_encoder_rerank") as mock_rerank: + mock_rerank.return_value = clustered_results + + result = engine.staged_cascade_search( + query, + temp_paths / "src", + k=2, + coarse_k=20, + ) + + mock_rerank.assert_called_once_with(query, clustered_results, 4) + assert [item.path for item in result.results] == [src_primary, src_secondary] + # ============================================================================= # Graceful Degradation Tests