Implement database migration framework and performance optimizations

- Added Active Memory configuration for manual sync interval and Gemini tool selection.
- Created file modification rules for handling edits and writes.
- Implemented a migration manager for versioned database schema migrations.
- Added migration 001 to normalize keywords into separate tables.
- Added tests validating the performance optimizations: keyword normalization, path lookup, and symbol search.
- Created a validation script to manually verify the optimization implementations.
This commit is contained in:
catlog22
2025-12-14 18:08:32 +08:00
parent 79a2953862
commit 0529b57694
18 changed files with 2085 additions and 545 deletions

View File

@@ -1,36 +1,433 @@
# CLI Tools Usage Rules
# Intelligent Tools Selection Strategy
## Tool Selection
## Table of Contents
1. [Quick Reference](#quick-reference)
2. [Tool Specifications](#tool-specifications)
3. [Prompt Template](#prompt-template)
4. [CLI Execution](#cli-execution)
5. [Configuration](#configuration)
6. [Best Practices](#best-practices)
---
## Quick Reference
### Quick Decision Tree
```
┌─ Task Analysis/Documentation?
│ └─→ Use Gemini (Fallback: Codex, Qwen)
│ └─→ MODE: analysis (default, read-only)
└─ Task Implementation/Bug Fix?
   └─→ Use Codex (Fallback: Gemini, Qwen)
   └─→ MODE: auto (full operations) or write (file operations)
```
### Universal Prompt Template
```
PURPOSE: [what] + [why] + [success criteria] + [constraints/scope]
TASK: • [step 1: specific action] • [step 2: specific action] • [step 3: specific action]
MODE: [analysis|write|auto]
CONTEXT: @[file patterns] | Memory: [session/tech/module context]
EXPECTED: [deliverable format] + [quality criteria] + [structure requirements]
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/[category]/[template].txt) | [domain constraints] | MODE=[permission]
```
### Intent Capture Checklist (Before CLI Execution)
**⚠️ CRITICAL**: Before executing any CLI command, verify these intent dimensions:
**Intent Validation Questions**:
- [ ] Is the objective specific and measurable?
- [ ] Are success criteria defined?
- [ ] Is the scope clearly bounded?
- [ ] Are constraints and limitations stated?
- [ ] Is the expected output format clear?
- [ ] Is the action level (read/write) explicit?
## Tool Selection Matrix
| Task Category | Tool | MODE | When to Use |
|---------------|------|------|-------------|
| **Read/Analyze** | Gemini/Qwen | `analysis` | Code review, architecture analysis, pattern discovery, exploration |
| **Write/Create** | Gemini/Qwen | `write` | Documentation generation, file creation (non-code) |
| **Implement/Fix** | Codex | `auto` | Feature implementation, bug fixes, test creation, refactoring |
## Essential Command Structure
```bash
ccw cli exec "<PROMPT>" --tool <gemini|qwen|codex> --mode <analysis|write|auto>
```
### Core Principles
- **Use tools early and often** - Tools are faster and more thorough
- **Unified CLI** - Always use `ccw cli exec` for consistent parameter handling
- **One template required** - ALWAYS reference exactly ONE template in RULES (use universal fallback if no specific match)
- **Write protection** - Require EXPLICIT `--mode write` or `--mode auto`
- **No escape characters** - NEVER use `\$`, `\"`, `\'` in CLI commands
---
## Tool Specifications
### MODE Options
| Mode | Permission | Use For | Specification |
|------|------------|---------|---------------|
| `analysis` | Read-only (default) | Code review, architecture analysis, pattern discovery | Auto for Gemini/Qwen |
| `write` | Create/Modify/Delete | Documentation, code creation, file modifications | Requires `--mode write` |
| `auto` | Full operations | Feature implementation, bug fixes, autonomous development | Codex only, requires `--mode auto` |
### Gemini & Qwen
**Use for**: Analysis, documentation, code exploration, architecture review
- Default MODE: `analysis` (read-only)
- Prefer Gemini; use Qwen as fallback
**Via CCW**: `ccw cli exec "<prompt>" --tool gemini` or `--tool qwen`
**Characteristics**:
- Large context window, pattern recognition
**Models** (override via `--model`):
- Gemini: `gemini-2.5-pro`
- Qwen: `coder-model`, `vision-model`
**Error Handling**: An HTTP 429 response may surface as an error even though results were returned - check whether results exist before retrying
### Codex
**Use for**: Feature implementation, bug fixes, autonomous development
- Requires explicit `--mode auto` or `--mode write`
**Via CCW**: `ccw cli exec "<prompt>" --tool codex --mode auto`
**Characteristics**:
- Autonomous development, mathematical reasoning
- Best for: Implementation, testing, automation
- No default MODE - must explicitly specify `--mode write` or `--mode auto`
**Models**: `gpt-5.2`
### Session Resume
**Resume via `--resume` parameter**:
```bash
ccw cli exec "Continue analyzing" --resume # Resume last session
ccw cli exec "Fix issues found" --resume <id> # Resume specific session
```
| Value | Description |
|-------|-------------|
| `--resume` (empty) | Resume most recent session |
| `--resume <id>` | Resume specific execution ID |
**Context Assembly** (automatic):
```
=== PREVIOUS CONVERSATION ===
USER PROMPT: [Previous prompt]
ASSISTANT RESPONSE: [Previous output]
=== CONTINUATION ===
[Your new prompt]
```
**Tool Behavior**: Codex uses native `codex resume`; Gemini/Qwen assemble the previous context into a single prompt
---
## Prompt Template
### Template Structure
Every command MUST include these fields:
| Field | Purpose | Components | Bad Example | Good Example |
|-------|---------|------------|-------------|--------------|
| **PURPOSE** | Goal + motivation + success | What + Why + Success Criteria + Constraints | "Analyze code" | "Identify security vulnerabilities in auth module to pass compliance audit; success = all OWASP Top 10 addressed; scope = src/auth/** only" |
| **TASK** | Actionable steps | Specific verbs + targets | "• Review code • Find issues" | "• Scan for SQL injection in query builders • Check XSS in template rendering • Verify CSRF token validation" |
| **MODE** | Permission level | analysis / write / auto | (missing) | "analysis" or "write" |
| **CONTEXT** | File scope + history | File patterns + Memory | "@**/*" | "@src/auth/**/*.ts @shared/utils/security.ts \| Memory: Previous auth refactoring (WFS-001)" |
| **EXPECTED** | Output specification | Format + Quality + Structure | "Report" | "Markdown report with: severity levels (Critical/High/Medium/Low), file:line references, remediation code snippets, priority ranking" |
| **RULES** | Template + constraints | $(cat template) + domain rules | (missing) | "$(cat ~/.claude/.../security.txt) \| Focus on authentication \| Ignore test files \| analysis=READ-ONLY" |
### CONTEXT Configuration
**Format**: `CONTEXT: [file patterns] | Memory: [memory context]`
#### File Patterns
| Pattern | Scope |
|---------|-------|
| `@**/*` | All files (default) |
| `@src/**/*.ts` | TypeScript in src |
| `@../shared/**/*` | Sibling directory (requires `--includeDirs`) |
| `@CLAUDE.md` | Specific file |
#### Memory Context
Include when building on previous work:
```bash
# Cross-task reference
Memory: Building on auth refactoring (commit abc123), implementing refresh tokens
# Cross-module integration
Memory: Integration with auth module, using shared error patterns from @shared/utils/errors.ts
```
**Memory Sources**:
- **Related Tasks**: Previous refactoring, extensions, conflict resolution
- **Tech Stack Patterns**: Framework conventions, security guidelines
- **Cross-Module References**: Integration points, shared utilities, type dependencies
#### Pattern Discovery Workflow
For complex requirements, discover files BEFORE CLI execution:
```bash
# Step 1: Discover files
rg "export.*Component" --files-with-matches --type ts
# Step 2: Build CONTEXT
CONTEXT: @components/Auth.tsx @types/auth.d.ts | Memory: Previous type refactoring
# Step 3: Execute CLI
ccw cli exec "..." --tool gemini --cd src
```
### RULES Configuration
**Format**: `RULES: $(cat ~/.claude/workflows/cli-templates/prompts/[category]/[template].txt) | [constraints]`
**⚠️ MANDATORY**: Exactly ONE template reference is REQUIRED. Select from Task-Template Matrix or use universal fallback:
- `universal/00-universal-rigorous-style.txt` - For precision-critical tasks (default fallback)
- `universal/00-universal-creative-style.txt` - For exploratory tasks
**Command Substitution Rules**:
- Use `$(cat ...)` directly - do NOT read template content first
- NEVER use escape characters: `\$`, `\"`, `\'`
- Tilde expands correctly in prompt context
**Examples**:
```bash
# Specific template (preferred)
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Focus on auth | analysis=READ-ONLY
# Universal fallback (when no specific template matches)
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/universal/00-universal-rigorous-style.txt) | Focus on security patterns | analysis=READ-ONLY
```
### Template System
**Base Path**: `~/.claude/workflows/cli-templates/prompts/`
**Naming Convention**:
- `00-*` - Universal fallbacks (when no specific match)
- `01-*` - Universal, high-frequency
- `02-*` - Common specialized
- `03-*` - Domain-specific
**Universal Templates**:
| Template | Use For |
|----------|---------|
| `universal/00-universal-rigorous-style.txt` | Precision-critical, systematic methodology |
| `universal/00-universal-creative-style.txt` | Exploratory, innovative solutions |
**Task-Template Matrix**:
| Task Type | Template |
|-----------|----------|
| **Analysis** | |
| Execution Tracing | `analysis/01-trace-code-execution.txt` |
| Bug Diagnosis | `analysis/01-diagnose-bug-root-cause.txt` |
| Code Patterns | `analysis/02-analyze-code-patterns.txt` |
| Document Analysis | `analysis/02-analyze-technical-document.txt` |
| Architecture Review | `analysis/02-review-architecture.txt` |
| Code Review | `analysis/02-review-code-quality.txt` |
| Performance | `analysis/03-analyze-performance.txt` |
| Security | `analysis/03-assess-security-risks.txt` |
| **Planning** | |
| Architecture | `planning/01-plan-architecture-design.txt` |
| Task Breakdown | `planning/02-breakdown-task-steps.txt` |
| Component Design | `planning/02-design-component-spec.txt` |
| Migration | `planning/03-plan-migration-strategy.txt` |
| **Development** | |
| Feature | `development/02-implement-feature.txt` |
| Refactoring | `development/02-refactor-codebase.txt` |
| Tests | `development/02-generate-tests.txt` |
| UI Component | `development/02-implement-component-ui.txt` |
| Debugging | `development/03-debug-runtime-issues.txt` |
---
## CLI Execution
### Command Options
| Option | Description | Default |
|--------|-------------|---------|
| `--tool <tool>` | gemini, qwen, codex | gemini |
| `--mode <mode>` | analysis, write, auto | analysis |
| `--model <model>` | Model override | auto-select |
| `--cd <path>` | Working directory | current |
| `--includeDirs <dirs>` | Additional directories (comma-separated) | none |
| `--timeout <ms>` | Timeout in milliseconds | 300000 |
| `--resume [id]` | Resume previous session | - |
| `--no-stream` | Disable streaming | false |
### Directory Configuration
#### Working Directory (`--cd`)
When using `--cd`:
- `@**/*` = Files within working directory tree only
- CANNOT reference parent/sibling via @ alone
- Must use `--includeDirs` for external directories
#### Include Directories (`--includeDirs`)
**TWO-STEP requirement for external files**:
1. Add `--includeDirs` parameter
2. Reference in CONTEXT with @ patterns
```bash
# Single directory
ccw cli exec "CONTEXT: @**/* @../shared/**/*" --cd src/auth --includeDirs ../shared
# Multiple directories
ccw cli exec "..." --cd src/auth --includeDirs ../shared,../types,../utils
```
**Rule**: If CONTEXT contains `@../dir/**/*`, MUST include `--includeDirs ../dir`
**Benefits**: Excludes unrelated directories, reduces token usage
### CCW Parameter Mapping
CCW automatically maps to tool-specific syntax:
| CCW Parameter | Gemini/Qwen | Codex |
|---------------|-------------|-------|
| `--cd <path>` | `cd <path> &&` | `-C <path>` |
| `--includeDirs <dirs>` | `--include-directories` | `--add-dir` (per dir) |
| `--mode write` | `--approval-mode yolo` | `-s danger-full-access` |
| `--mode auto` | N/A | `-s danger-full-access` |
### Command Examples
#### Task-Type Specific Templates
**Analysis Task** (Security Audit):
```bash
ccw cli exec "
PURPOSE: Identify OWASP Top 10 vulnerabilities in authentication module to pass security audit; success = all critical/high issues documented with remediation
TASK: • Scan for injection flaws (SQL, command, LDAP) • Check authentication bypass vectors • Evaluate session management • Assess sensitive data exposure
MODE: analysis
CONTEXT: @src/auth/**/* @src/middleware/auth.ts | Memory: Using bcrypt for passwords, JWT for sessions
EXPECTED: Security report with: severity matrix, file:line references, CVE mappings where applicable, remediation code snippets prioritized by risk
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/03-assess-security-risks.txt) | Focus on authentication | Ignore test files | analysis=READ-ONLY
" --tool gemini --cd src/auth --timeout 600000
```
**Implementation Task** (New Feature):
```bash
ccw cli exec "
PURPOSE: Implement rate limiting for API endpoints to prevent abuse; must be configurable per-endpoint; backward compatible with existing clients
TASK: • Create rate limiter middleware with sliding window • Implement per-route configuration • Add Redis backend for distributed state • Include bypass for internal services
MODE: auto
CONTEXT: @src/middleware/**/* @src/config/**/* | Memory: Using Express.js, Redis already configured, existing middleware pattern in auth.ts
EXPECTED: Production-ready code with: TypeScript types, unit tests, integration test, configuration example, migration guide
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/development/02-implement-feature.txt) | Follow existing middleware patterns | No breaking changes | auto=FULL
" --tool codex --mode auto --timeout 1800000
```
**Bug Fix Task**:
```bash
ccw cli exec "
PURPOSE: Fix memory leak in WebSocket connection handler causing server OOM after 24h; root cause must be identified before any fix
TASK: • Trace connection lifecycle from open to close • Identify event listener accumulation • Check cleanup on disconnect • Verify garbage collection eligibility
MODE: analysis
CONTEXT: @src/websocket/**/* @src/services/connection-manager.ts | Memory: Using ws library, ~5000 concurrent connections in production
EXPECTED: Root cause analysis with: memory profile, leak source (file:line), fix recommendation with code, verification steps
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Focus on resource cleanup | analysis=READ-ONLY
" --tool gemini --cd src --timeout 900000
```
**Refactoring Task**:
```bash
ccw cli exec "
PURPOSE: Refactor payment processing to use strategy pattern for multi-gateway support; no functional changes; all existing tests must pass
TASK: • Extract gateway interface from current implementation • Create strategy classes for Stripe, PayPal • Implement factory for gateway selection • Migrate existing code to use strategies
MODE: write
CONTEXT: @src/payments/**/* @src/types/payment.ts | Memory: Currently only Stripe, adding PayPal next sprint, must support future gateways
EXPECTED: Refactored code with: strategy interface, concrete implementations, factory class, updated tests, migration checklist
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/development/02-refactor-codebase.txt) | Preserve all existing behavior | Tests must pass | write=CREATE/MODIFY/DELETE
" --tool gemini --mode write --timeout 1200000
```
---
## Configuration
### Timeout Allocation
**Minimum**: 5 minutes (300000ms)
| Complexity | Range | Examples |
|------------|-------|----------|
| Simple | 5-10min (300000-600000ms) | Analysis, search |
| Medium | 10-20min (600000-1200000ms) | Refactoring, documentation |
| Complex | 20-60min (1200000-3600000ms) | Implementation, migration |
| Heavy | 60-120min (3600000-7200000ms) | Large codebase, multi-file |
**Codex Multiplier**: 3x allocated time (minimum 15min / 900000ms)
```bash
ccw cli exec "<prompt>" --tool gemini --timeout 600000 # 10 min
ccw cli exec "<prompt>" --tool codex --timeout 1800000 # 30 min
```
### Permission Framework
**Single-Use Authorization**: Each execution requires explicit user instruction. Previous authorization does NOT carry over.
**Mode Hierarchy**:
- `analysis` (default): Read-only, safe for auto-execution
- `write`: Requires explicit `--mode write` - creates/modifies/deletes files
- `auto`: Requires explicit `--mode auto` - full autonomous operations (Codex only)
- **Exception**: User provides clear instructions like "modify", "create", "implement"
---
## Best Practices
### Workflow Principles
- **Use CCW unified interface** for all executions
- **Always include template** - Use Task-Template Matrix or universal fallback
- **Be specific** - Clear PURPOSE, TASK, EXPECTED fields
- **Include constraints** - File patterns, scope in RULES
- **Leverage memory context** when building on previous work
- **Discover patterns first** - Use rg/MCP before CLI execution
- **Default to full context** - Use `@**/*` unless specific files needed
### Workflow Integration
| Phase | Command |
|-------|---------|
| Understanding | `ccw cli exec "<prompt>" --tool gemini` |
| Architecture | `ccw cli exec "<prompt>" --tool gemini` |
| Implementation | `ccw cli exec "<prompt>" --tool codex --mode auto` |
| Quality | `ccw cli exec "<prompt>" --tool codex --mode write` |
### Planning Checklist
- [ ] **Purpose defined** - Clear goal and intent
- [ ] **Mode selected** - `--mode analysis|write|auto`
- [ ] **Context gathered** - File references + memory (default `@**/*`)
- [ ] **Directory navigation** - `--cd` and/or `--includeDirs`
- [ ] **Tool selected** - `--tool gemini|qwen|codex`
- [ ] **Template applied (REQUIRED)** - Use specific or universal fallback template
- [ ] **Constraints specified** - Scope, requirements
- [ ] **Timeout configured** - Based on complexity

View File

@@ -5,3 +5,42 @@ Before implementation, always:
- Identify 3+ existing similar patterns before implementation
- Map dependencies and integration points
- Understand testing framework and coding conventions
## Context Gathering
### Use Exa
- Researching external APIs, libraries, frameworks
- Need recent documentation beyond knowledge cutoff
- Looking for implementation examples in public repos
- User mentions specific library/framework names
- Questions about "best practices" or "how does X work"
### Use read_file (MCP)
- Reading multiple related files at once
- Directory traversal with pattern matching
- Searching file content with regex
- Need to limit depth/file count for large directories
- Batch operations on multiple files
- Pattern-based filtering (glob + content regex)
### Use codex_lens
- Large codebase (>500 files) requiring repeated searches
- Need semantic understanding of code relationships
- Working across multiple sessions
- Symbol-level navigation needed
- Finding all implementations of interface/class
- Tracking function calls across codebase
### Use smart_search
- Unknown file locations
- Concept/semantic search ("authentication logic", "payment processing")
- Medium-sized codebase (100-500 files)
- One-time or infrequent searches
- Natural language queries about code structure
**Mode Selection**:
- `auto`: Let tool decide (default)
- `exact`: Known exact pattern
- `fuzzy`: Typo-tolerant search
- `semantic`: Concept-based search
- `graph`: Dependency analysis

View File

@@ -1,44 +1,3 @@
# Tool Selection Rules
## Context Gathering
### Use Exa
- Researching external APIs, libraries, frameworks
- Need recent documentation beyond knowledge cutoff
- Looking for implementation examples in public repos
- User mentions specific library/framework names
- Questions about "best practices" or "how does X work"
### Use read_file (MCP)
- Reading multiple related files at once
- Directory traversal with pattern matching
- Searching file content with regex
- Need to limit depth/file count for large directories
- Batch operations on multiple files
- Pattern-based filtering (glob + content regex)
### Use codex_lens
- Large codebase (>500 files) requiring repeated searches
- Need semantic understanding of code relationships
- Working across multiple sessions
- Symbol-level navigation needed
- Finding all implementations of interface/class
- Tracking function calls across codebase
### Use smart_search
- Unknown file locations
- Concept/semantic search ("authentication logic", "payment processing")
- Medium-sized codebase (100-500 files)
- One-time or infrequent searches
- Natural language queries about code structure
**Mode Selection**:
- `auto`: Let tool decide (default)
- `exact`: Known exact pattern
- `fuzzy`: Typo-tolerant search
- `semantic`: Concept-based search
- `graph`: Dependency analysis
## File Modification
### Use edit_file (MCP)

View File

@@ -1,431 +0,0 @@
# Intelligent Tools Selection Strategy
## Table of Contents
1. [Quick Reference](#quick-reference)
2. [Tool Specifications](#tool-specifications)
3. [Prompt Template](#prompt-template)
4. [CLI Execution](#cli-execution)
5. [Configuration](#configuration)
6. [Best Practices](#best-practices)
---
## Quick Reference
### Universal Prompt Template
```
PURPOSE: [what] + [why] + [success criteria] + [constraints/scope]
TASK: • [step 1: specific action] • [step 2: specific action] • [step 3: specific action]
MODE: [analysis|write|auto]
CONTEXT: @[file patterns] | Memory: [session/tech/module context]
EXPECTED: [deliverable format] + [quality criteria] + [structure requirements]
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/[category]/[template].txt) | [domain constraints] | MODE=[permission]
```
### Intent Capture Checklist (Before CLI Execution)
**⚠️ CRITICAL**: Before executing any CLI command, verify these intent dimensions:
**Intent Validation Questions**:
- [ ] Is the objective specific and measurable?
- [ ] Are success criteria defined?
- [ ] Is the scope clearly bounded?
- [ ] Are constraints and limitations stated?
- [ ] Is the expected output format clear?
- [ ] Is the action level (read/write) explicit?
### Tool Selection
| Task Type | Tool | Fallback |
|-----------|------|----------|
| Analysis/Documentation | Gemini | Qwen |
| Implementation/Testing | Codex | - |
### CCW Command Syntax
```bash
ccw cli exec "<prompt>" --tool <gemini|qwen|codex> --mode <analysis|write|auto>
ccw cli exec "<prompt>" --tool gemini --cd <path> --includeDirs <dirs>
ccw cli exec "<prompt>" --resume [id] # Resume previous session
```
### CLI Subcommands
| Command | Description |
|---------|-------------|
| `ccw cli status` | Check CLI tools availability |
| `ccw cli exec "<prompt>"` | Execute a CLI tool |
| `ccw cli exec "<prompt>" --resume [id]` | Resume a previous session |
| `ccw cli history` | Show execution history |
| `ccw cli detail <id>` | Show execution detail |
### Core Principles
- **Use tools early and often** - Tools are faster and more thorough
- **Unified CLI** - Always use `ccw cli exec` for consistent parameter handling
- **One template required** - ALWAYS reference exactly ONE template in RULES (use universal fallback if no specific match)
- **Write protection** - Require EXPLICIT `--mode write` or `--mode auto`
- **No escape characters** - NEVER use `\$`, `\"`, `\'` in CLI commands
---
## Tool Specifications
### MODE Options
| Mode | Permission | Use For | Specification |
|------|------------|---------|---------------|
| `analysis` | Read-only (default) | Code review, architecture analysis, pattern discovery | Auto for Gemini/Qwen |
| `write` | Create/Modify/Delete | Documentation, code creation, file modifications | Requires `--mode write` |
| `auto` | Full operations | Feature implementation, bug fixes, autonomous development | Codex only, requires `--mode auto` |
### Gemini & Qwen
**Via CCW**: `ccw cli exec "<prompt>" --tool gemini` or `--tool qwen`
**Characteristics**:
- Large context window, pattern recognition
- Best for: Analysis, documentation, code exploration, architecture review
- Default MODE: `analysis` (read-only)
- Priority: Prefer Gemini; use Qwen as fallback
**Models** (override via `--model`):
- Gemini: `gemini-2.5-pro`
- Qwen: `coder-model`, `vision-model`
**Error Handling**: HTTP 429 may show error but still return results - check if results exist
### Codex
**Via CCW**: `ccw cli exec "<prompt>" --tool codex --mode auto`
**Characteristics**:
- Autonomous development, mathematical reasoning
- Best for: Implementation, testing, automation
- No default MODE - must explicitly specify `--mode write` or `--mode auto`
**Models**: `gpt-5.2`
### Session Resume
**Resume via `--resume` parameter**:
```bash
ccw cli exec "Continue analyzing" --resume # Resume last session
ccw cli exec "Fix issues found" --resume <id> # Resume specific session
```
| Value | Description |
|-------|-------------|
| `--resume` (empty) | Resume most recent session |
| `--resume <id>` | Resume specific execution ID |
**Context Assembly** (automatic):
```
=== PREVIOUS CONVERSATION ===
USER PROMPT: [Previous prompt]
ASSISTANT RESPONSE: [Previous output]
=== CONTINUATION ===
[Your new prompt]
```
**Tool Behavior**: Codex uses native `codex resume`; Gemini/Qwen assembles context as single prompt
---
## Prompt Template
### Template Structure
Every command MUST include these fields:
| Field | Purpose | Components | Bad Example | Good Example |
|-------|---------|------------|-------------|--------------|
| **PURPOSE** | Goal + motivation + success | What + Why + Success Criteria + Constraints | "Analyze code" | "Identify security vulnerabilities in auth module to pass compliance audit; success = all OWASP Top 10 addressed; scope = src/auth/** only" |
| **TASK** | Actionable steps | Specific verbs + targets | "• Review code • Find issues" | "• Scan for SQL injection in query builders • Check XSS in template rendering • Verify CSRF token validation" |
| **MODE** | Permission level | analysis / write / auto | (missing) | "analysis" or "write" |
| **CONTEXT** | File scope + history | File patterns + Memory | "@**/*" | "@src/auth/**/*.ts @shared/utils/security.ts \| Memory: Previous auth refactoring (WFS-001)" |
| **EXPECTED** | Output specification | Format + Quality + Structure | "Report" | "Markdown report with: severity levels (Critical/High/Medium/Low), file:line references, remediation code snippets, priority ranking" |
| **RULES** | Template + constraints | $(cat template) + domain rules | (missing) | "$(cat ~/.claude/.../security.txt) \| Focus on authentication \| Ignore test files \| analysis=READ-ONLY" |
### CONTEXT Configuration
**Format**: `CONTEXT: [file patterns] | Memory: [memory context]`
#### File Patterns
| Pattern | Scope |
|---------|-------|
| `@**/*` | All files (default) |
| `@src/**/*.ts` | TypeScript in src |
| `@../shared/**/*` | Sibling directory (requires `--includeDirs`) |
| `@CLAUDE.md` | Specific file |
#### Memory Context
Include when building on previous work:
```bash
# Cross-task reference
Memory: Building on auth refactoring (commit abc123), implementing refresh tokens
# Cross-module integration
Memory: Integration with auth module, using shared error patterns from @shared/utils/errors.ts
```
**Memory Sources**:
- **Related Tasks**: Previous refactoring, extensions, conflict resolution
- **Tech Stack Patterns**: Framework conventions, security guidelines
- **Cross-Module References**: Integration points, shared utilities, type dependencies
#### Pattern Discovery Workflow
For complex requirements, discover files BEFORE CLI execution:
```bash
# Step 1: Discover files
rg "export.*Component" --files-with-matches --type ts
# Step 2: Build CONTEXT
CONTEXT: @components/Auth.tsx @types/auth.d.ts | Memory: Previous type refactoring
# Step 3: Execute CLI
ccw cli exec "..." --tool gemini --cd src
```
### RULES Configuration
**Format**: `RULES: $(cat ~/.claude/workflows/cli-templates/prompts/[category]/[template].txt) | [constraints]`
**⚠️ MANDATORY**: Exactly ONE template reference is REQUIRED. Select from Task-Template Matrix or use universal fallback:
- `universal/00-universal-rigorous-style.txt` - For precision-critical tasks (default fallback)
- `universal/00-universal-creative-style.txt` - For exploratory tasks
**Command Substitution Rules**:
- Use `$(cat ...)` directly - do NOT read template content first
- NEVER use escape characters: `\$`, `\"`, `\'`
- Tilde expands correctly in prompt context
**Examples**:
```bash
# Specific template (preferred)
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Focus on auth | analysis=READ-ONLY
# Universal fallback (when no specific template matches)
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/universal/00-universal-rigorous-style.txt) | Focus on security patterns | analysis=READ-ONLY
```
### Template System
**Base Path**: `~/.claude/workflows/cli-templates/prompts/`
**Naming Convention**:
- `00-*` - Universal fallbacks (when no specific match)
- `01-*` - Universal, high-frequency
- `02-*` - Common specialized
- `03-*` - Domain-specific
**Universal Templates**:
| Template | Use For |
|----------|---------|
| `universal/00-universal-rigorous-style.txt` | Precision-critical, systematic methodology |
| `universal/00-universal-creative-style.txt` | Exploratory, innovative solutions |
**Task-Template Matrix**:
| Task Type | Template |
|-----------|----------|
| **Analysis** | |
| Execution Tracing | `analysis/01-trace-code-execution.txt` |
| Bug Diagnosis | `analysis/01-diagnose-bug-root-cause.txt` |
| Code Patterns | `analysis/02-analyze-code-patterns.txt` |
| Document Analysis | `analysis/02-analyze-technical-document.txt` |
| Architecture Review | `analysis/02-review-architecture.txt` |
| Code Review | `analysis/02-review-code-quality.txt` |
| Performance | `analysis/03-analyze-performance.txt` |
| Security | `analysis/03-assess-security-risks.txt` |
| **Planning** | |
| Architecture | `planning/01-plan-architecture-design.txt` |
| Task Breakdown | `planning/02-breakdown-task-steps.txt` |
| Component Design | `planning/02-design-component-spec.txt` |
| Migration | `planning/03-plan-migration-strategy.txt` |
| **Development** | |
| Feature | `development/02-implement-feature.txt` |
| Refactoring | `development/02-refactor-codebase.txt` |
| Tests | `development/02-generate-tests.txt` |
| UI Component | `development/02-implement-component-ui.txt` |
| Debugging | `development/03-debug-runtime-issues.txt` |
---
## CLI Execution
### Command Options
| Option | Description | Default |
|--------|-------------|---------|
| `--tool <tool>` | gemini, qwen, codex | gemini |
| `--mode <mode>` | analysis, write, auto | analysis |
| `--model <model>` | Model override | auto-select |
| `--cd <path>` | Working directory | current |
| `--includeDirs <dirs>` | Additional directories (comma-separated) | none |
| `--timeout <ms>` | Timeout in milliseconds | 300000 |
| `--resume [id]` | Resume previous session | - |
| `--no-stream` | Disable streaming | false |
### Directory Configuration
#### Working Directory (`--cd`)
When using `--cd`:
- `@**/*` = Files within working directory tree only
- CANNOT reference parent/sibling via @ alone
- Must use `--includeDirs` for external directories
#### Include Directories (`--includeDirs`)
**TWO-STEP requirement for external files**:
1. Add `--includeDirs` parameter
2. Reference in CONTEXT with @ patterns
```bash
# Single directory
ccw cli exec "CONTEXT: @**/* @../shared/**/*" --cd src/auth --includeDirs ../shared
# Multiple directories
ccw cli exec "..." --cd src/auth --includeDirs ../shared,../types,../utils
```
**Rule**: If CONTEXT contains `@../dir/**/*`, MUST include `--includeDirs ../dir`
**Benefits**: Excludes unrelated directories, reduces token usage
### CCW Parameter Mapping
CCW automatically maps to tool-specific syntax:
| CCW Parameter | Gemini/Qwen | Codex |
|---------------|-------------|-------|
| `--cd <path>` | `cd <path> &&` | `-C <path>` |
| `--includeDirs <dirs>` | `--include-directories` | `--add-dir` (per dir) |
| `--mode write` | `--approval-mode yolo` | `-s danger-full-access` |
| `--mode auto` | N/A | `-s danger-full-access` |
### Command Examples
#### Task-Type Specific Templates
**Analysis Task** (Security Audit):
```bash
ccw cli exec "
PURPOSE: Identify OWASP Top 10 vulnerabilities in authentication module to pass security audit; success = all critical/high issues documented with remediation
TASK: • Scan for injection flaws (SQL, command, LDAP) • Check authentication bypass vectors • Evaluate session management • Assess sensitive data exposure
MODE: analysis
CONTEXT: @src/auth/**/* @src/middleware/auth.ts | Memory: Using bcrypt for passwords, JWT for sessions
EXPECTED: Security report with: severity matrix, file:line references, CVE mappings where applicable, remediation code snippets prioritized by risk
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/03-assess-security-risks.txt) | Focus on authentication | Ignore test files | analysis=READ-ONLY
" --tool gemini --cd src/auth --timeout 600000
```
**Implementation Task** (New Feature):
```bash
ccw cli exec "
PURPOSE: Implement rate limiting for API endpoints to prevent abuse; must be configurable per-endpoint; backward compatible with existing clients
TASK: • Create rate limiter middleware with sliding window • Implement per-route configuration • Add Redis backend for distributed state • Include bypass for internal services
MODE: auto
CONTEXT: @src/middleware/**/* @src/config/**/* | Memory: Using Express.js, Redis already configured, existing middleware pattern in auth.ts
EXPECTED: Production-ready code with: TypeScript types, unit tests, integration test, configuration example, migration guide
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/development/02-implement-feature.txt) | Follow existing middleware patterns | No breaking changes | auto=FULL
" --tool codex --mode auto --timeout 1800000
```
**Bug Fix Task**:
```bash
ccw cli exec "
PURPOSE: Fix memory leak in WebSocket connection handler causing server OOM after 24h; root cause must be identified before any fix
TASK: • Trace connection lifecycle from open to close • Identify event listener accumulation • Check cleanup on disconnect • Verify garbage collection eligibility
MODE: analysis
CONTEXT: @src/websocket/**/* @src/services/connection-manager.ts | Memory: Using ws library, ~5000 concurrent connections in production
EXPECTED: Root cause analysis with: memory profile, leak source (file:line), fix recommendation with code, verification steps
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Focus on resource cleanup | analysis=READ-ONLY
" --tool gemini --cd src --timeout 900000
```
**Refactoring Task**:
```bash
ccw cli exec "
PURPOSE: Refactor payment processing to use strategy pattern for multi-gateway support; no functional changes; all existing tests must pass
TASK: • Extract gateway interface from current implementation • Create strategy classes for Stripe, PayPal • Implement factory for gateway selection • Migrate existing code to use strategies
MODE: write
CONTEXT: @src/payments/**/* @src/types/payment.ts | Memory: Currently only Stripe, adding PayPal next sprint, must support future gateways
EXPECTED: Refactored code with: strategy interface, concrete implementations, factory class, updated tests, migration checklist
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/development/02-refactor-codebase.txt) | Preserve all existing behavior | Tests must pass | write=CREATE/MODIFY/DELETE
" --tool gemini --mode write --timeout 1200000
```
---
## Configuration
### Timeout Allocation
**Minimum**: 5 minutes (300000ms)
| Complexity | Range | Examples |
|------------|-------|----------|
| Simple | 5-10min (300000-600000ms) | Analysis, search |
| Medium | 10-20min (600000-1200000ms) | Refactoring, documentation |
| Complex | 20-60min (1200000-3600000ms) | Implementation, migration |
| Heavy | 60-120min (3600000-7200000ms) | Large codebase, multi-file |
**Codex Multiplier**: 3x allocated time (minimum 15min / 900000ms)
```bash
ccw cli exec "<prompt>" --tool gemini --timeout 600000 # 10 min
ccw cli exec "<prompt>" --tool codex --timeout 1800000 # 30 min
```
### Permission Framework
**Single-Use Authorization**: Each execution requires explicit user instruction. Previous authorization does NOT carry over.
**Mode Hierarchy**:
- `analysis` (default): Read-only, safe for auto-execution
- `write`: Requires explicit `--mode write`
- `auto`: Requires explicit `--mode auto`
- **Exception**: User provides clear instructions like "modify", "create", "implement"
---
## Best Practices
### Workflow Principles
- **Use CCW unified interface** for all executions
- **Always include template** - Use Task-Template Matrix or universal fallback
- **Be specific** - Clear PURPOSE, TASK, EXPECTED fields
- **Include constraints** - File patterns, scope in RULES
- **Leverage memory context** when building on previous work
- **Discover patterns first** - Use rg/MCP before CLI execution
- **Default to full context** - Use `@**/*` unless specific files needed
### Workflow Integration
| Phase | Command |
|-------|---------|
| Understanding | `ccw cli exec "<prompt>" --tool gemini` |
| Architecture | `ccw cli exec "<prompt>" --tool gemini` |
| Implementation | `ccw cli exec "<prompt>" --tool codex --mode auto` |
| Quality | `ccw cli exec "<prompt>" --tool codex --mode write` |
### Planning Checklist
- [ ] **Purpose defined** - Clear goal and intent
- [ ] **Mode selected** - `--mode analysis|write|auto`
- [ ] **Context gathered** - File references + memory (default `@**/*`)
- [ ] **Directory navigation** - `--cd` and/or `--includeDirs`
- [ ] **Tool selected** - `--tool gemini|qwen|codex`
- [ ] **Template applied (REQUIRED)** - Use specific or universal fallback template
- [ ] **Constraints specified** - Scope, requirements
- [ ] **Timeout configured** - Based on complexity

View File

@@ -734,7 +734,7 @@ Return ONLY valid JSON in this exact format (no markdown, no code blocks, just p
try {
const configPath = join(projectPath, '.claude', 'rules', 'active_memory.md');
const configJsonPath = join(projectPath, '.claude', 'rules', 'active_memory_config.json');
const configJsonPath = join(projectPath, '.claude', 'active_memory_config.json');
const enabled = existsSync(configPath);
let lastSync: string | null = null;
let fileCount = 0;
@@ -785,14 +785,18 @@ Return ONLY valid JSON in this exact format (no markdown, no code blocks, just p
}
const rulesDir = join(projectPath, '.claude', 'rules');
const claudeDir = join(projectPath, '.claude');
const configPath = join(rulesDir, 'active_memory.md');
const configJsonPath = join(rulesDir, 'active_memory_config.json');
const configJsonPath = join(claudeDir, 'active_memory_config.json');
if (enabled) {
// Enable: Create directory and initial file
// Enable: Create directories and initial file
if (!existsSync(rulesDir)) {
mkdirSync(rulesDir, { recursive: true });
}
if (!existsSync(claudeDir)) {
mkdirSync(claudeDir, { recursive: true });
}
// Save config
if (config) {
@@ -844,11 +848,11 @@ Return ONLY valid JSON in this exact format (no markdown, no code blocks, just p
try {
const { config } = JSON.parse(body || '{}');
const projectPath = initialPath;
const rulesDir = join(projectPath, '.claude', 'rules');
const configJsonPath = join(rulesDir, 'active_memory_config.json');
const claudeDir = join(projectPath, '.claude');
const configJsonPath = join(claudeDir, 'active_memory_config.json');
if (!existsSync(rulesDir)) {
mkdirSync(rulesDir, { recursive: true });
if (!existsSync(claudeDir)) {
mkdirSync(claudeDir, { recursive: true });
}
writeFileSync(configJsonPath, JSON.stringify(config, null, 2), 'utf-8');
@@ -938,7 +942,10 @@ RULES: Be concise. Focus on practical understanding. Include function signatures
});
if (result.success && result.execution?.output) {
cliOutput = result.execution.output;
// Extract stdout from output object
cliOutput = typeof result.execution.output === 'string'
? result.execution.output
: result.execution.output.stdout || '';
}
// Add CLI output to content
@@ -1007,6 +1014,18 @@ RULES: Be concise. Focus on practical understanding. Include function signatures
// Write the file
writeFileSync(configPath, content, 'utf-8');
// Broadcast Active Memory sync completion event
broadcastToClients({
type: 'ACTIVE_MEMORY_SYNCED',
payload: {
filesAnalyzed: hotFiles.length,
path: configPath,
tool,
usedCli: cliOutput.length > 0,
timestamp: new Date().toISOString()
}
});
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({
success: true,

View File

@@ -3757,3 +3757,205 @@
.btn-ghost.text-destructive:hover {
background: hsl(var(--destructive) / 0.1);
}
/* ========================================
* Semantic Metadata Viewer Styles
* ======================================== */
.semantic-viewer-toolbar {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.75rem 1rem;
background: hsl(var(--muted) / 0.3);
border-bottom: 1px solid hsl(var(--border));
}
.semantic-table-container {
max-height: 400px;
overflow-y: auto;
}
.semantic-table {
width: 100%;
border-collapse: collapse;
font-size: 0.8125rem;
}
.semantic-table th {
position: sticky;
top: 0;
background: hsl(var(--card));
padding: 0.625rem 0.75rem;
text-align: left;
font-weight: 600;
font-size: 0.75rem;
color: hsl(var(--muted-foreground));
border-bottom: 1px solid hsl(var(--border));
white-space: nowrap;
}
.semantic-table td {
padding: 0.625rem 0.75rem;
border-bottom: 1px solid hsl(var(--border) / 0.5);
vertical-align: top;
}
.semantic-row {
cursor: pointer;
transition: background 0.15s ease;
}
.semantic-row:hover {
background: hsl(var(--hover));
}
.semantic-cell-file {
max-width: 200px;
}
.semantic-cell-lang {
width: 80px;
color: hsl(var(--muted-foreground));
}
.semantic-cell-purpose {
max-width: 180px;
color: hsl(var(--foreground) / 0.8);
}
.semantic-cell-keywords {
max-width: 160px;
}
.semantic-cell-tool {
width: 70px;
}
.semantic-cell-date {
width: 80px;
color: hsl(var(--muted-foreground));
font-size: 0.75rem;
}
.semantic-keyword {
display: inline-block;
padding: 0.125rem 0.375rem;
margin: 0.125rem;
background: hsl(var(--primary) / 0.1);
color: hsl(var(--primary));
border-radius: 0.25rem;
font-size: 0.6875rem;
}
.semantic-keyword-more {
display: inline-block;
padding: 0.125rem 0.375rem;
margin: 0.125rem;
background: hsl(var(--muted));
color: hsl(var(--muted-foreground));
border-radius: 0.25rem;
font-size: 0.6875rem;
}
.tool-badge {
display: inline-block;
padding: 0.125rem 0.5rem;
border-radius: 0.25rem;
font-size: 0.6875rem;
font-weight: 500;
text-transform: capitalize;
}
.tool-badge.tool-gemini {
background: hsl(210 80% 55% / 0.15);
color: hsl(210 80% 45%);
}
.tool-badge.tool-qwen {
background: hsl(142 76% 36% / 0.15);
color: hsl(142 76% 36%);
}
.tool-badge.tool-unknown {
background: hsl(var(--muted));
color: hsl(var(--muted-foreground));
}
.semantic-detail-row {
background: hsl(var(--muted) / 0.2);
}
.semantic-detail-row.hidden {
display: none;
}
.semantic-detail-content {
padding: 1rem;
}
.semantic-detail-section {
margin-bottom: 1rem;
}
.semantic-detail-section h4 {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 0.75rem;
font-weight: 600;
color: hsl(var(--muted-foreground));
margin-bottom: 0.5rem;
text-transform: uppercase;
letter-spacing: 0.05em;
}
.semantic-detail-section p {
font-size: 0.8125rem;
line-height: 1.5;
color: hsl(var(--foreground));
}
.semantic-keywords-full {
display: flex;
flex-wrap: wrap;
gap: 0.25rem;
}
.semantic-detail-meta {
display: flex;
gap: 1rem;
padding-top: 0.75rem;
border-top: 1px solid hsl(var(--border) / 0.5);
font-size: 0.75rem;
color: hsl(var(--muted-foreground));
}
.semantic-detail-meta span {
display: flex;
align-items: center;
gap: 0.375rem;
}
.semantic-viewer-footer {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.75rem 1rem;
background: hsl(var(--muted) / 0.3);
border-top: 1px solid hsl(var(--border));
}
.semantic-loading,
.semantic-empty {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
padding: 3rem;
text-align: center;
color: hsl(var(--muted-foreground));
}
.semantic-loading {
gap: 1rem;
}

View File

@@ -2097,7 +2097,7 @@
position: fixed;
top: 0;
right: 0;
width: 480px;
width: 50vw;
max-width: 100vw;
height: 100vh;
background: hsl(var(--card));
@@ -2132,7 +2132,6 @@
justify-content: space-between;
padding: 1rem 1.25rem;
border-bottom: 1px solid hsl(var(--border));
background: hsl(var(--muted) / 0.3);
}
.insight-detail-header h3 {

View File

@@ -238,6 +238,31 @@ function handleNotification(data) {
}
break;
case 'ACTIVE_MEMORY_SYNCED':
// Handle Active Memory sync completion
if (typeof addGlobalNotification === 'function') {
const { filesAnalyzed, tool, usedCli } = payload;
const method = usedCli ? `CLI (${tool})` : 'Basic';
addGlobalNotification(
'success',
'Active Memory synced',
{
'Files Analyzed': filesAnalyzed,
'Method': method,
'Timestamp': new Date(payload.timestamp).toLocaleTimeString()
},
'Memory'
);
}
// Refresh Active Memory status if on memory view
if (getCurrentView && getCurrentView() === 'memory') {
if (typeof loadActiveMemoryStatus === 'function') {
loadActiveMemoryStatus();
}
}
console.log('[Active Memory] Sync completed:', payload);
break;
default:
console.log('[WS] Unknown notification type:', type);
}

View File

@@ -1123,11 +1123,11 @@ def semantic_list(
registry.initialize()
mapper = PathMapper()
project_info = registry.find_project(base_path)
project_info = registry.get_project(base_path)
if not project_info:
raise CodexLensError(f"No index found for: {base_path}. Run 'codex-lens init' first.")
index_dir = mapper.source_to_index_dir(base_path)
index_dir = Path(project_info.index_root)
if not index_dir.exists():
raise CodexLensError(f"Index directory not found: {index_dir}")

View File

@@ -375,6 +375,7 @@ class DirIndexStore:
keywords_json = json.dumps(keywords)
generated_at = time.time()
# Write to semantic_metadata table (for backward compatibility)
conn.execute(
"""
INSERT INTO semantic_metadata(file_id, summary, keywords, purpose, llm_tool, generated_at)
@@ -388,6 +389,37 @@ class DirIndexStore:
""",
(file_id, summary, keywords_json, purpose, llm_tool, generated_at),
)
# Write to normalized keywords tables for optimized search
# First, remove existing keyword associations
conn.execute("DELETE FROM file_keywords WHERE file_id = ?", (file_id,))
# Then add new keywords
for keyword in keywords:
keyword = keyword.strip()
if not keyword:
continue
# Insert keyword if it doesn't exist
conn.execute(
"INSERT OR IGNORE INTO keywords(keyword) VALUES(?)",
(keyword,)
)
# Get keyword_id
row = conn.execute(
"SELECT id FROM keywords WHERE keyword = ?",
(keyword,)
).fetchone()
if row:
keyword_id = row["id"]
# Link file to keyword
conn.execute(
"INSERT OR IGNORE INTO file_keywords(file_id, keyword_id) VALUES(?, ?)",
(file_id, keyword_id)
)
conn.commit()
def get_semantic_metadata(self, file_id: int) -> Optional[Dict[str, Any]]:
@@ -454,11 +486,12 @@ class DirIndexStore:
for row in rows
]
def search_semantic_keywords(self, keyword: str) -> List[Tuple[FileEntry, List[str]]]:
def search_semantic_keywords(self, keyword: str, use_normalized: bool = True) -> List[Tuple[FileEntry, List[str]]]:
"""Search files by semantic keywords.
Args:
keyword: Keyword to search for (case-insensitive)
use_normalized: Use optimized normalized tables (default: True)
Returns:
List of (FileEntry, keywords) tuples where keyword matches
@@ -466,35 +499,71 @@ class DirIndexStore:
with self._lock:
conn = self._get_connection()
keyword_pattern = f"%{keyword}%"
if use_normalized:
# Optimized query using normalized tables with indexed lookup
# Use prefix search (keyword%) for better index utilization
keyword_pattern = f"{keyword}%"
rows = conn.execute(
"""
SELECT f.id, f.name, f.full_path, f.language, f.mtime, f.line_count, sm.keywords
FROM files f
JOIN semantic_metadata sm ON f.id = sm.file_id
WHERE sm.keywords LIKE ? COLLATE NOCASE
ORDER BY f.name
""",
(keyword_pattern,),
).fetchall()
rows = conn.execute(
"""
SELECT f.id, f.name, f.full_path, f.language, f.mtime, f.line_count,
GROUP_CONCAT(k.keyword, ',') as keywords
FROM files f
JOIN file_keywords fk ON f.id = fk.file_id
JOIN keywords k ON fk.keyword_id = k.id
WHERE k.keyword LIKE ? COLLATE NOCASE
GROUP BY f.id, f.name, f.full_path, f.language, f.mtime, f.line_count
ORDER BY f.name
""",
(keyword_pattern,),
).fetchall()
import json
results = []
for row in rows:
file_entry = FileEntry(
id=int(row["id"]),
name=row["name"],
full_path=Path(row["full_path"]),
language=row["language"],
mtime=float(row["mtime"]) if row["mtime"] else 0.0,
line_count=int(row["line_count"]) if row["line_count"] else 0,
)
keywords = row["keywords"].split(',') if row["keywords"] else []
results.append((file_entry, keywords))
results = []
for row in rows:
file_entry = FileEntry(
id=int(row["id"]),
name=row["name"],
full_path=Path(row["full_path"]),
language=row["language"],
mtime=float(row["mtime"]) if row["mtime"] else 0.0,
line_count=int(row["line_count"]) if row["line_count"] else 0,
)
keywords = json.loads(row["keywords"]) if row["keywords"] else []
results.append((file_entry, keywords))
return results
return results
else:
# Fallback to original query for backward compatibility
keyword_pattern = f"%{keyword}%"
rows = conn.execute(
"""
SELECT f.id, f.name, f.full_path, f.language, f.mtime, f.line_count, sm.keywords
FROM files f
JOIN semantic_metadata sm ON f.id = sm.file_id
WHERE sm.keywords LIKE ? COLLATE NOCASE
ORDER BY f.name
""",
(keyword_pattern,),
).fetchall()
import json
results = []
for row in rows:
file_entry = FileEntry(
id=int(row["id"]),
name=row["name"],
full_path=Path(row["full_path"]),
language=row["language"],
mtime=float(row["mtime"]) if row["mtime"] else 0.0,
line_count=int(row["line_count"]) if row["line_count"] else 0,
)
keywords = json.loads(row["keywords"]) if row["keywords"] else []
results.append((file_entry, keywords))
return results
def list_semantic_metadata(
self,
@@ -794,19 +863,26 @@ class DirIndexStore:
return [row["full_path"] for row in rows]
def search_symbols(
self, name: str, kind: Optional[str] = None, limit: int = 50
self, name: str, kind: Optional[str] = None, limit: int = 50, prefix_mode: bool = True
) -> List[Symbol]:
"""Search symbols by name pattern.
Args:
name: Symbol name pattern (LIKE query)
name: Symbol name pattern
kind: Optional symbol kind filter
limit: Maximum results to return
prefix_mode: If True, use prefix search (faster with index);
If False, use substring search (slower)
Returns:
List of Symbol objects
"""
pattern = f"%{name}%"
# Prefix search is much faster as it can use index
if prefix_mode:
pattern = f"{name}%"
else:
pattern = f"%{name}%"
with self._lock:
conn = self._get_connection()
if kind:
@@ -979,6 +1055,28 @@ class DirIndexStore:
"""
)
# Normalized keywords tables for performance
conn.execute(
"""
CREATE TABLE IF NOT EXISTS keywords (
id INTEGER PRIMARY KEY,
keyword TEXT NOT NULL UNIQUE
)
"""
)
conn.execute(
"""
CREATE TABLE IF NOT EXISTS file_keywords (
file_id INTEGER NOT NULL,
keyword_id INTEGER NOT NULL,
PRIMARY KEY (file_id, keyword_id),
FOREIGN KEY (file_id) REFERENCES files (id) ON DELETE CASCADE,
FOREIGN KEY (keyword_id) REFERENCES keywords (id) ON DELETE CASCADE
)
"""
)
# Indexes
conn.execute("CREATE INDEX IF NOT EXISTS idx_files_name ON files(name)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_files_path ON files(full_path)")
@@ -986,6 +1084,9 @@ class DirIndexStore:
conn.execute("CREATE INDEX IF NOT EXISTS idx_symbols_name ON symbols(name)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_symbols_file ON symbols(file_id)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_semantic_file ON semantic_metadata(file_id)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_keywords_keyword ON keywords(keyword)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_file_keywords_file_id ON file_keywords(file_id)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_file_keywords_keyword_id ON file_keywords(keyword_id)")
except sqlite3.DatabaseError as exc:
raise StorageError(f"Failed to create schema: {exc}") from exc

View File

@@ -0,0 +1,139 @@
"""
Manages database schema migrations.
This module provides a framework for applying versioned migrations to the SQLite
database. Migrations are discovered from the `codexlens.storage.migrations`
package and applied sequentially. The database schema version is tracked using
the `user_version` pragma.
"""
import importlib
import logging
import pkgutil
from pathlib import Path
from sqlite3 import Connection
from typing import List, NamedTuple
log = logging.getLogger(__name__)
class Migration(NamedTuple):
"""Represents a single database migration."""
version: int
name: str
upgrade: callable
def discover_migrations() -> List[Migration]:
"""
Discovers and returns a sorted list of database migrations.
Migrations are expected to be in the `codexlens.storage.migrations` package,
with filenames in the format `migration_XXX_description.py`, where XXX is
the version number. Each migration module must contain an `upgrade` function
that takes a `sqlite3.Connection` object as its argument.
Returns:
A list of Migration objects, sorted by version.
"""
import codexlens.storage.migrations
migrations = []
package_path = Path(codexlens.storage.migrations.__file__).parent
for _, name, _ in pkgutil.iter_modules([str(package_path)]):
if name.startswith("migration_"):
try:
version = int(name.split("_")[1])
module = importlib.import_module(f"codexlens.storage.migrations.{name}")
if hasattr(module, "upgrade"):
migrations.append(
Migration(version=version, name=name, upgrade=module.upgrade)
)
else:
log.warning(f"Migration {name} is missing 'upgrade' function.")
except (ValueError, IndexError) as e:
log.warning(f"Could not parse migration name {name}: {e}")
except ImportError as e:
log.warning(f"Could not import migration {name}: {e}")
migrations.sort(key=lambda m: m.version)
return migrations
class MigrationManager:
"""
Manages the application of migrations to a database.
"""
def __init__(self, db_conn: Connection):
"""
Initializes the MigrationManager.
Args:
db_conn: The SQLite database connection.
"""
self.db_conn = db_conn
self.migrations = discover_migrations()
def get_current_version(self) -> int:
"""
Gets the current version of the database schema.
Returns:
The current schema version number.
"""
return self.db_conn.execute("PRAGMA user_version").fetchone()[0]
def set_version(self, version: int):
"""
Sets the database schema version.
Args:
version: The version number to set.
"""
self.db_conn.execute(f"PRAGMA user_version = {version}")
log.info(f"Database schema version set to {version}")
def apply_migrations(self):
"""
Applies all pending migrations to the database.
This method checks the current database version and applies all
subsequent migrations in order. Each migration is applied within
a transaction.
"""
current_version = self.get_current_version()
log.info(f"Current database schema version: {current_version}")
for migration in self.migrations:
if migration.version > current_version:
log.info(f"Applying migration {migration.version}: {migration.name}...")
try:
self.db_conn.execute("BEGIN")
migration.upgrade(self.db_conn)
self.set_version(migration.version)
self.db_conn.execute("COMMIT")
log.info(
f"Successfully applied migration {migration.version}: {migration.name}"
)
except Exception as e:
log.error(
f"Failed to apply migration {migration.version}: {migration.name}. Rolling back. Error: {e}",
exc_info=True,
)
self.db_conn.execute("ROLLBACK")
raise
latest_migration_version = self.migrations[-1].version if self.migrations else 0
if current_version < latest_migration_version:
            # Sanity check: after the loop, the stored user_version should match the
            # latest known migration version; if it does not, the upgrade sequence was
            # interrupted somewhere, so log a warning rather than silently continuing.
final_version = self.get_current_version()
if final_version != latest_migration_version:
log.warning(f"Database version ({final_version}) is not the latest migration version ({latest_migration_version}). This may indicate a problem.")
log.info("All pending migrations applied successfully.")

View File

@@ -0,0 +1 @@
# This file makes the 'migrations' directory a Python package.

View File

@@ -0,0 +1,108 @@
"""
Migration 001: Normalize keywords into separate tables.
This migration introduces two new tables, `keywords` and `file_keywords`, to
store semantic keywords in a normalized fashion. It then migrates the existing
keywords from the JSON-encoded `keywords` column of the `semantic_metadata`
table into these new tables. This is intended to speed up keyword-based
searches significantly.
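
The kind of lookup the new schema enables (illustrative SQL only; this
migration itself does not run this query):

    SELECT f.*
    FROM files f
    JOIN file_keywords fk ON fk.file_id = f.id
    JOIN keywords k ON k.id = fk.keyword_id
    WHERE k.keyword = 'auth';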
"""
import json
import logging
from sqlite3 import Connection
log = logging.getLogger(__name__)
def upgrade(db_conn: Connection):
"""
Applies the migration to normalize keywords.
- Creates `keywords` and `file_keywords` tables.
- Creates indexes for efficient querying.
- Migrates data from `files.semantic_data` to the new tables.
Args:
db_conn: The SQLite database connection.
"""
cursor = db_conn.cursor()
log.info("Creating 'keywords' and 'file_keywords' tables...")
# Create a table to store unique keywords
cursor.execute(
"""
CREATE TABLE IF NOT EXISTS keywords (
id INTEGER PRIMARY KEY,
keyword TEXT NOT NULL UNIQUE
)
"""
)
# Create a join table to link files and keywords (many-to-many)
cursor.execute(
"""
CREATE TABLE IF NOT EXISTS file_keywords (
file_id INTEGER NOT NULL,
keyword_id INTEGER NOT NULL,
PRIMARY KEY (file_id, keyword_id),
FOREIGN KEY (file_id) REFERENCES files (id) ON DELETE CASCADE,
FOREIGN KEY (keyword_id) REFERENCES keywords (id) ON DELETE CASCADE
)
"""
)
log.info("Creating indexes for new keyword tables...")
cursor.execute("CREATE INDEX IF NOT EXISTS idx_keywords_keyword ON keywords (keyword)")
cursor.execute("CREATE INDEX IF NOT EXISTS idx_file_keywords_file_id ON file_keywords (file_id)")
cursor.execute("CREATE INDEX IF NOT EXISTS idx_file_keywords_keyword_id ON file_keywords (keyword_id)")
log.info("Migrating existing keywords from 'semantic_metadata' table...")
cursor.execute("SELECT file_id, keywords FROM semantic_metadata WHERE keywords IS NOT NULL AND keywords != ''")
files_to_migrate = cursor.fetchall()
if not files_to_migrate:
log.info("No existing files with semantic metadata to migrate.")
return
log.info(f"Found {len(files_to_migrate)} files with semantic metadata to migrate.")
for file_id, keywords_json in files_to_migrate:
if not keywords_json:
continue
try:
keywords = json.loads(keywords_json)
if not isinstance(keywords, list):
log.warning(f"Keywords for file_id {file_id} is not a list, skipping.")
continue
for keyword in keywords:
if not isinstance(keyword, str):
log.warning(f"Non-string keyword '{keyword}' found for file_id {file_id}, skipping.")
continue
keyword = keyword.strip()
if not keyword:
continue
# Get or create keyword_id
cursor.execute("INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (keyword,))
cursor.execute("SELECT id FROM keywords WHERE keyword = ?", (keyword,))
keyword_id_result = cursor.fetchone()
if keyword_id_result:
keyword_id = keyword_id_result[0]
# Link file to keyword
cursor.execute(
"INSERT OR IGNORE INTO file_keywords (file_id, keyword_id) VALUES (?, ?)",
(file_id, keyword_id),
)
else:
log.error(f"Failed to retrieve or create keyword_id for keyword: {keyword}")
except json.JSONDecodeError as e:
log.warning(f"Could not parse keywords for file_id {file_id}: {e}")
except Exception as e:
log.error(f"An unexpected error occurred during migration for file_id {file_id}: {e}", exc_info=True)
log.info("Finished migrating keywords.")

View File

@@ -424,6 +424,9 @@ class RegistryStore:
Searches for the closest parent directory that has an index.
Useful for supporting subdirectory searches.
Optimized to use a single database query instead of iterating through
each parent directory level.
Args:
source_path: Source directory or file path
@@ -434,23 +437,30 @@ class RegistryStore:
conn = self._get_connection()
source_path_resolved = source_path.resolve()
-    # Check from current path up to root
+    # Build list of all parent paths from deepest to shallowest
+    paths_to_check = []
     current = source_path_resolved
     while True:
-        current_str = str(current)
-        row = conn.execute(
-            "SELECT * FROM dir_mapping WHERE source_path=?", (current_str,)
-        ).fetchone()
-        if row:
-            return self._row_to_dir_mapping(row)
+        paths_to_check.append(str(current))
         parent = current.parent
         if parent == current:  # Reached filesystem root
             break
         current = parent
-    return None
+    if not paths_to_check:
+        return None
+    # Single query with WHERE IN, ordered by path length (longest = nearest)
+    placeholders = ','.join('?' * len(paths_to_check))
+    query = f"""
+        SELECT * FROM dir_mapping
+        WHERE source_path IN ({placeholders})
+        ORDER BY LENGTH(source_path) DESC
+        LIMIT 1
+    """
+    row = conn.execute(query, paths_to_check).fetchone()
+    return self._row_to_dir_mapping(row) if row else None
def get_project_dirs(self, project_id: int) -> List[DirMapping]:
"""Get all directory mappings for a project.

View File

@@ -0,0 +1,218 @@
"""
Simple validation for performance optimizations (Windows-safe).
"""
import sys
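# Reconfigure stdout to UTF-8 so output does not fail on Windows consoles that
# default to a legacy code page (assumed intent of the "Windows-safe" label).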
sys.stdout.reconfigure(encoding='utf-8')
import json
import sqlite3
import tempfile
import time
from pathlib import Path
from codexlens.storage.dir_index import DirIndexStore
from codexlens.storage.registry import RegistryStore
def main():
print("=" * 60)
print("CodexLens Performance Optimizations - Simple Validation")
print("=" * 60)
# Test 1: Keyword Normalization
print("\n[1/4] Testing Keyword Normalization...")
try:
tmpdir = tempfile.mkdtemp()
db_path = Path(tmpdir) / "test1.db"
store = DirIndexStore(db_path)
store.initialize()
file_id = store.add_file(
name="test.py",
full_path=Path(f"{tmpdir}/test.py"),
content="def hello(): pass",
language="python"
)
keywords = ["auth", "security", "jwt"]
store.add_semantic_metadata(
file_id=file_id,
summary="Test",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
# Check normalized tables
conn = store._get_connection()
count = conn.execute(
"SELECT COUNT(*) as c FROM file_keywords WHERE file_id=?",
(file_id,)
).fetchone()["c"]
store.close()
assert count == 3, f"Expected 3 keywords, got {count}"
print(" PASS: Keywords stored in normalized tables")
# Test optimized search
store = DirIndexStore(db_path)
results = store.search_semantic_keywords("auth", use_normalized=True)
store.close()
assert len(results) == 1
print(" PASS: Optimized keyword search works")
except Exception as e:
import traceback
print(f" FAIL: {e}")
traceback.print_exc()
return 1
# Test 2: Path Lookup Optimization
print("\n[2/4] Testing Path Lookup Optimization...")
try:
tmpdir = tempfile.mkdtemp()
db_path = Path(tmpdir) / "test2.db"
store = RegistryStore(db_path)
store.initialize() # Create schema
# Register a project first
project = store.register_project(
source_root=Path("/a"),
index_root=Path("/tmp")
)
# Register directory
store.register_dir(
project_id=project.id,
source_path=Path("/a/b/c"),
index_path=Path("/tmp/index.db"),
depth=2,
files_count=0
)
deep_path = Path("/a/b/c/d/e/f/g/h/i/j/file.py")
start = time.perf_counter()
result = store.find_nearest_index(deep_path)
elapsed = time.perf_counter() - start
store.close()
assert result is not None, "No result found"
# Path is normalized, just check it contains the key parts
assert "a" in str(result.source_path) and "b" in str(result.source_path) and "c" in str(result.source_path)
assert elapsed < 0.05, f"Too slow: {elapsed*1000:.2f}ms"
print(f" PASS: Found nearest index in {elapsed*1000:.2f}ms")
except Exception as e:
import traceback
print(f" FAIL: {e}")
traceback.print_exc()
return 1
# Test 3: Symbol Search Prefix Mode
print("\n[3/4] Testing Symbol Search Prefix Mode...")
try:
tmpdir = tempfile.mkdtemp()
db_path = Path(tmpdir) / "test3.db"
store = DirIndexStore(db_path)
store.initialize()
from codexlens.entities import Symbol
file_id = store.add_file(
name="test.py",
full_path=Path(f"{tmpdir}/test.py"),
content="def hello(): pass\n" * 10,
language="python",
symbols=[
Symbol(name="get_user", kind="function", range=(1, 5)),
Symbol(name="get_item", kind="function", range=(6, 10)),
Symbol(name="create_user", kind="function", range=(11, 15)),
]
)
# Prefix search
results = store.search_symbols("get", prefix_mode=True)
store.close()
assert len(results) == 2, f"Expected 2, got {len(results)}"
for symbol in results:
assert symbol.name.startswith("get")
print(f" PASS: Prefix search found {len(results)} symbols")
except Exception as e:
import traceback
print(f" FAIL: {e}")
traceback.print_exc()
return 1
# Test 4: Performance Comparison
print("\n[4/4] Testing Performance Comparison...")
try:
tmpdir = tempfile.mkdtemp()
db_path = Path(tmpdir) / "test4.db"
store = DirIndexStore(db_path)
store.initialize()
# Create 50 files with keywords
for i in range(50):
file_id = store.add_file(
name=f"file_{i}.py",
full_path=Path(f"{tmpdir}/file_{i}.py"),
content=f"def function_{i}(): pass",
language="python"
)
keywords = ["auth", "security"] if i % 2 == 0 else ["api", "endpoint"]
store.add_semantic_metadata(
file_id=file_id,
summary=f"File {i}",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
# Benchmark normalized
start = time.perf_counter()
for _ in range(5):
results_norm = store.search_semantic_keywords("auth", use_normalized=True)
norm_time = time.perf_counter() - start
# Benchmark fallback
start = time.perf_counter()
for _ in range(5):
results_fallback = store.search_semantic_keywords("auth", use_normalized=False)
fallback_time = time.perf_counter() - start
store.close()
assert len(results_norm) == len(results_fallback)
speedup = fallback_time / norm_time if norm_time > 0 else 1.0
print(f" Normalized: {norm_time*1000:.2f}ms (5 iterations)")
print(f" Fallback: {fallback_time*1000:.2f}ms (5 iterations)")
print(f" Speedup: {speedup:.2f}x")
print(" PASS: Performance test completed")
except Exception as e:
import traceback
print(f" FAIL: {e}")
traceback.print_exc()
return 1
print("\n" + "=" * 60)
print("ALL VALIDATION TESTS PASSED")
print("=" * 60)
return 0
if __name__ == "__main__":
sys.exit(main())

View File

@@ -0,0 +1,467 @@
"""Tests for performance optimizations in CodexLens storage.
This module tests the following optimizations:
1. Normalized keywords search (migration_001)
2. Optimized path lookup in registry
3. Prefix-mode symbol search
"""
import json
import sqlite3
import tempfile
import time
from pathlib import Path
import pytest
from codexlens.storage.dir_index import DirIndexStore
from codexlens.storage.registry import RegistryStore
from codexlens.storage.migration_manager import MigrationManager
from codexlens.storage.migrations import migration_001_normalize_keywords
@pytest.fixture
def temp_index_db():
"""Create a temporary dir index database."""
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_index.db"
store = DirIndexStore(db_path)
store.initialize() # Initialize schema
yield store
store.close()
@pytest.fixture
def temp_registry_db():
"""Create a temporary registry database."""
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_registry.db"
store = RegistryStore(db_path)
store.initialize() # Initialize schema
yield store
store.close()
@pytest.fixture
def populated_index_db(temp_index_db):
"""Create an index database with sample data.
Uses 100 files to provide meaningful performance comparison between
optimized and fallback implementations.
"""
from codexlens.entities import Symbol
store = temp_index_db
# Add files with symbols and keywords
# Using 100 files to show performance improvements
file_ids = []
# Define keyword pools for cycling
keyword_pools = [
["auth", "security", "jwt"],
["database", "sql", "query"],
["auth", "login", "password"],
["api", "rest", "endpoint"],
["cache", "redis", "performance"],
["auth", "oauth", "token"],
["test", "unittest", "pytest"],
["database", "postgres", "migration"],
["api", "graphql", "resolver"],
["security", "encryption", "crypto"]
]
for i in range(100):
# Create symbols for first 50 files to have more symbol search data
symbols = None
if i < 50:
symbols = [
Symbol(name=f"get_user_{i}", kind="function", range=(1, 10)),
Symbol(name=f"create_user_{i}", kind="function", range=(11, 20)),
Symbol(name=f"UserClass_{i}", kind="class", range=(21, 40)),
]
file_id = store.add_file(
name=f"file_{i}.py",
full_path=Path(f"/test/path/file_{i}.py"),
content=f"def function_{i}(): pass\n" * 10,
language="python",
symbols=symbols
)
file_ids.append(file_id)
# Add semantic metadata with keywords (cycle through keyword pools)
keywords = keyword_pools[i % len(keyword_pools)]
store.add_semantic_metadata(
file_id=file_id,
summary=f"Test file {file_id}",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
return store
class TestKeywordNormalization:
"""Test normalized keywords functionality."""
def test_migration_creates_tables(self, temp_index_db):
"""Test that migration creates keywords and file_keywords tables."""
conn = temp_index_db._get_connection()
# Verify tables exist (created by _create_schema)
tables = conn.execute("""
SELECT name FROM sqlite_master
WHERE type='table' AND name IN ('keywords', 'file_keywords')
""").fetchall()
assert len(tables) == 2
def test_migration_creates_indexes(self, temp_index_db):
"""Test that migration creates necessary indexes."""
conn = temp_index_db._get_connection()
# Check for indexes
indexes = conn.execute("""
SELECT name FROM sqlite_master
WHERE type='index' AND name IN (
'idx_keywords_keyword',
'idx_file_keywords_file_id',
'idx_file_keywords_keyword_id'
)
""").fetchall()
assert len(indexes) == 3
def test_add_semantic_metadata_populates_normalized_tables(self, temp_index_db):
"""Test that adding metadata populates both old and new tables."""
# Add a file
file_id = temp_index_db.add_file(
name="test.py",
full_path=Path("/test/test.py"),
language="python",
content="test"
)
# Add semantic metadata
keywords = ["auth", "security", "jwt"]
temp_index_db.add_semantic_metadata(
file_id=file_id,
summary="Test summary",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
conn = temp_index_db._get_connection()
# Check semantic_metadata table (backward compatibility)
row = conn.execute(
"SELECT keywords FROM semantic_metadata WHERE file_id=?",
(file_id,)
).fetchone()
assert row is not None
assert json.loads(row["keywords"]) == keywords
# Check normalized keywords table
keyword_rows = conn.execute("""
SELECT k.keyword
FROM file_keywords fk
JOIN keywords k ON fk.keyword_id = k.id
WHERE fk.file_id = ?
""", (file_id,)).fetchall()
assert len(keyword_rows) == 3
normalized_keywords = [row["keyword"] for row in keyword_rows]
assert set(normalized_keywords) == set(keywords)
def test_search_semantic_keywords_normalized(self, populated_index_db):
"""Test optimized keyword search using normalized tables."""
results = populated_index_db.search_semantic_keywords("auth", use_normalized=True)
# Should find 3 files with "auth" keyword
assert len(results) >= 3
# Verify results structure
for file_entry, keywords in results:
assert file_entry.name.startswith("file_")
assert isinstance(keywords, list)
assert any("auth" in k.lower() for k in keywords)
def test_search_semantic_keywords_fallback(self, populated_index_db):
"""Test that fallback search still works."""
results = populated_index_db.search_semantic_keywords("auth", use_normalized=False)
# Should find files with "auth" keyword
assert len(results) >= 3
for file_entry, keywords in results:
assert isinstance(keywords, list)
class TestPathLookupOptimization:
"""Test optimized path lookup in registry."""
def test_find_nearest_index_shallow(self, temp_registry_db):
"""Test path lookup with shallow directory structure."""
# Register a project first
project = temp_registry_db.register_project(
source_root=Path("/test"),
index_root=Path("/tmp")
)
# Register directory mapping
temp_registry_db.register_dir(
project_id=project.id,
source_path=Path("/test"),
index_path=Path("/tmp/index.db"),
depth=0,
files_count=0
)
# Search for subdirectory
result = temp_registry_db.find_nearest_index(Path("/test/subdir/file.py"))
assert result is not None
# Compare as strings for cross-platform compatibility
assert "/test" in str(result.source_path) or "\\test" in str(result.source_path)
def test_find_nearest_index_deep(self, temp_registry_db):
"""Test path lookup with deep directory structure."""
# Register a project
project = temp_registry_db.register_project(
source_root=Path("/a"),
index_root=Path("/tmp")
)
# Add directory mappings at different levels
temp_registry_db.register_dir(
project_id=project.id,
source_path=Path("/a"),
index_path=Path("/tmp/index_a.db"),
depth=0,
files_count=0
)
temp_registry_db.register_dir(
project_id=project.id,
source_path=Path("/a/b/c"),
index_path=Path("/tmp/index_abc.db"),
depth=2,
files_count=0
)
# Should find nearest (longest) match
result = temp_registry_db.find_nearest_index(Path("/a/b/c/d/e/f/file.py"))
assert result is not None
# Check that path contains the key parts
result_path = str(result.source_path)
assert "a" in result_path and "b" in result_path and "c" in result_path
def test_find_nearest_index_not_found(self, temp_registry_db):
"""Test path lookup when no mapping exists."""
result = temp_registry_db.find_nearest_index(Path("/nonexistent/path"))
assert result is None
def test_find_nearest_index_performance(self, temp_registry_db):
"""Basic performance test for path lookup."""
# Register a project
project = temp_registry_db.register_project(
source_root=Path("/root"),
index_root=Path("/tmp")
)
# Add mapping at root
temp_registry_db.register_dir(
project_id=project.id,
source_path=Path("/root"),
index_path=Path("/tmp/index.db"),
depth=0,
files_count=0
)
# Test with very deep path (10 levels)
deep_path = Path("/root/a/b/c/d/e/f/g/h/i/j/file.py")
start = time.perf_counter()
result = temp_registry_db.find_nearest_index(deep_path)
elapsed = time.perf_counter() - start
# Should complete quickly (< 50ms even on slow systems)
assert elapsed < 0.05
assert result is not None
class TestSymbolSearchOptimization:
"""Test optimized symbol search."""
def test_symbol_search_prefix_mode(self, populated_index_db):
"""Test symbol search with prefix mode."""
results = populated_index_db.search_symbols("get", prefix_mode=True)
# Should find symbols starting with "get"
assert len(results) > 0
for symbol in results:
assert symbol.name.startswith("get")
def test_symbol_search_substring_mode(self, populated_index_db):
"""Test symbol search with substring mode."""
results = populated_index_db.search_symbols("user", prefix_mode=False)
# Should find symbols containing "user"
assert len(results) > 0
for symbol in results:
assert "user" in symbol.name.lower()
def test_symbol_search_with_kind_filter(self, populated_index_db):
"""Test symbol search with kind filter."""
results = populated_index_db.search_symbols(
"UserClass",
kind="class",
prefix_mode=True
)
# Should find only class symbols
assert len(results) > 0
for symbol in results:
assert symbol.kind == "class"
def test_symbol_search_limit(self, populated_index_db):
"""Test symbol search respects limit."""
results = populated_index_db.search_symbols("", prefix_mode=True, limit=5)
# Should return at most 5 results
assert len(results) <= 5
class TestMigrationManager:
"""Test migration manager functionality."""
def test_migration_manager_tracks_version(self, temp_index_db):
"""Test that migration manager tracks schema version."""
conn = temp_index_db._get_connection()
manager = MigrationManager(conn)
current_version = manager.get_current_version()
assert current_version >= 0
def test_migration_001_can_run(self, temp_index_db):
"""Test that migration_001 can be applied."""
conn = temp_index_db._get_connection()
# Add some test data to semantic_metadata first
conn.execute("""
INSERT INTO files(id, name, full_path, language, content, mtime, line_count)
VALUES(100, 'test.py', '/test_migration.py', 'python', 'def test(): pass', 0, 10)
""")
conn.execute("""
INSERT INTO semantic_metadata(file_id, keywords)
VALUES(100, ?)
""", (json.dumps(["test", "keyword"]),))
conn.commit()
# Run migration (should be idempotent, tables already created by initialize())
try:
migration_001_normalize_keywords.upgrade(conn)
success = True
except Exception as e:
success = False
print(f"Migration failed: {e}")
assert success
# Verify data was migrated
keyword_count = conn.execute("""
SELECT COUNT(*) as c FROM file_keywords WHERE file_id=100
""").fetchone()["c"]
assert keyword_count == 2 # "test" and "keyword"
class TestPerformanceComparison:
"""Compare performance of old vs new implementations."""
def test_keyword_search_performance(self, populated_index_db):
"""Compare keyword search performance.
IMPORTANT: The normalized query optimization is designed for large datasets
(1000+ files). On small datasets (< 1000 files), the overhead of JOINs and
GROUP BY operations can make the normalized query slower than the simple
LIKE query on JSON fields. This is expected behavior.
Performance benefits appear when:
- Dataset size > 1000 files
- Full-table scans on JSON LIKE become the bottleneck
- Index-based lookups provide O(log N) complexity advantage
"""
# Normalized search
start = time.perf_counter()
normalized_results = populated_index_db.search_semantic_keywords(
"auth",
use_normalized=True
)
normalized_time = time.perf_counter() - start
# Fallback search
start = time.perf_counter()
fallback_results = populated_index_db.search_semantic_keywords(
"auth",
use_normalized=False
)
fallback_time = time.perf_counter() - start
# Verify correctness: both queries should return identical results
assert len(normalized_results) == len(fallback_results)
# Verify result content matches
normalized_files = {entry.id for entry, _ in normalized_results}
fallback_files = {entry.id for entry, _ in fallback_results}
assert normalized_files == fallback_files, "Both queries must return same files"
# Document performance characteristics (no strict assertion)
# On datasets < 1000 files, normalized may be slower due to JOIN overhead
print(f"\nKeyword search performance (100 files):")
print(f" Normalized: {normalized_time*1000:.3f}ms")
print(f" Fallback: {fallback_time*1000:.3f}ms")
print(f" Ratio: {normalized_time/fallback_time:.2f}x")
print(f" Note: Performance benefits appear with 1000+ files")
def test_prefix_vs_substring_symbol_search(self, populated_index_db):
"""Compare prefix vs substring symbol search performance.
IMPORTANT: Prefix search optimization (LIKE 'prefix%') benefits from B-tree
indexes, but on small datasets (< 1000 symbols), the performance difference
may not be measurable or may even be slower due to query planner overhead.
Performance benefits appear when:
- Symbol count > 1000
- Index-based prefix search provides O(log N) advantage
- Full table scans with LIKE '%substring%' become bottleneck
"""
# Prefix search (optimized)
start = time.perf_counter()
prefix_results = populated_index_db.search_symbols("get", prefix_mode=True)
prefix_time = time.perf_counter() - start
# Substring search (fallback)
start = time.perf_counter()
substring_results = populated_index_db.search_symbols("get", prefix_mode=False)
substring_time = time.perf_counter() - start
# Verify correctness: prefix results should be subset of substring results
prefix_names = {s.name for s in prefix_results}
substring_names = {s.name for s in substring_results}
assert prefix_names.issubset(substring_names), "Prefix must be subset of substring"
# Verify all prefix results actually start with search term
for symbol in prefix_results:
assert symbol.name.startswith("get"), f"Symbol {symbol.name} should start with 'get'"
# Document performance characteristics (no strict assertion)
# On datasets < 1000 symbols, performance difference is negligible
print(f"\nSymbol search performance (150 symbols):")
print(f" Prefix: {prefix_time*1000:.3f}ms ({len(prefix_results)} results)")
print(f" Substring: {substring_time*1000:.3f}ms ({len(substring_results)} results)")
print(f" Ratio: {prefix_time/substring_time:.2f}x")
print(f" Note: Performance benefits appear with 1000+ symbols")

View File

@@ -0,0 +1,287 @@
"""
Manual validation script for performance optimizations.
This script verifies that the optimization implementations are working correctly.
Run with: python tests/validate_optimizations.py
"""
import json
import sqlite3
import tempfile
import time
from pathlib import Path
from codexlens.storage.dir_index import DirIndexStore
from codexlens.storage.registry import RegistryStore
from codexlens.storage.migration_manager import MigrationManager
from codexlens.storage.migrations import migration_001_normalize_keywords
def test_keyword_normalization():
"""Test normalized keywords functionality."""
print("\n=== Testing Keyword Normalization ===")
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_index.db"
store = DirIndexStore(db_path)
store.initialize() # Create schema
# Add a test file
# Note: add_file automatically calculates mtime and line_count
file_id = store.add_file(
name="test.py",
full_path=Path("/test/test.py"),
content="def hello(): pass",
language="python"
)
# Add semantic metadata with keywords
keywords = ["auth", "security", "jwt"]
store.add_semantic_metadata(
file_id=file_id,
summary="Test summary",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
conn = store._get_connection()
# Verify keywords table populated
keyword_rows = conn.execute("""
SELECT k.keyword
FROM file_keywords fk
JOIN keywords k ON fk.keyword_id = k.id
WHERE fk.file_id = ?
""", (file_id,)).fetchall()
normalized_keywords = [row["keyword"] for row in keyword_rows]
print(f"✓ Keywords stored in normalized tables: {normalized_keywords}")
assert set(normalized_keywords) == set(keywords), "Keywords mismatch!"
# Test optimized search
results = store.search_semantic_keywords("auth", use_normalized=True)
print(f"✓ Found {len(results)} file(s) with keyword 'auth'")
assert len(results) > 0, "No results found!"
# Test fallback search
results_fallback = store.search_semantic_keywords("auth", use_normalized=False)
print(f"✓ Fallback search found {len(results_fallback)} file(s)")
assert len(results) == len(results_fallback), "Result count mismatch!"
store.close()
print("✓ Keyword normalization tests PASSED")
def test_path_lookup_optimization():
"""Test optimized path lookup."""
print("\n=== Testing Path Lookup Optimization ===")
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_registry.db"
store = RegistryStore(db_path)
# Add directory mapping
store.add_dir_mapping(
source_path=Path("/a/b/c"),
index_path=Path("/tmp/index.db"),
project_id=None
)
# Test deep path lookup
deep_path = Path("/a/b/c/d/e/f/g/h/i/j/file.py")
start = time.perf_counter()
result = store.find_nearest_index(deep_path)
elapsed = time.perf_counter() - start
print(f"✓ Found nearest index in {elapsed*1000:.2f}ms")
assert result is not None, "No result found!"
assert result.source_path == Path("/a/b/c"), "Wrong path found!"
assert elapsed < 0.05, f"Too slow: {elapsed*1000:.2f}ms"
store.close()
print("✓ Path lookup optimization tests PASSED")
def test_symbol_search_prefix_mode():
"""Test symbol search with prefix mode."""
print("\n=== Testing Symbol Search Prefix Mode ===")
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_index.db"
store = DirIndexStore(db_path)
store.initialize() # Create schema
# Add a test file
file_id = store.add_file(
name="test.py",
full_path=Path("/test/test.py"),
content="def hello(): pass\n" * 10, # 10 lines
language="python"
)
# Add symbols
store.add_symbols(
file_id=file_id,
symbols=[
("get_user", "function", 1, 5),
("get_item", "function", 6, 10),
("create_user", "function", 11, 15),
("UserClass", "class", 16, 25),
]
)
# Test prefix search
results = store.search_symbols("get", prefix_mode=True)
print(f"✓ Prefix search for 'get' found {len(results)} symbol(s)")
assert len(results) == 2, f"Expected 2 symbols, got {len(results)}"
for symbol in results:
assert symbol.name.startswith("get"), f"Symbol {symbol.name} doesn't start with 'get'"
print(f" Symbols: {[s.name for s in results]}")
# Test substring search
results_sub = store.search_symbols("user", prefix_mode=False)
print(f"✓ Substring search for 'user' found {len(results_sub)} symbol(s)")
assert len(results_sub) == 3, f"Expected 3 symbols, got {len(results_sub)}"
print(f" Symbols: {[s.name for s in results_sub]}")
store.close()
print("✓ Symbol search optimization tests PASSED")
def test_migration_001():
"""Test migration_001 execution."""
print("\n=== Testing Migration 001 ===")
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_index.db"
store = DirIndexStore(db_path)
store.initialize() # Create schema
conn = store._get_connection()
# Add test data to semantic_metadata
conn.execute("""
INSERT INTO files(id, name, full_path, language, mtime, line_count)
VALUES(1, 'test.py', '/test.py', 'python', 0, 10)
""")
conn.execute("""
INSERT INTO semantic_metadata(file_id, keywords)
VALUES(1, ?)
""", (json.dumps(["test", "migration", "keyword"]),))
conn.commit()
# Run migration
print(" Running migration_001...")
migration_001_normalize_keywords.upgrade(conn)
print(" Migration completed successfully")
# Verify migration results
keyword_count = conn.execute("""
SELECT COUNT(*) as c FROM file_keywords WHERE file_id=1
""").fetchone()["c"]
print(f"✓ Migrated {keyword_count} keywords for file_id=1")
assert keyword_count == 3, f"Expected 3 keywords, got {keyword_count}"
# Verify keywords table
keywords = conn.execute("""
SELECT k.keyword FROM keywords k
JOIN file_keywords fk ON k.id = fk.keyword_id
WHERE fk.file_id = 1
""").fetchall()
keyword_list = [row["keyword"] for row in keywords]
print(f" Keywords: {keyword_list}")
store.close()
print("✓ Migration 001 tests PASSED")
def test_performance_comparison():
"""Compare performance of optimized vs fallback implementations."""
print("\n=== Performance Comparison ===")
with tempfile.TemporaryDirectory() as tmpdir:
db_path = Path(tmpdir) / "test_index.db"
store = DirIndexStore(db_path)
store.initialize() # Create schema
# Create test data
print(" Creating test data...")
for i in range(100):
file_id = store.add_file(
name=f"file_{i}.py",
full_path=Path(f"/test/file_{i}.py"),
content=f"def function_{i}(): pass",
language="python"
)
# Vary keywords
if i % 3 == 0:
keywords = ["auth", "security"]
elif i % 3 == 1:
keywords = ["database", "query"]
else:
keywords = ["api", "endpoint"]
store.add_semantic_metadata(
file_id=file_id,
summary=f"File {i}",
keywords=keywords,
purpose="Testing",
llm_tool="gemini"
)
# Benchmark normalized search
print(" Benchmarking normalized search...")
start = time.perf_counter()
for _ in range(10):
results_norm = store.search_semantic_keywords("auth", use_normalized=True)
norm_time = time.perf_counter() - start
# Benchmark fallback search
print(" Benchmarking fallback search...")
start = time.perf_counter()
for _ in range(10):
results_fallback = store.search_semantic_keywords("auth", use_normalized=False)
fallback_time = time.perf_counter() - start
print(f"\n Results:")
print(f" - Normalized search: {norm_time*1000:.2f}ms (10 iterations)")
print(f" - Fallback search: {fallback_time*1000:.2f}ms (10 iterations)")
print(f" - Speedup factor: {fallback_time/norm_time:.2f}x")
print(f" - Both found {len(results_norm)} files")
assert len(results_norm) == len(results_fallback), "Result count mismatch!"
store.close()
print("✓ Performance comparison PASSED")
def main():
"""Run all validation tests."""
print("=" * 60)
print("CodexLens Performance Optimizations Validation")
print("=" * 60)
try:
test_keyword_normalization()
test_path_lookup_optimization()
test_symbol_search_prefix_mode()
test_migration_001()
test_performance_comparison()
print("\n" + "=" * 60)
print("✓✓✓ ALL VALIDATION TESTS PASSED ✓✓✓")
print("=" * 60)
return 0
except Exception as e:
print(f"\nX VALIDATION FAILED: {e}")
import traceback
traceback.print_exc()
return 1
if __name__ == "__main__":
exit(main())