Files
Claude-Code-Workflow/codex-lens/docs/T6-CLI-Integration-Summary.md
catlog22 3da0ef2adb Add comprehensive tests for query parsing and Reciprocal Rank Fusion
- Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation.
- Added parameterized tests to validate expected token outputs for different query formats.
- Created edge case tests to ensure robustness against unusual input scenarios.
- Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources.
- Included tests for normalization of BM25 scores and tagging search results with source metadata.
2025-12-16 10:20:19 +08:00

6.2 KiB

T6: CLI Integration for Hybrid Search - Implementation Summary

Overview

Successfully integrated hybrid search capabilities into the CodexLens CLI with user-configurable options, migration support, and enhanced status reporting.

Changes Made

1. Search Command Enhancement (commands.py)

New --mode Parameter:

  • Replaced --hybrid and --exact-only flags with unified --mode parameter
  • Supported modes: exact, fuzzy, hybrid, vector
  • Default: exact (backward compatible)

Mode Validation:

valid_modes = ["exact", "fuzzy", "hybrid", "vector"]
if mode not in valid_modes:
    # Error with helpful message

Weights Configuration:

  • Accepts custom RRF weights via --weights exact,fuzzy,vector
  • Example: --weights 0.5,0.3,0.2
  • Automatic normalization if weights don't sum to 1.0
  • Validation for 3-value format

Mode Mapping to SearchOptions:

hybrid_mode = mode == "hybrid"
enable_fuzzy = mode in ["fuzzy", "hybrid"]

options = SearchOptions(
    hybrid_mode=hybrid_mode,
    enable_fuzzy=enable_fuzzy,
    hybrid_weights=hybrid_weights,
)

Enhanced Output:

  • Shows search mode in status line
  • Includes search source tags in verbose mode
  • JSON output includes mode and source information

2. Migrate Command (commands.py)

New Command for Dual-FTS Upgrade:

codex-lens migrate [path]

Features:

  • Upgrades all _index.db files to schema version 4
  • Shows progress bar with percentage complete
  • Tracks: migrated, already up-to-date, errors
  • Safe operation preserving all data
  • Verbose mode shows per-database migration details

Progress Tracking:

  • Uses Rich progress bar with spinner
  • Shows percentage and count (N/Total)
  • Time elapsed indicator

3. Status Command Enhancement (commands.py)

New Backend Status Display:

Search Backends:
  Exact FTS: ✓ (unicode61)
  Fuzzy FTS: ✓ (trigram)
  Hybrid Search: ✓ (RRF fusion)
  Vector Search: ✗ (future)

Schema Version Detection:

  • Checks first available _index.db
  • Reports schema version
  • Detects dual FTS table presence

Feature Flags in JSON:

{
  "features": {
    "exact_fts": true,
    "fuzzy_fts": true,
    "hybrid_search": true,
    "vector_search": false
  }
}

4. Output Rendering (output.py)

Verbose Mode Support:

render_search_results(results, verbose=True)

Search Source Tags:

  • [E] - Exact FTS result
  • [F] - Fuzzy FTS result
  • [V] - Vector search result
  • [RRF] - Fusion result

Enhanced Table:

  • New "Source" column in verbose mode
  • Shows result origin for debugging
  • Fusion scores visible

Usage Examples

1. Search with Different Modes

# Exact search (default)
codex-lens search "authentication"

# Fuzzy search only
codex-lens search "authentication" --mode fuzzy

# Hybrid search with RRF fusion
codex-lens search "authentication" --mode hybrid

# Hybrid with custom weights
codex-lens search "authentication" --mode hybrid --weights 0.5,0.3,0.2

# Verbose mode shows source tags
codex-lens search "authentication" --mode hybrid -v

2. Migration

# Migrate current project
codex-lens migrate

# Migrate specific project with verbose output
codex-lens migrate /path/to/project -v

# JSON output for automation
codex-lens migrate --json

3. Status Checking

# Check backend availability
codex-lens status

# JSON output with feature flags
codex-lens status --json

Testing

Test Coverage:

  • Mode parameter validation (exact, fuzzy, hybrid, vector)
  • Weights parsing and normalization
  • Help text shows all modes
  • Migrate command exists and accessible
  • Status command shows backends
  • Mode mapping to SearchOptions

Test Results:

11 passed in 2.27s

Integration Points

With Phase 1 (Dual-FTS):

  • Uses search_fts_exact() for exact mode
  • Uses search_fts_fuzzy() for fuzzy mode
  • Schema migration via _apply_migrations()
  • Calls HybridSearchEngine for hybrid mode
  • Passes custom weights to RRF algorithm
  • Displays fusion scores and source tags

With Existing CLI:

  • Backward compatible (default mode=exact)
  • Follows existing error handling patterns
  • Uses Rich for progress and formatting
  • Supports JSON output mode

Done Criteria Verification

CLI search --mode exact uses only exact FTS table

  • Mode validation ensures correct backend selection
  • hybrid_mode=False, enable_fuzzy=False for exact mode

--mode fuzzy uses only fuzzy table

  • hybrid_mode=False, enable_fuzzy=True for fuzzy mode
  • Single backend execution

--mode hybrid fuses both

  • hybrid_mode=True, enable_fuzzy=True activates RRF fusion
  • HybridSearchEngine coordinates parallel search

Custom weights via --weights 0.5,0.3,0.2

  • Parses 3-value comma-separated format
  • Validates and normalizes to sum=1.0
  • Passes to RRF algorithm

Migration command completes Dual-FTS upgrade

  • Shows progress bar with percentage
  • Tracks migration status per database
  • Safe operation with error handling

Search output shows [E], [F], [V] tags and fusion scores

  • Verbose mode displays Source column
  • Tags extracted from search_source attribute
  • Fusion scores shown in Score column

Files Modified

  1. codex-lens/src/codexlens/cli/commands.py

    • Updated search() command with --mode parameter
    • Added migrate() command
    • Enhanced status() command
    • Added DirIndexStore import
  2. codex-lens/src/codexlens/cli/output.py

    • Updated render_search_results() with verbose mode
    • Added source tag display logic
  3. codex-lens/tests/test_cli_hybrid_search.py (new)

    • Comprehensive CLI integration tests
    • Mode validation tests
    • Weights parsing tests
    • Command availability tests

Performance Impact

  • Exact mode: Same as before (no overhead)
  • Fuzzy mode: Single FTS query (minimal overhead)
  • Hybrid mode: Parallel execution (2x I/O, no sequential penalty)
  • Migration: One-time operation, safe for large projects

Next Steps

Users can now:

  1. Run codex-lens migrate to upgrade existing indexes
  2. Use codex-lens search "query" --mode hybrid for best results
  3. Check codex-lens status to verify enabled features
  4. Tune fusion weights for their use case via --weights