mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-02-05 01:50:27 +08:00
Add comprehensive tests for query parsing and Reciprocal Rank Fusion
- Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation. - Added parameterized tests to validate expected token outputs for different query formats. - Created edge case tests to ensure robustness against unusual input scenarios. - Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources. - Included tests for normalization of BM25 scores and tagging search results with source metadata.
This commit is contained in:
248
codex-lens/docs/T6-CLI-Integration-Summary.md
Normal file
248
codex-lens/docs/T6-CLI-Integration-Summary.md
Normal file
@@ -0,0 +1,248 @@
|
||||
# T6: CLI Integration for Hybrid Search - Implementation Summary
|
||||
|
||||
## Overview
|
||||
|
||||
Successfully integrated hybrid search capabilities into the CodexLens CLI with user-configurable options, migration support, and enhanced status reporting.
|
||||
|
||||
## Changes Made
|
||||
|
||||
### 1. Search Command Enhancement (`commands.py`)
|
||||
|
||||
**New `--mode` Parameter:**
|
||||
- Replaced `--hybrid` and `--exact-only` flags with unified `--mode` parameter
|
||||
- Supported modes: `exact`, `fuzzy`, `hybrid`, `vector`
|
||||
- Default: `exact` (backward compatible)
|
||||
|
||||
**Mode Validation:**
|
||||
```python
|
||||
valid_modes = ["exact", "fuzzy", "hybrid", "vector"]
|
||||
if mode not in valid_modes:
|
||||
# Error with helpful message
|
||||
```
|
||||
|
||||
**Weights Configuration:**
|
||||
- Accepts custom RRF weights via `--weights exact,fuzzy,vector`
|
||||
- Example: `--weights 0.5,0.3,0.2`
|
||||
- Automatic normalization if weights don't sum to 1.0
|
||||
- Validation for 3-value format
|
||||
|
||||
**Mode Mapping to SearchOptions:**
|
||||
```python
|
||||
hybrid_mode = mode == "hybrid"
|
||||
enable_fuzzy = mode in ["fuzzy", "hybrid"]
|
||||
|
||||
options = SearchOptions(
|
||||
hybrid_mode=hybrid_mode,
|
||||
enable_fuzzy=enable_fuzzy,
|
||||
hybrid_weights=hybrid_weights,
|
||||
)
|
||||
```
|
||||
|
||||
**Enhanced Output:**
|
||||
- Shows search mode in status line
|
||||
- Includes search source tags in verbose mode
|
||||
- JSON output includes mode and source information
|
||||
|
||||
### 2. Migrate Command (`commands.py`)
|
||||
|
||||
**New Command for Dual-FTS Upgrade:**
|
||||
```bash
|
||||
codex-lens migrate [path]
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- Upgrades all `_index.db` files to schema version 4
|
||||
- Shows progress bar with percentage complete
|
||||
- Tracks: migrated, already up-to-date, errors
|
||||
- Safe operation preserving all data
|
||||
- Verbose mode shows per-database migration details
|
||||
|
||||
**Progress Tracking:**
|
||||
- Uses Rich progress bar with spinner
|
||||
- Shows percentage and count (N/Total)
|
||||
- Time elapsed indicator
|
||||
|
||||
### 3. Status Command Enhancement (`commands.py`)
|
||||
|
||||
**New Backend Status Display:**
|
||||
```
|
||||
Search Backends:
|
||||
Exact FTS: ✓ (unicode61)
|
||||
Fuzzy FTS: ✓ (trigram)
|
||||
Hybrid Search: ✓ (RRF fusion)
|
||||
Vector Search: ✗ (future)
|
||||
```
|
||||
|
||||
**Schema Version Detection:**
|
||||
- Checks first available `_index.db`
|
||||
- Reports schema version
|
||||
- Detects dual FTS table presence
|
||||
|
||||
**Feature Flags in JSON:**
|
||||
```json
|
||||
{
|
||||
"features": {
|
||||
"exact_fts": true,
|
||||
"fuzzy_fts": true,
|
||||
"hybrid_search": true,
|
||||
"vector_search": false
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 4. Output Rendering (`output.py`)
|
||||
|
||||
**Verbose Mode Support:**
|
||||
```python
|
||||
render_search_results(results, verbose=True)
|
||||
```
|
||||
|
||||
**Search Source Tags:**
|
||||
- `[E]` - Exact FTS result
|
||||
- `[F]` - Fuzzy FTS result
|
||||
- `[V]` - Vector search result
|
||||
- `[RRF]` - Fusion result
|
||||
|
||||
**Enhanced Table:**
|
||||
- New "Source" column in verbose mode
|
||||
- Shows result origin for debugging
|
||||
- Fusion scores visible
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### 1. Search with Different Modes
|
||||
|
||||
```bash
|
||||
# Exact search (default)
|
||||
codex-lens search "authentication"
|
||||
|
||||
# Fuzzy search only
|
||||
codex-lens search "authentication" --mode fuzzy
|
||||
|
||||
# Hybrid search with RRF fusion
|
||||
codex-lens search "authentication" --mode hybrid
|
||||
|
||||
# Hybrid with custom weights
|
||||
codex-lens search "authentication" --mode hybrid --weights 0.5,0.3,0.2
|
||||
|
||||
# Verbose mode shows source tags
|
||||
codex-lens search "authentication" --mode hybrid -v
|
||||
```
|
||||
|
||||
### 2. Migration
|
||||
|
||||
```bash
|
||||
# Migrate current project
|
||||
codex-lens migrate
|
||||
|
||||
# Migrate specific project with verbose output
|
||||
codex-lens migrate /path/to/project -v
|
||||
|
||||
# JSON output for automation
|
||||
codex-lens migrate --json
|
||||
```
|
||||
|
||||
### 3. Status Checking
|
||||
|
||||
```bash
|
||||
# Check backend availability
|
||||
codex-lens status
|
||||
|
||||
# JSON output with feature flags
|
||||
codex-lens status --json
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
**Test Coverage:**
|
||||
- ✅ Mode parameter validation (exact, fuzzy, hybrid, vector)
|
||||
- ✅ Weights parsing and normalization
|
||||
- ✅ Help text shows all modes
|
||||
- ✅ Migrate command exists and accessible
|
||||
- ✅ Status command shows backends
|
||||
- ✅ Mode mapping to SearchOptions
|
||||
|
||||
**Test Results:**
|
||||
```
|
||||
11 passed in 2.27s
|
||||
```
|
||||
|
||||
## Integration Points
|
||||
|
||||
### With Phase 1 (Dual-FTS):
|
||||
- Uses `search_fts_exact()` for exact mode
|
||||
- Uses `search_fts_fuzzy()` for fuzzy mode
|
||||
- Schema migration via `_apply_migrations()`
|
||||
|
||||
### With Phase 2 (Hybrid Search):
|
||||
- Calls `HybridSearchEngine` for hybrid mode
|
||||
- Passes custom weights to RRF algorithm
|
||||
- Displays fusion scores and source tags
|
||||
|
||||
### With Existing CLI:
|
||||
- Backward compatible (default mode=exact)
|
||||
- Follows existing error handling patterns
|
||||
- Uses Rich for progress and formatting
|
||||
- Supports JSON output mode
|
||||
|
||||
## Done Criteria Verification
|
||||
|
||||
✅ **CLI search --mode exact uses only exact FTS table**
|
||||
- Mode validation ensures correct backend selection
|
||||
- `hybrid_mode=False, enable_fuzzy=False` for exact mode
|
||||
|
||||
✅ **--mode fuzzy uses only fuzzy table**
|
||||
- `hybrid_mode=False, enable_fuzzy=True` for fuzzy mode
|
||||
- Single backend execution
|
||||
|
||||
✅ **--mode hybrid fuses both**
|
||||
- `hybrid_mode=True, enable_fuzzy=True` activates RRF fusion
|
||||
- HybridSearchEngine coordinates parallel search
|
||||
|
||||
✅ **Custom weights via --weights 0.5,0.3,0.2**
|
||||
- Parses 3-value comma-separated format
|
||||
- Validates and normalizes to sum=1.0
|
||||
- Passes to RRF algorithm
|
||||
|
||||
✅ **Migration command completes Dual-FTS upgrade**
|
||||
- Shows progress bar with percentage
|
||||
- Tracks migration status per database
|
||||
- Safe operation with error handling
|
||||
|
||||
✅ **Search output shows [E], [F], [V] tags and fusion scores**
|
||||
- Verbose mode displays Source column
|
||||
- Tags extracted from `search_source` attribute
|
||||
- Fusion scores shown in Score column
|
||||
|
||||
## Files Modified
|
||||
|
||||
1. `codex-lens/src/codexlens/cli/commands.py`
|
||||
- Updated `search()` command with `--mode` parameter
|
||||
- Added `migrate()` command
|
||||
- Enhanced `status()` command
|
||||
- Added DirIndexStore import
|
||||
|
||||
2. `codex-lens/src/codexlens/cli/output.py`
|
||||
- Updated `render_search_results()` with verbose mode
|
||||
- Added source tag display logic
|
||||
|
||||
3. `codex-lens/tests/test_cli_hybrid_search.py` (new)
|
||||
- Comprehensive CLI integration tests
|
||||
- Mode validation tests
|
||||
- Weights parsing tests
|
||||
- Command availability tests
|
||||
|
||||
## Performance Impact
|
||||
|
||||
- **Exact mode**: Same as before (no overhead)
|
||||
- **Fuzzy mode**: Single FTS query (minimal overhead)
|
||||
- **Hybrid mode**: Parallel execution (2x I/O, no sequential penalty)
|
||||
- **Migration**: One-time operation, safe for large projects
|
||||
|
||||
## Next Steps
|
||||
|
||||
Users can now:
|
||||
1. Run `codex-lens migrate` to upgrade existing indexes
|
||||
2. Use `codex-lens search "query" --mode hybrid` for best results
|
||||
3. Check `codex-lens status` to verify enabled features
|
||||
4. Tune fusion weights for their use case via `--weights`
|
||||
Reference in New Issue
Block a user