Claude-Code-Workflow/codex-lens/CHANGELOG.md
catlog22 4061ae48c4 feat: Implement adaptive RRF weights and query intent detection
- Added integration tests for adaptive RRF weights in hybrid search.
- Enhanced query intent detection with new classifications: keyword, semantic, and mixed.
- Introduced symbol boosting in search results based on explicit symbol matches.
- Implemented embedding-based reranking with configurable options.
- Added global symbol index for efficient symbol lookups across projects.
- Improved file deletion handling on Windows to avoid permission errors.
- Updated chunk configuration to increase overlap for better context.
- Modified package.json test script to target specific test files.
- Created comprehensive writing style guidelines for documentation.
- Added TypeScript tests for query intent detection and adaptive weights.
- Established performance benchmarks for global symbol indexing.
2025-12-26 15:08:47 +08:00

# CodexLens Optimization Plan Changelog
This changelog tracks the **CodexLens optimization plan** milestones (not the Python package version in `pyproject.toml`).
## v1.0 (Optimization) 2025-12-26
### Optimizations
1. **P0: Context-aware hybrid chunking**
   - Docstrings are extracted into dedicated chunks and excluded from code chunks.
   - Docstring chunks include `parent_symbol` metadata when the docstring belongs to a function/class/method.
   - Sliding-window chunk boundaries are deterministic for identical input.
2. **P1: Adaptive RRF weights (QueryIntent)**
   - Query intent is classified as `keyword` / `semantic` / `mixed`.
   - RRF weights adapt to intent:
     - `keyword`: exact-heavy (favors lexical matches)
     - `semantic`: vector-heavy (favors semantic matches)
     - `mixed`: keeps base/default weights
3. **P2: Symbol boost**
   - Fused results with an explicit symbol match (`symbol_name`) receive a multiplicative boost (default `1.5x`).
4. **P2: Embedding-based re-ranking (optional)**
   - A second-stage ranker can reorder top results by semantic similarity.
   - Re-ranking runs only when `Config.enable_reranking=True`.
5. **P3: Global symbol index (incremental + fast path)**
   - `GlobalSymbolIndex` stores project-wide symbols in one SQLite DB for fast symbol lookups.
   - `ChainSearchEngine.search_symbols()` uses the global index fast path when enabled.
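
As a rough illustration of items 2 and 3, the sketch below shows one way intent-adaptive weights and the symbol boost could combine in a weighted reciprocal rank fusion. Only the three intent labels, the direction of each weight shift, and the default `1.5` boost come from the notes above. The `Hit` class, the `detect_intent` heuristic, the `fuse` function, the `exact`/`vector` source names, the numeric weights, and the RRF constant `k=60` are all assumptions made for this example, and "explicit symbol match" is assumed here to mean a hit that carries a non-empty `symbol_name`.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical weights: the changelog only states the direction of each shift.
BASE_WEIGHTS = {"exact": 0.5, "vector": 0.5}
INTENT_WEIGHTS = {
    "keyword": {"exact": 0.8, "vector": 0.2},   # exact-heavy
    "semantic": {"exact": 0.2, "vector": 0.8},  # vector-heavy
    "mixed": BASE_WEIGHTS,                      # base weights unchanged
}


@dataclass(frozen=True)
class Hit:
    """Illustrative search hit; not the project's result type."""
    path: str
    symbol_name: Optional[str] = None


def detect_intent(query: str) -> str:
    """Toy heuristic (an assumption, not the project's classifier)."""
    has_code = bool(re.search(r"[_.:()]|[a-z][A-Z]", query))  # identifiers, camelCase
    has_prose = len(query.split()) >= 3                       # several plain words
    if has_code and not has_prose:
        return "keyword"
    if has_prose and not has_code:
        return "semantic"
    return "mixed"


def fuse(
    rankings: Dict[str, List[Hit]],
    intent: str,
    k: int = 60,
    symbol_boost_factor: float = 1.5,
) -> List[Tuple[str, float]]:
    """Weighted RRF: sum weight / (k + rank) per source, then boost symbol hits."""
    weights = INTENT_WEIGHTS[intent]
    scores: Dict[str, float] = {}
    has_symbol: Dict[str, bool] = {}
    for source, hits in rankings.items():
        for rank, hit in enumerate(hits, start=1):
            scores[hit.path] = scores.get(hit.path, 0.0) + weights[source] / (k + rank)
            has_symbol[hit.path] = has_symbol.get(hit.path, False) or bool(hit.symbol_name)
    for path in scores:
        if has_symbol[path]:
            scores[path] *= symbol_boost_factor  # multiplicative boost
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


rankings = {
    "exact": [Hit("a.py"), Hit("b.py")],
    "vector": [Hit("b.py"), Hit("a.py")],
}
top_for_keyword = fuse(rankings, detect_intent("parse_config"))[0][0]
top_for_semantic = fuse(rankings, detect_intent("how does authentication work"))[0][0]
```

With these toy rankings, the identifier-style query puts `a.py` (the top lexical hit) first, while the natural-language query puts `b.py` (the top vector hit) first. Applying the boost after fusion keeps it independent of how many sources returned the hit.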
### Migration Notes
- **Reindexing (recommended)**: deterministic chunking and docstring metadata affect stored chunks. For best results, regenerate indexes/embeddings after upgrading:
  - Rebuild indexes and/or re-run embedding generation for existing projects.
- **New config flags**:
  - `Config.enable_reranking` (default `False`)
  - `Config.reranking_top_k` (default `50`)
  - `Config.symbol_boost_factor` (default `1.5`)
  - `Config.global_symbol_index_enabled` (default `True`)
- **Breaking changes**: none (behavioral improvements only).
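
To make the flag defaults concrete, here is a minimal sketch of how `enable_reranking` and `reranking_top_k` might gate a second-stage ranker. The four field names and their defaults are taken from the list above. The dataclass layout, the `cosine` and `maybe_rerank` helpers, and the in-memory embedding vectors are assumptions for illustration only and do not describe the actual CodexLens `Config` or ranker.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Config:
    # Field names and defaults as listed in the migration notes;
    # the class layout itself is an assumption.
    enable_reranking: bool = False
    reranking_top_k: int = 50
    symbol_boost_factor: float = 1.5
    global_symbol_index_enabled: bool = True


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def maybe_rerank(
    results: List[str],
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    config: Config,
) -> List[str]:
    """Hypothetical second stage: reorder only the first reranking_top_k results."""
    if not config.enable_reranking:
        return list(results)  # default: first-stage order is kept
    head = results[: config.reranking_top_k]
    tail = results[config.reranking_top_k :]
    head = sorted(head, key=lambda doc: cosine(query_vec, doc_vecs[doc]), reverse=True)
    return head + tail


results = ["a.py", "b.py", "c.py"]
doc_vecs = {"a.py": [0.0, 1.0], "b.py": [1.0, 0.0], "c.py": [1.0, 1.0]}
query_vec = [1.0, 0.0]

unchanged = maybe_rerank(results, query_vec, doc_vecs, Config())
reranked = maybe_rerank(results, query_vec, doc_vecs, Config(enable_reranking=True))
```

Because `enable_reranking` defaults to `False`, an upgraded project keeps its first-stage ordering until the flag is turned on, which is consistent with the "no breaking changes" note.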