Replaces generic Exception handling with specific PermissionError and OSError
handling in __post_init__ and ensure_runtime_dirs(). Provides clear diagnostic
messages to distinguish permission issues from other filesystem errors.
Solution-ID: SOL-1735385400008
Issue-ID: ISS-1766921318981-8
Task-ID: T1
Protect _bulk_insert_mode flag and accumulation lists with
_ann_write_lock to prevent corruption during concurrent access.
Solution-ID: SOL-1735392000003
Issue-ID: ISS-1766921318981-12
Task-ID: T1
Adds nested exception handling in add_files() and _migrate_fts_to_external()
to catch and log rollback failures. Uses exception chaining to preserve both
transaction and rollback errors, preventing silent database inconsistency.
Solution-ID: SOL-1735385400010
Issue-ID: ISS-1766921318981-10
Task-ID: T1
Add comprehensive test coverage for zero and near-zero vector
detection in SemanticChunk embedding validation.
Solution-ID: SOL-20251228113612
Issue-ID: ISS-1766921318981-7
Task-ID: T2
Adds robust exception handling for os.path.commonpath() in search_symbols()
to prevent crashes on malformed paths and Windows cross-drive scenarios.
Invalid symbols are skipped with debug logging, search continues.
Solution-ID: SOL-1735385400004
Issue-ID: ISS-1766921318981-4
Task-ID: T1
Add fallback validation to detect dead threads missed by
threading.enumerate(), ensuring all stale connections are cleaned.
Solution-ID: SOL-1735392000002
Issue-ID: ISS-1766921318981-3
Task-ID: T2
Protect fast path cache read in get_embedder() to prevent KeyError
during concurrent access and cache clearing operations.
Solution-ID: SOL-1735392000001
Issue-ID: ISS-1766921318981-2
Task-ID: T1
Adds case normalization for path comparison on Windows to handle
case-insensitive filesystem behavior. Preserves case-sensitivity on Unix.
Fixes: ISS-1766921318981-13
Solution-ID: SOL-1735386000-13
Issue-ID: ISS-1766921318981-13
Task-ID: T1
Add comprehensive test coverage for near-zero norms, product
underflow, and floating point precision edge cases in
_cosine_similarity function.
Solution-ID: SOL-20251228113619
Issue-ID: ISS-1766921318981-11
Task-ID: T2
Replaces bare exception handler in load_settings() with logging.warning()
to help users debug configuration file issues (syntax errors, permissions).
Maintains backward compatibility - errors do not break initialization.
Solution-ID: SOL-1735385400001
Issue-ID: ISS-1766921318981-1
Task-ID: T1
Add comprehensive test coverage for NaN, infinity, and all-None
edge cases in weight normalization to prevent regression.
Solution-ID: SOL-20251228113631
Issue-ID: ISS-1766921318981-0
Task-ID: T2
- Added integration tests for adaptive RRF weights in hybrid search.
- Enhanced query intent detection with new classifications: keyword, semantic, and mixed.
- Introduced symbol boosting in search results based on explicit symbol matches.
- Implemented embedding-based reranking with configurable options.
- Added global symbol index for efficient symbol lookups across projects.
- Improved file deletion handling on Windows to avoid permission errors.
- Updated chunk configuration to increase overlap for better context.
- Modified package.json test script to target specific test files.
- Created comprehensive writing style guidelines for documentation.
- Added TypeScript tests for query intent detection and adaptive weights.
- Established performance benchmarks for global symbol indexing.
- Introduced a `stream` parameter to control output streaming vs. caching.
- Enhanced status determination logic to prioritize valid output over exit codes.
- Updated output structure to include full stdout and stderr when not streaming.
feat(cli-history-store): extend conversation turn schema and migration
- Added `cached`, `stdout_full`, and `stderr_full` fields to the conversation turn schema.
- Implemented database migration to add new columns if they do not exist.
- Updated upsert logic to handle new fields.
feat(codex-lens): implement global symbol index for fast lookups
- Created `GlobalSymbolIndex` class to manage project-wide symbol indexing.
- Added methods for adding, updating, and deleting symbols in the global index.
- Integrated global index updates into directory indexing processes.
feat(codex-lens): optimize search functionality with global index
- Enhanced `ChainSearchEngine` to utilize the global symbol index for faster searches.
- Added configuration option to enable/disable global symbol indexing.
- Updated tests to validate global index functionality and performance.
- Added functions to get and update CodexLens embedding rotation configuration.
- Introduced functionality to retrieve enabled embedding providers for rotation.
- Created endpoints for managing rotation configuration via API.
- Enhanced dashboard UI to support multi-provider rotation configuration.
- Updated internationalization strings for new rotation features.
- Adjusted CLI commands and embedding manager to support increased concurrency limits.
- Modified hybrid search weights for improved ranking behavior.
- Added JSON-based settings management in Config class for embedding and LLM configurations.
- Introduced methods to save and load settings from a JSON file.
- Updated BaseEmbedder and its subclasses to include max_tokens property for better token management.
- Enhanced chunking strategy to support recursive splitting of large symbols with improved overlap handling.
- Implemented comprehensive tests for recursive splitting and chunking behavior.
- Added CLI tools configuration management for better integration with external tools.
- Introduced a new command for compacting session memory into structured text for recovery.
- Add SQLite table and CRUD methods for tracking update history
- Create freshness calculation service based on git file changes
- Add API endpoints for freshness data, marking updates, and history
- Display freshness badges in file tree (green/yellow/red indicators)
- Show freshness gauge and details in metadata panel
- Auto-mark files as updated after CLI sync
- Add English and Chinese i18n translations
Freshness algorithm: 100 - min((changedFilesCount / 20) * 100, 100)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add functionality to group search results with similar content and scores
into a single representative result with additional locations.
Changes:
- Add AdditionalLocation entity model for storing grouped result locations
- Add additional_locations field to SearchResult for backward compatibility
- Implement group_similar_results() function in ranking.py with:
- Content-based grouping (by excerpt or content field)
- Score-based sub-grouping with configurable threshold
- Metadata preservation with grouped_count tracking
- Add group_results and grouping_threshold options to SearchOptions
- Integrate grouping into ChainSearchEngine.search() after RRF fusion
Test coverage:
- 36 multi-level tests covering unit, boundary, integration, and performance
- Real-world scenario tests for RRF scores and duplicate code detection
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added ANNIndex class for approximate nearest neighbor search using HNSW.
- Integrated ANN index with VectorStore for enhanced search capabilities.
- Updated test suite for ANN index, including tests for adding, searching, saving, and loading vectors.
- Modified existing tests to accommodate changes in search performance expectations.
- Improved error handling for file operations in tests to ensure compatibility with Windows file locks.
- Adjusted hybrid search performance assertions for increased stability in CI environments.
- Introduced styles for the help view including tab transitions, accordion animations, search highlighting, and responsive design.
- Implemented core memory styles with modal base styles, memory card designs, and knowledge graph visualization.
- Enhanced dark mode support across various components.
- Added loading states and empty state designs for better user experience.
- Implement `inspect_llm_summaries.py` to display LLM-generated summaries from the semantic_chunks table in the database.
- Create `show_llm_analysis.py` to demonstrate LLM analysis of misleading code examples, highlighting discrepancies between comments and actual functionality.
- Develop `test_misleading_comments.py` to compare pure vector search with LLM-enhanced search, focusing on the impact of misleading or missing comments on search results.
- Introduce `test_llm_enhanced_search.py` to provide a test suite for evaluating the effectiveness of LLM-enhanced vector search against pure vector search.
- Ensure all new scripts are integrated with the existing codebase and follow the established coding standards.
- Implement tests for migration 005 to verify removal of deprecated fields in the database schema.
- Ensure that new databases are created with a clean schema.
- Validate that keywords are correctly extracted from the normalized file_keywords table.
- Test symbol insertion without deprecated fields and subdir operations without direct_files.
- Create a detailed search comparison test to evaluate vector search vs hybrid search performance.
- Add a script for reindexing projects to extract code relationships and verify GraphAnalyzer functionality.
- Include a test script to check TreeSitter parser availability and relationship extraction from sample files.
- Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation.
- Added parameterized tests to validate expected token outputs for different query formats.
- Created edge case tests to ensure robustness against unusual input scenarios.
- Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources.
- Included tests for normalization of BM25 scores and tagging search results with source metadata.
- Added a cleanup function to reset the state when navigating away from the graph explorer.
- Updated navigation logic to call the cleanup function before switching views.
- Improved internationalization by adding new translations for graph-related terms.
- Adjusted icon sizes for better UI consistency in the graph explorer.
- Implemented impact analysis button functionality in the graph explorer.
- Refactored CLI tool configuration to use updated model names.
- Enhanced CLI executor to handle prompts correctly for codex commands.
- Introduced code relationship storage for better visualization in the index tree.
- Added support for parsing Markdown and plain text files in the symbol parser.
- Updated tests to reflect changes in language detection logic.
- Added a new Storage Manager component to handle storage statistics, project cleanup, and configuration for CCW centralized storage.
- Introduced functions to calculate directory sizes, get project storage stats, and clean specific or all storage.
- Enhanced SQLiteStore with a public API for executing queries securely.
- Updated tests to utilize the new execute_query method and validate storage management functionalities.
- Improved performance by implementing connection pooling with idle timeout management in SQLiteStore.
- Added new fields (token_count, symbol_type) to the symbols table and adjusted related insertions.
- Enhanced error handling and logging for storage operations.
- Implemented unit tests for the Tokenizer class, covering various text inputs, edge cases, and fallback mechanisms.
- Created performance benchmarks comparing tiktoken and pure Python implementations for token counting.
- Developed extensive tests for TreeSitterSymbolParser across Python, JavaScript, and TypeScript, ensuring accurate symbol extraction and parsing.
- Added configuration documentation for MCP integration and custom prompts, enhancing usability and flexibility.
- Introduced a refactor script for GraphAnalyzer to streamline future improvements.
- Added active memory configuration for manual interval and Gemini tool.
- Created file modification rules for handling edits and writes.
- Implemented migration manager for managing database schema migrations.
- Added migration 001 to normalize keywords into separate tables.
- Developed tests for validating performance optimizations including keyword normalization, path lookup, and symbol search.
- Created validation script to manually verify optimization implementations.
- Implement full coverage tests for Embedder model loading and embedding generation
- Add CRUD operations and caching tests for VectorStore
- Include cosine similarity computation tests
- Validate semantic search accuracy and relevance through various queries
- Establish performance benchmarks for embedding and search operations
- Ensure edge cases and error handling are covered
- Test thread safety and concurrent access scenarios
- Verify availability of semantic search dependencies
- Implemented tests for the ChunkConfig and Chunker classes, covering default and custom configurations.
- Added tests for symbol-based chunking, including single and multiple symbols, handling of empty symbols, and preservation of line numbers.
- Developed tests for sliding window chunking, ensuring correct chunking behavior with various content sizes and configurations.
- Created integration tests for semantic search, validating embedding generation, vector storage, and search accuracy across a complex codebase.
- Included performance tests for embedding generation and search operations.
- Established tests for chunking strategies, comparing symbol-based and sliding window approaches.
- Enhanced test coverage for edge cases, including handling of unicode characters and out-of-bounds symbol ranges.