Claude-Code-Workflow

mirror of https://github.com/catlog22/Claude-Code-Workflow.git synced 2026-03-21 19:08:17 +08:00

Author	SHA1	Message	Date
catlog22	5a4b18d9b1	feat: enhance search, ranking, reranker and CLI tooling across ccw and codex-lens Major improvements to smart-search, chain-search cascade, ranking pipeline, reranker factory, CLI history store, codex-lens integration, and uv-manager. Simplify command-generator skill by inlining phases. Add comprehensive tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 20:35:08 +08:00
catlog22	71faaf43a8	refactor: 移除 SPLADE 和 hybrid_cascade，精简搜索架构删除 SPLADE 稀疏神经搜索后端和 hybrid_cascade 策略，将搜索架构从 6 种后端简化为 4 种（FTS Exact/Fuzzy, Binary Vector, Dense Vector, LSP）。主要变更： - 删除 splade_encoder.py, splade_index.py, migration_009 等 4 个文件 - 移除 config.py 中 SPLADE 相关配置（enable_splade, splade_model 等） - DEFAULT_WEIGHTS 改为 FTS 权重 {exact:0.25, fuzzy:0.1, vector:0.5, lsp:0.15} - 删除 hybrid_cascade_search()，所有 cascade fallback 改为 self.search() - API fusion_strategy='hybrid' 向后兼容映射到 binary_rerank - 删除 CLI index_splade/splade_status 命令和 --method splade - 更新测试、基准测试和文档	2026-02-08 12:07:41 +08:00
catlog22	2f3a14e946	Add unit tests for LspGraphBuilder class - Implement comprehensive unit tests for the LspGraphBuilder class to validate its functionality in building code association graphs. - Tests cover various scenarios including single level graph expansion, max nodes and depth boundaries, concurrent expansion limits, document symbol caching, error handling during node expansion, and edge cases such as empty seed lists and self-referencing nodes. - Utilize pytest and asyncio for asynchronous testing and mocking of LspBridge methods.	2026-01-20 12:49:31 +08:00
catlog22	87d38a3374	feat: 添加重排序模型配置，支持最大输入令牌数，优化 API 批处理能力	2026-01-07 15:50:22 +08:00
catlog22	86d3e36722	feat: 增强解决方案管理功能，支持按解决方案 ID 过滤和简要输出，优化嵌入模型配置读取	2026-01-07 09:31:52 +08:00
catlog22	713894090d	feat(codexlens): Improve search defaults and add explicit SPLADE mode Config changes: - Disable SPLADE by default (slow ~360ms), use FTS instead - Enable use_fts_fallback by default for faster sparse search CLI improvements: - Fix duplicate index_app typer definition - Add cascade_search dispatch for cascade method - Rename 'mode' to 'method' in search output - Mark embeddings-status, splade-status as deprecated - Add enable_splade and enable_cascade to search options Hybrid search: - Add enable_splade parameter for explicit SPLADE mode - Add fallback handling when SPLADE is requested but unavailable	2026-01-03 11:49:58 +08:00
catlog22	96b44e1482	feat: Add type validation for RRF weights and implement caching for embedder instances	2026-01-02 19:50:51 +08:00
catlog22	c268b531aa	feat: Enhance embedding generation to track current index path and improve metadata retrieval	2026-01-02 19:18:26 +08:00
catlog22	0b6e9db8e4	feat: Add centralized vector storage and metadata management for embeddings	2026-01-02 17:18:23 +08:00
catlog22	9157c5c78b	feat: Implement centralized storage for SPLADE and vector embeddings - Added centralized SPLADE database and vector storage configuration in config.py. - Updated embedding_manager.py to support centralized SPLADE database path. - Enhanced generate_embeddings and generate_embeddings_recursive functions for centralized storage. - Introduced centralized ANN index creation in ann_index.py. - Modified hybrid_search.py to utilize centralized vector index for searches. - Implemented methods to discover and manage centralized SPLADE and HNSW files.	2026-01-02 16:53:39 +08:00
catlog22	54fb7afdb2	Enhance semantic search capabilities and configuration - Added category support for programming and documentation languages in Config. - Implemented category-based filtering in HybridSearchEngine to improve search relevance based on query intent. - Introduced functions for filtering results by category and determining file categories based on extensions. - Updated VectorStore to include a category column in the database schema and modified chunk addition methods to support category tagging. - Enhanced the WatcherConfig to ignore additional common directories and files. - Created a benchmark script to compare performance between Binary Cascade, SPLADE, and Vector semantic search methods, including detailed result analysis and overlap comparison.	2026-01-02 15:01:20 +08:00
catlog22	92ed2524b7	feat: Enhance SPLADE indexing command to support multiple index databases and add chunk ID management	2026-01-02 13:25:23 +08:00
catlog22	e21d801523	feat: Add multi-type embedding backends for cascade retrieval - Implemented BinaryEmbeddingBackend for fast coarse filtering using 256-dimensional binary vectors. - Developed DenseEmbeddingBackend for high-precision dense vectors (2048 dimensions) for reranking. - Created CascadeEmbeddingBackend to combine binary and dense embeddings for two-stage retrieval. - Introduced utility functions for embedding conversion and distance computation. chore: Migration 010 - Add multi-vector storage support - Added 'chunks' table to support multi-vector embeddings for cascade retrieval. - Included new columns: embedding_binary (256-dim) and embedding_dense (2048-dim) for efficient storage. - Implemented upgrade and downgrade functions to manage schema changes and data migration.	2026-01-02 10:52:43 +08:00
catlog22	5bb01755bc	Implement SPLADE sparse encoder and associated database migrations - Added `splade_encoder.py` for ONNX-optimized SPLADE encoding, including methods for encoding text and batch processing. - Created `SPLADE_IMPLEMENTATION.md` to document the SPLADE encoder's functionality, design patterns, and integration points. - Introduced migration script `migration_009_add_splade.py` to add SPLADE metadata and posting list tables to the database. - Developed `splade_index.py` for managing the SPLADE inverted index, supporting efficient sparse vector retrieval. - Added verification script `verify_watcher.py` to test FileWatcher event filtering and debouncing functionality.	2026-01-01 17:41:22 +08:00
catlog22	520f2d26f2	feat(codex-lens): add unified reranker architecture and file watcher Unified Reranker Architecture: - Add BaseReranker ABC with factory pattern - Implement 4 backends: ONNX (default), API, LiteLLM, Legacy - Add .env configuration parsing for API credentials - Migrate from sentence-transformers to optimum+onnxruntime File Watcher Module: - Add real-time file system monitoring with watchdog - Implement IncrementalIndexer for single-file updates - Add WatcherManager with signal handling and graceful shutdown - Add 'codexlens watch' CLI command - Event filtering, debouncing, and deduplication - Thread-safe design with proper resource cleanup Tests: 16 watcher tests + 5 reranker test files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 13:23:52 +08:00
catlog22	31a45f1f30	Add graph expansion and cross-encoder reranking features - Implemented GraphExpander to enhance search results with related symbols using precomputed neighbors. - Added CrossEncoderReranker for second-stage search ranking, allowing for improved result scoring. - Created migrations to establish necessary database tables for relationships and graph neighbors. - Developed tests for graph expansion functionality, ensuring related results are populated correctly. - Enhanced performance benchmarks for cross-encoder reranking latency and graph expansion overhead. - Updated schema cleanup tests to reflect changes in versioning and deprecated fields. - Added new test cases for Treesitter parser to validate relationship extraction with alias resolution.	2025-12-31 16:58:59 +08:00
catlog22	4061ae48c4	feat: Implement adaptive RRF weights and query intent detection - Added integration tests for adaptive RRF weights in hybrid search. - Enhanced query intent detection with new classifications: keyword, semantic, and mixed. - Introduced symbol boosting in search results based on explicit symbol matches. - Implemented embedding-based reranking with configurable options. - Added global symbol index for efficient symbol lookups across projects. - Improved file deletion handling on Windows to avoid permission errors. - Updated chunk configuration to increase overlap for better context. - Modified package.json test script to target specific test files. - Created comprehensive writing style guidelines for documentation. - Added TypeScript tests for query intent detection and adaptive weights. - Established performance benchmarks for global symbol indexing.	2025-12-26 15:08:47 +08:00
catlog22	ebcbb11cb2	feat: Enhance CodexLens search functionality with new parameters and result handling - Added search limit, content length, and extra files input fields in the CodexLens manager UI. - Updated API request parameters to include new fields: max_content_length and extra_files_count. - Refactored smart-search.ts to support new parameters with default values. - Implemented result splitting logic to return both full content and additional file paths. - Updated CLI commands to remove worker limits and allow dynamic scaling based on endpoint count. - Introduced EmbeddingPoolConfig for improved embedding management and auto-discovery of providers. - Enhanced search engines to utilize new parameters for fuzzy and exact searches. - Added support for embedding single texts in the LiteLLM embedder.	2025-12-25 16:16:44 +08:00
catlog22	8e744597d1	feat: Implement CodexLens multi-provider embedding rotation management - Added functions to get and update CodexLens embedding rotation configuration. - Introduced functionality to retrieve enabled embedding providers for rotation. - Created endpoints for managing rotation configuration via API. - Enhanced dashboard UI to support multi-provider rotation configuration. - Updated internationalization strings for new rotation features. - Adjusted CLI commands and embedding manager to support increased concurrency limits. - Modified hybrid search weights for improved ranking behavior.	2025-12-25 14:13:27 +08:00
catlog22	b00113d212	feat: Enhance embedding management and model configuration - Updated embedding_manager.py to include backend parameter in model configuration. - Modified model_manager.py to utilize cache_name for ONNX models. - Refactored hybrid_search.py to improve embedder initialization based on backend type. - Added backend column to vector_store.py for better model configuration management. - Implemented migration for existing database to include backend information. - Enhanced API settings implementation with comprehensive provider and endpoint management. - Introduced LiteLLM integration guide detailing configuration and usage. - Added examples for LiteLLM usage in TypeScript.	2025-12-24 14:03:59 +08:00
catlog22	8203d690cb	fix: CodexLens model detection, hybrid search stability, and JSON logging - Fix model installation detection using fastembed ONNX cache names - Add embeddings_config table for model metadata tracking - Fix hybrid search segfault by using single-threaded GPU mode - Suppress INFO logs in JSON mode to prevent error display - Add model dropdown filtering to show only installed models 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-22 21:49:10 +08:00
catlog22	6eebdb8898	fix: 修复额外的内存泄露问题 1. hybrid_search.py: 修复 _search_vector 方法中的 SQLite 连接泄露 - 使用 with 语句包装数据库连接 - 添加异常处理确保连接正确关闭 2. symbol_extractor.py: 添加上下文管理器支持 - 实现 __enter__ 和 __exit__ 方法 - 支持 with 语句自动管理资源 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-21 16:39:38 +08:00
catlog22	e1cac5dd50	Refactor search modes and optimize embedding generation - Updated the dashboard template to hide the Code Graph Explorer feature. - Enhanced the `executeCodexLens` function to use `exec` for better cross-platform compatibility and improved command execution. - Changed the default `maxResults` and `limit` parameters in the smart search tool to 10 for better performance. - Introduced a new `priority` search mode in the smart search tool, replacing the previous `parallel` mode, which now follows a fallback strategy: hybrid -> exact -> ripgrep. - Optimized the embedding generation process in the embedding manager by batching operations and using a cached embedder instance to reduce model loading overhead. - Implemented a thread-safe singleton pattern for the embedder to improve performance across multiple searches.	2025-12-20 11:08:34 +08:00
catlog22	5e91ba6c60	Implement ANN index using HNSW algorithm and update related tests - Added ANNIndex class for approximate nearest neighbor search using HNSW. - Integrated ANN index with VectorStore for enhanced search capabilities. - Updated test suite for ANN index, including tests for adding, searching, saving, and loading vectors. - Modified existing tests to accommodate changes in search performance expectations. - Improved error handling for file operations in tests to ensure compatibility with Windows file locks. - Adjusted hybrid search performance assertions for increased stability in CI environments.	2025-12-19 10:35:29 +08:00
catlog22	df23975a0b	Add comprehensive tests for schema cleanup migration and search comparison - Implement tests for migration 005 to verify removal of deprecated fields in the database schema. - Ensure that new databases are created with a clean schema. - Validate that keywords are correctly extracted from the normalized file_keywords table. - Test symbol insertion without deprecated fields and subdir operations without direct_files. - Create a detailed search comparison test to evaluate vector search vs hybrid search performance. - Add a script for reindexing projects to extract code relationships and verify GraphAnalyzer functionality. - Include a test script to check TreeSitter parser availability and relationship extraction from sample files.	2025-12-16 19:27:05 +08:00
catlog22	3da0ef2adb	Add comprehensive tests for query parsing and Reciprocal Rank Fusion - Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation. - Added parameterized tests to validate expected token outputs for different query formats. - Created edge case tests to ensure robustness against unusual input scenarios. - Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources. - Included tests for normalization of BM25 scores and tagging search results with source metadata.	2025-12-16 10:20:19 +08:00

26 Commits