Claude-Code-Workflow

mirror of https://github.com/catlog22/Claude-Code-Workflow.git synced 2026-03-21 19:08:17 +08:00

Author	SHA1	Message	Date
catlog22	5a4b18d9b1	feat: enhance search, ranking, reranker and CLI tooling across ccw and codex-lens Major improvements to smart-search, chain-search cascade, ranking pipeline, reranker factory, CLI history store, codex-lens integration, and uv-manager. Simplify command-generator skill by inlining phases. Add comprehensive tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 20:35:08 +08:00
catlog22	54fd94547c	feat: Enhance embedding generation and search capabilities - Added pre-calculation of estimated chunk count for HNSW capacity in `generate_dense_embeddings_centralized` to optimize indexing performance. - Implemented binary vector generation with memory-mapped storage for efficient cascade search, including metadata saving. - Introduced SPLADE sparse index generation with improved handling and metadata storage. - Updated `ChainSearchEngine` to prefer centralized binary searcher for improved performance and added fallback to legacy binary index. - Deprecated `BinaryANNIndex` in favor of `BinarySearcher` for better memory management and performance. - Enhanced `SpladeEncoder` with warmup functionality to reduce latency spikes during first-time inference. - Improved `SpladeIndex` with cache size adjustments for better query performance. - Added methods for managing binary vectors in `VectorMetadataStore`, including batch insertion and retrieval. - Created a new `BinarySearcher` class for efficient binary vector search using Hamming distance, supporting both memory-mapped and database loading modes.	2026-01-02 23:57:55 +08:00
catlog22	9157c5c78b	feat: Implement centralized storage for SPLADE and vector embeddings - Added centralized SPLADE database and vector storage configuration in config.py. - Updated embedding_manager.py to support centralized SPLADE database path. - Enhanced generate_embeddings and generate_embeddings_recursive functions for centralized storage. - Introduced centralized ANN index creation in ann_index.py. - Modified hybrid_search.py to utilize centralized vector index for searches. - Implemented methods to discover and manage centralized SPLADE and HNSW files.	2026-01-02 16:53:39 +08:00
catlog22	9129c981a4	feat: Enhance BinaryANNIndex with vectorized search and performance benchmarking	2026-01-02 11:49:54 +08:00
catlog22	e21d801523	feat: Add multi-type embedding backends for cascade retrieval - Implemented BinaryEmbeddingBackend for fast coarse filtering using 256-dimensional binary vectors. - Developed DenseEmbeddingBackend for high-precision dense vectors (2048 dimensions) for reranking. - Created CascadeEmbeddingBackend to combine binary and dense embeddings for two-stage retrieval. - Introduced utility functions for embedding conversion and distance computation. chore: Migration 010 - Add multi-vector storage support - Added 'chunks' table to support multi-vector embeddings for cascade retrieval. - Included new columns: embedding_binary (256-dim) and embedding_dense (2048-dim) for efficient storage. - Implemented upgrade and downgrade functions to manage schema changes and data migration.	2026-01-02 10:52:43 +08:00
catlog22	5849f751bc	fix: 修复嵌入生成内存泄漏，优化性能 - HNSW 索引：预分配从 100 万降至 5 万，添加动态扩容和可控保存 - Embedder：添加 embed_to_numpy() 避免 .tolist() 转换，增强缓存清理 - embedding_manager：每 10 批次重建 embedder 实例，显式 gc.collect() - VectorStore：添加 bulk_insert() 上下文管理器，支持 numpy 批量写入 - Chunker：添加 skip_token_count 轻量模式，使用 char/4 估算（~9x 加速） 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-21 19:15:47 +08:00
catlog22	5e91ba6c60	Implement ANN index using HNSW algorithm and update related tests - Added ANNIndex class for approximate nearest neighbor search using HNSW. - Integrated ANN index with VectorStore for enhanced search capabilities. - Updated test suite for ANN index, including tests for adding, searching, saving, and loading vectors. - Modified existing tests to accommodate changes in search performance expectations. - Improved error handling for file operations in tests to ensure compatibility with Windows file locks. - Adjusted hybrid search performance assertions for increased stability in CI environments.	2025-12-19 10:35:29 +08:00

7 Commits