mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-21 19:08:17 +08:00
feat: Enhance embedding generation and search capabilities
- Added pre-calculation of estimated chunk count for HNSW capacity in `generate_dense_embeddings_centralized` to optimize indexing performance.
- Implemented binary vector generation with memory-mapped storage for efficient cascade search, including metadata saving.
- Introduced SPLADE sparse index generation with improved handling and metadata storage.
- Updated `ChainSearchEngine` to prefer centralized binary searcher for improved performance and added fallback to legacy binary index.
- Deprecated `BinaryANNIndex` in favor of `BinarySearcher` for better memory management and performance.
- Enhanced `SpladeEncoder` with warmup functionality to reduce latency spikes during first-time inference.
- Improved `SpladeIndex` with cache size adjustments for better query performance.
- Added methods for managing binary vectors in `VectorMetadataStore`, including batch insertion and retrieval.
- Created a new `BinarySearcher` class for efficient binary vector search using Hamming distance, supporting both memory-mapped and database loading modes.
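
To illustrate the memory-mapped Hamming-distance search described in the commit message, here is a minimal sketch using only the Python standard library. It is not the `BinarySearcher` implementation, which this page does not show. The names `hamming`, `search_mmap`, and `demo`, the flat file layout of consecutive 32-byte records, and the brute-force scan are all assumptions made for illustration.

```python
import heapq
import mmap
import os
import tempfile

VECTOR_BYTES = 32  # 256-bit packed vectors (assumed flat layout, one record per row)


def hamming(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length packed vectors."""
    x = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(x).count("1")


def search_mmap(path: str, query: bytes, k: int) -> list[tuple[int, int]]:
    """Return the k nearest rows as (hamming_distance, row_index) pairs."""
    with open(path, "rb") as fh, mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n = len(mm) // VECTOR_BYTES
        # Slices of the map are read lazily, so the file is never loaded whole.
        scored = (
            (hamming(mm[i * VECTOR_BYTES:(i + 1) * VECTOR_BYTES], query), i)
            for i in range(n)
        )
        # nsmallest consumes the generator while the map is still open.
        return heapq.nsmallest(k, scored)


def demo() -> list[tuple[int, int]]:
    """Write three hypothetical vectors to a temp file and query it."""
    rows = [bytes(32), b"\xff" + bytes(31), b"\xff" * 32]
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"".join(rows))
    try:
        return search_mmap(tmp.name, b"\x0f" + bytes(31), k=2)
    finally:
        os.unlink(tmp.name)
```

A production searcher would likely vectorize the XOR and popcount steps rather than loop in Python. The sketch only shows why packed vectors plus a memory map keep the coarse retrieval stage cheap in memory.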
@@ -508,6 +508,10 @@ class ANNIndex:
 class BinaryANNIndex:
     """Binary vector ANN index using Hamming distance for fast coarse retrieval.
 
+    .. deprecated::
+        This class is deprecated. Use :class:`codexlens.search.binary_searcher.BinarySearcher`
+        instead, which provides faster memory-mapped search with centralized storage.
+
     Optimized for binary vectors (256-bit / 32 bytes per vector).
     Uses packed binary representation for memory efficiency.
 
@@ -553,6 +557,14 @@ class BinaryANNIndex:
                 "Install with: pip install codexlens[semantic]"
             )
 
+        import warnings
+        warnings.warn(
+            "BinaryANNIndex is deprecated. Use codexlens.search.binary_searcher.BinarySearcher "
+            "instead for faster memory-mapped search with centralized storage.",
+            DeprecationWarning,
+            stacklevel=2
+        )
+
         if dim <= 0 or dim % 8 != 0:
             raise ValueError(
                 f"Invalid dimension: {dim}. Must be positive and divisible by 8."
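
The hunk above emits a `DeprecationWarning` in the constructor and then validates the dimension. A minimal sketch of that pattern follows, using a hypothetical `LegacyIndex` class rather than the real `BinaryANNIndex`. The helper `build_and_capture` and the `bytes_per_vector` attribute are assumptions added for illustration.

```python
import warnings


class LegacyIndex:
    """Hypothetical stand-in for a deprecated index class."""

    def __init__(self, dim: int = 256) -> None:
        # stacklevel=2 attributes the warning to the caller's line,
        # not to this constructor, so users see where to migrate.
        warnings.warn(
            "LegacyIndex is deprecated; use the replacement class instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Packed binary vectors store 8 bits per byte, so the dimension
        # must be a positive multiple of 8.
        if dim <= 0 or dim % 8 != 0:
            raise ValueError(
                f"Invalid dimension: {dim}. Must be positive and divisible by 8."
            )
        self.dim = dim
        self.bytes_per_vector = dim // 8


def build_and_capture(dim: int) -> list[str]:
    """Construct a LegacyIndex and return the names of the warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # DeprecationWarning is often hidden by default
        LegacyIndex(dim)
    return [w.category.__name__ for w in caught]
```

Because Python's default filters hide `DeprecationWarning` in many contexts, callers may need `-W default` or an explicit filter, as in `build_and_capture`, to see the message.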