Implement search and reranking functionality with FTS and embedding support

- Added BaseReranker abstract class for defining reranking interfaces.
- Implemented FastEmbedReranker using fastembed's TextCrossEncoder for scoring document-query pairs.
- Introduced FTSEngine for full-text search capabilities using SQLite FTS5.
- Developed SearchPipeline to integrate embedding, binary search, ANN indexing, FTS, and reranking.
- Added fusion methods for combining results from different search strategies using Reciprocal Rank Fusion.
- Created unit and integration tests for the new search and reranking components.
- Established configuration management for search parameters and models.
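
For reference, the Reciprocal Rank Fusion step named above can be sketched as follows. This is a minimal illustration and not the code from this commit: the function name `reciprocal_rank_fusion`, the `rankings` and `weights` parameters, and the constant `k = 60` (a common default for RRF) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document ids into one scored list.

    rankings: dict mapping a strategy name (e.g. "fts", "vector") to a list
              of document ids ordered best-first. Names are hypothetical.
    weights:  optional dict mapping strategy name to a multiplier (default 1.0).
    k:        smoothing constant; 60 is a common choice, assumed here.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        w = weights.get(name, 1.0)
        # Ranks are 1-based: the top hit contributes w / (k + 1).
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += w / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))


fused = reciprocal_rank_fusion(
    {"fts": ["a", "b", "c"], "vector": ["b", "a", "d"]},
    weights={"vector": 2.0},
)
```

With the vector list weighted twice as heavily, document `b` (first in the vector ranking) ends up ahead of `a` (first in the FTS ranking).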
This commit is contained in:
catlog22
2026-03-16 23:03:17 +08:00
parent 5a4b18d9b1
commit de4158597b
41 changed files with 2655 additions and 1848 deletions

@@ -0,0 +1,31 @@
from codexlens.config import Config


def test_config_instantiates_no_args():
    cfg = Config()
    assert cfg is not None


def test_defaults_hnsw_ef():
    cfg = Config.defaults()
    assert cfg.hnsw_ef == 150


def test_defaults_hnsw_M():
    cfg = Config.defaults()
    assert cfg.hnsw_M == 32


def test_small_hnsw_ef():
    cfg = Config.small()
    assert cfg.hnsw_ef == 50


def test_custom_instantiation():
    cfg = Config(hnsw_ef=100)
    assert cfg.hnsw_ef == 100


def test_fusion_weights_keys():
    cfg = Config()
    assert set(cfg.fusion_weights.keys()) == {"exact", "fuzzy", "vector", "graph"}
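
The tests above pin down only a small part of the `Config` interface. A minimal sketch of a class that would satisfy them might look like the one below. It is deliberately named `SketchConfig` because it is not the actual `codexlens.config.Config`: the use of a dataclass is an assumption, the fusion weight values are placeholders (the tests check only the keys), and any fields beyond those the tests touch are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict


def _placeholder_fusion_weights() -> Dict[str, float]:
    # Only the four keys come from the tests; the values are placeholders.
    return {"exact": 0.25, "fuzzy": 0.25, "vector": 0.25, "graph": 0.25}


@dataclass
class SketchConfig:
    """Hypothetical stand-in for codexlens.config.Config."""

    hnsw_ef: int = 150  # default asserted by test_defaults_hnsw_ef
    hnsw_M: int = 32    # default asserted by test_defaults_hnsw_M
    # default_factory gives each instance its own dict instead of a shared one.
    fusion_weights: Dict[str, float] = field(
        default_factory=_placeholder_fusion_weights
    )

    @classmethod
    def defaults(cls) -> "SketchConfig":
        # Same values as calling the constructor with no arguments.
        return cls()

    @classmethod
    def small(cls) -> "SketchConfig":
        # The tests specify only hnsw_ef for the small preset.
        return cls(hnsw_ef=50)
```

Using `default_factory` for `fusion_weights` avoids the shared-mutable-default problem, so changing the weights on one instance does not leak into another.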