Implement search and reranking functionality with FTS and embedding support

- Added BaseReranker abstract class for defining reranking interfaces.
- Implemented FastEmbedReranker using fastembed's TextCrossEncoder for scoring document-query pairs.
- Introduced FTSEngine for full-text search capabilities using SQLite FTS5.
- Developed SearchPipeline to integrate embedding, binary search, ANN indexing, FTS, and reranking.
- Added fusion methods for combining results from different search strategies using Reciprocal Rank Fusion.
- Created unit and integration tests for the new search and reranking components.
- Established configuration management for search parameters and models.
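
For readers unfamiliar with the fusion step named above, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list in which it appears. The following is a minimal sketch of that idea, not the code added in this commit: the function name `reciprocal_rank_fusion`, the default `k = 60` (a commonly used constant), and the sample file names are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]],
    k: int = 60,  # assumed smoothing constant; the commit's value is not shown
) -> List[Tuple[Hashable, float]]:
    """Fuse several ranked ID lists into one list sorted by RRF score.

    Each input list is assumed to be ordered best-first and to contain
    each ID at most once.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (stable sort).
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a full-text search and a vector search.
fts_hits = ["a.py", "b.py", "c.py"]
vector_hits = ["b.py", "d.py", "a.py"]
fused = reciprocal_rank_fusion([fts_hits, vector_hits])
```

Because only ranks are used, the raw scores of the individual strategies (for example BM25 values and vector distances) never need to be normalized against each other, which is the usual reason for choosing this method.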
catlog22
2026-03-16 23:03:17 +08:00
parent 5a4b18d9b1
commit de4158597b
41 changed files with 2655 additions and 1848 deletions

.gitignore (vendored): 18 lines changed

@@ -143,3 +143,21 @@ ccw/.tmp-ccw-auth-home/
 docs/node_modules/
 docs/.vitepress/dist/
 docs/.vitepress/cache/
+codex-lens/.cache/huggingface/hub/models--Xenova--ms-marco-MiniLM-L-6-v2/refs/main
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/.gitattributes
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/config.json
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/quantize_config.json
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/README.md
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/special_tokens_map.json
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/tokenizer_config.json
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/tokenizer.json
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/vocab.txt
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_bnb4.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_fp16.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_int8.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_q4.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_q4f16.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_quantized.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model_uint8.onnx
+codex-lens/.cache/huggingface/models/Xenova--ms-marco-MiniLM-L-6-v2/onnx/model.onnx
+codex-lens/data/registry.db