Commit Graph

2 Commits

Author SHA1 Message Date
catlog22
a0a50d338a fix: correct embedder API call in SearchPipeline and add E2E test script
SearchPipeline.search() called self._embedder.embed(), which does not
exist on BaseEmbedder or FastEmbedEmbedder; only embed_single() and
embed_batch() are defined. MockEmbedder masked this in tests. The call
now uses embed_single(), the correct API for single-query embedding.
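
One way to keep a mock from hiding this kind of mismatch is to spec it against the base interface, so that any method missing from the interface raises instead of silently succeeding. The sketch below is illustrative only: the BaseEmbedder shown is a hypothetical stand-in with assumed signatures (only the method names embed_single and embed_batch come from the commit message), and embed_query_buggy / embed_query_fixed are made-up helpers, not the project's code.

```python
from abc import ABC, abstractmethod
from typing import List
from unittest.mock import create_autospec


class BaseEmbedder(ABC):
    """Hypothetical stand-in for the project's base class (signatures assumed)."""

    @abstractmethod
    def embed_single(self, text: str) -> List[float]: ...

    @abstractmethod
    def embed_batch(self, texts: List[str]) -> List[List[float]]: ...


def embed_query_buggy(embedder, query):
    # Mirrors the bug: embed() is not part of the interface.
    return embedder.embed(query)


def embed_query_fixed(embedder, query):
    # Mirrors the fix: embed_single() is the single-query method.
    return embedder.embed_single(query)


# A mock specced against the interface rejects attributes the interface lacks.
strict_mock = create_autospec(BaseEmbedder, instance=True)
strict_mock.embed_single.return_value = [0.0, 1.0]

try:
    embed_query_buggy(strict_mock, "hello")
    buggy_call_caught = False
except AttributeError:
    # The specced mock has no embed(), so the bad call fails in tests too.
    buggy_call_caught = True

fixed_result = embed_query_fixed(strict_mock, "hello")
```

With a hand-written mock that happens to define embed(), the buggy call would pass in tests and fail only against a real embedder, which matches the failure described above.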

Also added scripts/test_small_e2e.py for quick end-to-end validation of
the indexing pipeline and all search features on a small file set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 23:09:37 +08:00
catlog22
de4158597b Implement search and reranking functionality with FTS and embedding support
- Added BaseReranker abstract class for defining reranking interfaces.
- Implemented FastEmbedReranker using fastembed's TextCrossEncoder for scoring document-query pairs.
- Introduced FTSEngine for full-text search capabilities using SQLite FTS5.
- Developed SearchPipeline to integrate embedding, binary search, ANN indexing, FTS, and reranking.
- Added fusion methods for combining results from different search strategies using Reciprocal Rank Fusion.
- Created unit and integration tests for the new search and reranking components.
- Established configuration management for search parameters and models.
2026-03-16 23:03:17 +08:00
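
The Reciprocal Rank Fusion step named in de4158597b can be sketched as follows. This is a minimal illustration of the general technique, not the repository's implementation: the function name, the k=60 constant (a commonly used default), the tie-breaking rule, and the sample file names are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one scored list.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from two search strategies.
fts_hits = ["a.py", "b.py", "c.py"]
vector_hits = ["b.py", "d.py", "a.py"]

fused = reciprocal_rank_fusion([fts_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because only ranks enter the formula, the lists can come from scorers with incomparable scales, such as FTS relevance scores and embedding similarities, without any normalization.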