Add graph expansion and cross-encoder reranking features

- Implemented GraphExpander to enhance search results with related symbols using precomputed neighbors.
- Added CrossEncoderReranker for second-stage search ranking, allowing for improved result scoring.
- Created migrations to establish necessary database tables for relationships and graph neighbors.
- Developed tests for graph expansion functionality, ensuring related results are populated correctly.
- Enhanced performance benchmarks for cross-encoder reranking latency and graph expansion overhead.
- Updated schema cleanup tests to reflect changes in versioning and deprecated fields.
- Added new test cases for Treesitter parser to validate relationship extraction with alias resolution.
catlog22
2025-12-31 16:58:59 +08:00
parent 4bde13e83a
commit 31a45f1f30
27 changed files with 2566 additions and 97 deletions

@@ -58,6 +58,7 @@ class IndexedFile(BaseModel):
     language: str = Field(..., min_length=1)
     symbols: List[Symbol] = Field(default_factory=list)
     chunks: List[SemanticChunk] = Field(default_factory=list)
+    relationships: List["CodeRelationship"] = Field(default_factory=list)
     @field_validator("path", "language")
     @classmethod
@@ -70,7 +71,7 @@ class IndexedFile(BaseModel):
 class RelationshipType(str, Enum):
     """Types of code relationships."""
-    CALL = "call"
+    CALL = "calls"
     INHERITS = "inherits"
     IMPORTS = "imports"
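Note on the `CALL` value change: because `RelationshipType` is a `str`-valued `Enum`, lookup by value is exact, so any row or payload written with the old `"call"` string will no longer resolve through `RelationshipType("call")`. The sketch below shows one way a reader could tolerate both spellings. It mirrors only the three members visible in this hunk, and `_LEGACY_VALUES` and `parse_relationship_type` are hypothetical names that are not part of this commit; whether the commit's own migrations already rewrite stored values is not visible here.

```python
from enum import Enum


class RelationshipType(str, Enum):
    """Mirror of the enum after this change (only the members shown in the hunk)."""

    CALL = "calls"
    INHERITS = "inherits"
    IMPORTS = "imports"


# Hypothetical helper, not in the commit: maps values written before the
# rename onto the current members so previously stored strings still parse.
_LEGACY_VALUES = {"call": RelationshipType.CALL}


def parse_relationship_type(raw: str) -> RelationshipType:
    """Return the enum member for a stored string, accepting the old "call"."""
    legacy = _LEGACY_VALUES.get(raw)
    if legacy is not None:
        return legacy
    # Exact value lookup; raises ValueError for unknown strings.
    return RelationshipType(raw)
```

A shim like this only matters if old values can still reach the parser; rewriting stored rows in a migration would make it unnecessary.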