feat: Add API indexer and enhance embedding management

- Add new API indexer script for document processing
- Update embedding manager with improved functionality
- Remove old cache files and update dependencies
- Modify workflow execute documentation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
catlog22
2025-09-23 19:40:22 +08:00
parent 984fa3a4f3
commit 410d0efd7b
8 changed files with 506 additions and 337 deletions
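
Note on the embedding settings changed in the diff below: a minimal sketch of how a consumer might apply them, not the repository's actual embedding manager. `EmbeddingConfig`, `load_model`, `batches`, and `keep_similar` are hypothetical names. It is also an assumption that the model is loaded through the sentence-transformers library, that `cache_dir` maps to its `cache_folder` argument, and that `similarity_threshold` is an inclusive cosine-similarity cutoff.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical container mirroring the new `embedding:` values in the diff."""
    enabled: bool = True
    model: str = "codesage/codesage-large-v2"
    cache_dir: str = "cache"
    similarity_threshold: float = 0.6
    max_context_length: int = 2048
    batch_size: int = 8
    trust_remote_code: bool = True


def load_model(cfg: EmbeddingConfig):
    """Load the model lazily; assumes sentence-transformers is the backend."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(
        cfg.model,
        cache_folder=cfg.cache_dir,  # assumed mapping of cache_dir
        trust_remote_code=cfg.trust_remote_code,  # allows the model repo's custom code to run
    )
    model.max_seq_length = cfg.max_context_length  # longer inputs are truncated
    return model


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keep_similar(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    threshold: float,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs scoring at or above the threshold, best first."""
    scored = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With these helpers, encoding would look roughly like `model.encode(list(chunk), batch_size=cfg.batch_size)` for each `chunk` from `batches(texts, cfg.batch_size)`. Because `trust_remote_code: true` executes Python code shipped with the model repository, it may be worth pinning a model revision before enabling it.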

@@ -66,11 +66,12 @@ file_extensions:
 # Embedding/RAG configuration
 embedding:
   enabled: true # Set to true to enable RAG features
-  model: "all-MiniLM-L6-v2" # Lightweight sentence transformer
+  model: "codesage/codesage-large-v2" # CodeSage V2 for code embeddings
   cache_dir: "cache"
-  similarity_threshold: 0.3
-  max_context_length: 512
-  batch_size: 32
+  similarity_threshold: 0.6 # Higher threshold for better code similarity
+  max_context_length: 2048 # Increased for CodeSage V2 capabilities
+  batch_size: 8 # Reduced for larger model
+  trust_remote_code: true # Required for CodeSage V2
 # Context analysis settings
 context_analysis: