feat: Unified Embedding Pool with auto-discovery

Architecture refactoring for multi-provider rotation:

Backend:
- Add EmbeddingPoolConfig type with autoDiscover support
- Implement discoverProvidersForModel() for auto-aggregation
- Add GET/PUT /api/litellm-api/embedding-pool endpoints
- Add GET /api/litellm-api/embedding-pool/discover/:model preview
- Convert ccw-litellm status check to async with 5-min cache
- Maintain backward compatibility with legacy rotation config
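
The auto-aggregation step can be illustrated with a small sketch. It is written in Python to match the diff in this commit; the real `discoverProvidersForModel()` is not shown here, so the `Provider` record, its field names, and the `excluded` parameter are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    # Hypothetical provider record; all field names are assumptions.
    name: str
    models: list
    api_keys: list = field(default_factory=list)
    enabled: bool = True


def discover_providers_for_model(providers, target_model, excluded=()):
    """Return (provider name, key count) for each usable provider.

    A provider is usable if it is enabled, not excluded, offers the
    target model, and has at least one API key.
    """
    excluded = set(excluded)
    found = []
    for p in providers:
        if not p.enabled or p.name in excluded:
            continue
        if target_model in p.models and p.api_keys:
            found.append((p.name, len(p.api_keys)))
    return found


providers = [
    Provider("openai-main", ["text-embedding-3-small"], ["k1", "k2"]),
    Provider("openai-backup", ["text-embedding-3-small"], ["k3"]),
    Provider("other", ["some-other-model"], ["k4"]),
]
pool = discover_providers_for_model(providers, "text-embedding-3-small")
```

Passing `excluded=["openai-backup"]` would mirror the include/exclude controls described in the Frontend list.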

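An async status check with a 5-minute cache could look roughly like the sketch below. The class name, the `probe` callable, and the injectable `clock` are hypothetical; only the 5-minute lifetime comes from the commit message.

```python
import asyncio
import time

CACHE_TTL_SECONDS = 5 * 60  # 5-minute cache, per the commit message


class CachedStatus:
    """Cache the result of an async status probe for a fixed time."""

    def __init__(self, probe, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._probe = probe      # async callable doing the real (slow) check
        self._ttl = ttl
        self._clock = clock      # injectable so the expiry can be tested
        self._has_value = False
        self._value = None
        self._expires_at = 0.0

    async def get(self):
        # Serve from cache while the entry is still fresh.
        if self._has_value and self._clock() < self._expires_at:
            return self._value
        self._value = await self._probe()
        self._has_value = True
        self._expires_at = self._clock() + self._ttl
        return self._value


async def example_probe():
    # Hypothetical stand-in for the ccw-litellm status check.
    await asyncio.sleep(0)
    return {"installed": True}


status = asyncio.run(CachedStatus(example_probe).get())
```
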
Frontend:
- Add "Embedding Pool" tab in API Settings
- Auto-discover providers when target model selected
- Show provider/key count with include/exclude controls
- Increase sidebar width (280px → 320px)
- Add sync result feedback on save

Other:
- Remove the 32-worker cap for multi-endpoint litellm embedding (workers = endpoint_count * 2)
- Add i18n translations (EN/CN)
- Update .gitignore for .mcp.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
catlog22
2025-12-25 16:06:49 +08:00
parent 4e6ee2db25
commit a1413dd1b3
10 changed files with 882 additions and 43 deletions


@@ -331,7 +331,7 @@ def generate_embeddings(
     if max_workers is None:
         if embedding_backend == "litellm":
             if endpoint_count > 1:
-                max_workers = min(endpoint_count * 2, 32)  # Cap at 32 workers
+                max_workers = endpoint_count * 2  # No cap, scale with endpoints
             else:
                 max_workers = 4
         else:
@@ -806,7 +806,7 @@ def generate_embeddings_recursive(
     if max_workers is None:
         if embedding_backend == "litellm":
             if endpoint_count > 1:
-                max_workers = min(endpoint_count * 2, 32)
+                max_workers = endpoint_count * 2  # No cap, scale with endpoints
             else:
                 max_workers = 4
         else:
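
To make the effect of both hunks concrete, here is a minimal sketch that isolates the litellm branch before and after the change. The helper names are hypothetical, and the non-litellm branch is left out because the diff does not show it.

```python
def litellm_workers_before(endpoint_count):
    # Old default: two workers per endpoint, capped at 32.
    if endpoint_count > 1:
        return min(endpoint_count * 2, 32)
    return 4


def litellm_workers_after(endpoint_count):
    # New default: two workers per endpoint, no upper bound.
    if endpoint_count > 1:
        return endpoint_count * 2
    return 4


# The two defaults agree up to 16 endpoints and diverge beyond that.
for n in (1, 2, 16, 20):
    print(n, litellm_workers_before(n), litellm_workers_after(n))
```

With more than 16 endpoints the new default grows past 32, so very large pools may want an explicit `max_workers` argument to bound thread usage.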