mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-02-04 01:40:45 +08:00
feat: Enhance embedding management and model configuration
- Updated embedding_manager.py to include backend parameter in model configuration.
- Modified model_manager.py to utilize cache_name for ONNX models.
- Refactored hybrid_search.py to improve embedder initialization based on backend type.
- Added backend column to vector_store.py for better model configuration management.
- Implemented migration for existing database to include backend information.
- Enhanced API settings implementation with comprehensive provider and endpoint management.
- Introduced LiteLLM integration guide detailing configuration and usage.
- Added examples for LiteLLM usage in TypeScript.
@@ -49,17 +49,6 @@ RULES: [templates | additional constraints]
- Break backward compatibility
- Exceed 3 failed attempts without stopping

## Multi-Task Execution (Resume)

**First subtask**: Standard execution flow
**Subsequent subtasks** (via `resume`):
- Recall context from previous subtasks
- Build on previous work
- Maintain consistency
- Test integration
- Report context for next subtask

## Error Handling

**Three-Attempt Rule**: On 3rd failure, stop and report what attempted, what failed, root cause
@@ -80,7 +69,7 @@ RULES: [templates | additional constraints]

**If template has no format** → Use default format below

### Single Task Implementation
### Task Implementation

```markdown
# Implementation: [TASK Title]
@@ -112,48 +101,6 @@ RULES: [templates | additional constraints]

[Recommendations if any]
```

### Multi-Task (First Subtask)

```markdown
# Subtask 1/N: [TASK Title]

## Changes
[List of file changes]

## Implementation
[Details with code references]

## Testing
✅ Tests: X passing

## Context for Next Subtask
- Key decisions: [established patterns]
- Files created: [paths and purposes]
- Integration points: [where next subtask should connect]
```

### Multi-Task (Subsequent Subtasks)

```markdown
# Subtask N/M: [TASK Title]

## Changes
[List of file changes]

## Integration Notes
✅ Compatible with previous subtask
✅ Maintains established patterns

## Implementation
[Details with code references]

## Testing
✅ Tests: X passing

## Context for Next Subtask
[If not final, provide context]
```

### Partial Completion

```markdown
@@ -362,10 +362,6 @@ ccw cli -p "RULES: \$(cat ~/.claude/workflows/cli-templates/protocols/analysis-p
- Description: Additional directories (comma-separated)
- Default: none

- **`--timeout <ms>`**
  - Description: Timeout in milliseconds
  - Default: 300000

- **`--resume [id]`**
  - Description: Resume previous session
  - Default: -
@@ -423,73 +419,80 @@ CCW automatically maps to tool-specific syntax:

**Analysis Task** (Security Audit):
```bash
ccw cli -p "
timeout 600 ccw cli -p "
PURPOSE: Identify OWASP Top 10 vulnerabilities in authentication module to pass security audit; success = all critical/high issues documented with remediation
TASK: • Scan for injection flaws (SQL, command, LDAP) • Check authentication bypass vectors • Evaluate session management • Assess sensitive data exposure
MODE: analysis
CONTEXT: @src/auth/**/* @src/middleware/auth.ts | Memory: Using bcrypt for passwords, JWT for sessions
EXPECTED: Security report with: severity matrix, file:line references, CVE mappings where applicable, remediation code snippets prioritized by risk
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) $(cat ~/.claude/workflows/cli-templates/prompts/analysis/03-assess-security-risks.txt) | Focus on authentication | Ignore test files
" --tool gemini --cd src/auth --timeout 600000
" --tool gemini --mode analysis --cd src/auth
```

**Implementation Task** (New Feature):
```bash
ccw cli -p "
timeout 1800 ccw cli -p "
PURPOSE: Implement rate limiting for API endpoints to prevent abuse; must be configurable per-endpoint; backward compatible with existing clients
TASK: • Create rate limiter middleware with sliding window • Implement per-route configuration • Add Redis backend for distributed state • Include bypass for internal services
MODE: write
CONTEXT: @src/middleware/**/* @src/config/**/* | Memory: Using Express.js, Redis already configured, existing middleware pattern in auth.ts
EXPECTED: Production-ready code with: TypeScript types, unit tests, integration test, configuration example, migration guide
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/write-protocol.md) $(cat ~/.claude/workflows/cli-templates/prompts/development/02-implement-feature.txt) | Follow existing middleware patterns | No breaking changes
" --tool codex --mode write --timeout 1800000
" --tool codex --mode write
```

**Bug Fix Task**:
```bash
ccw cli -p "
timeout 900 ccw cli -p "
PURPOSE: Fix memory leak in WebSocket connection handler causing server OOM after 24h; root cause must be identified before any fix
TASK: • Trace connection lifecycle from open to close • Identify event listener accumulation • Check cleanup on disconnect • Verify garbage collection eligibility
MODE: analysis
CONTEXT: @src/websocket/**/* @src/services/connection-manager.ts | Memory: Using ws library, ~5000 concurrent connections in production
EXPECTED: Root cause analysis with: memory profile, leak source (file:line), fix recommendation with code, verification steps
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/analysis-protocol.md) $(cat ~/.claude/workflows/cli-templates/prompts/analysis/01-diagnose-bug-root-cause.txt) | Focus on resource cleanup
" --tool gemini --cd src --timeout 900000
" --tool gemini --mode analysis --cd src
```

**Refactoring Task**:
```bash
ccw cli -p "
timeout 1200 ccw cli -p "
PURPOSE: Refactor payment processing to use strategy pattern for multi-gateway support; no functional changes; all existing tests must pass
TASK: • Extract gateway interface from current implementation • Create strategy classes for Stripe, PayPal • Implement factory for gateway selection • Migrate existing code to use strategies
MODE: write
CONTEXT: @src/payments/**/* @src/types/payment.ts | Memory: Currently only Stripe, adding PayPal next sprint, must support future gateways
EXPECTED: Refactored code with: strategy interface, concrete implementations, factory class, updated tests, migration checklist
RULES: $(cat ~/.claude/workflows/cli-templates/protocols/write-protocol.md) $(cat ~/.claude/workflows/cli-templates/prompts/development/02-refactor-codebase.txt) | Preserve all existing behavior | Tests must pass
" --tool gemini --mode write --timeout 1200000
" --tool gemini --mode write
```
---

## Configuration

### Timeout Allocation
### Timeout Allocation (Bash)

**Minimum**: 5 minutes (300000ms)
CLI internal timeout is disabled; controlled by external bash `timeout` command:

- **Simple**: 5-10min (300000-600000ms)
  - Examples: Analysis, search
```bash
# Syntax: timeout <seconds> ccw cli ...
timeout 600 ccw cli -p "..." --tool gemini --mode analysis   # 10 minutes
timeout 1800 ccw cli -p "..." --tool codex --mode write      # 30 minutes
```

- **Medium**: 10-20min (600000-1200000ms)
  - Examples: Refactoring, documentation
**Recommended Time Allocation**:

- **Complex**: 20-60min (1200000-3600000ms)
  - Examples: Implementation, migration
- **Simple** (5-10min): Analysis, search
  - `timeout 300` ~ `timeout 600`

- **Heavy**: 60-120min (3600000-7200000ms)
  - Examples: Large codebase, multi-file
- **Medium** (10-20min): Refactoring, documentation
  - `timeout 600` ~ `timeout 1200`

**Codex Multiplier**: 3x allocated time (minimum 15min / 900000ms)
- **Complex** (20-60min): Implementation, migration
  - `timeout 1200` ~ `timeout 3600`

- **Heavy** (60-120min): Large codebase, multi-file
  - `timeout 3600` ~ `timeout 7200`

**Codex Multiplier**: 3x allocated time (minimum 15min / 900s)
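For example, applying the multiplier rule to a medium task using the bash syntax shown above (the prompt itself is elided here): a task normally allocated `timeout 1200` gets 1200s × 3 = 3600s under codex.

```bash
# Medium task (normally timeout 1200) run with codex: 1200s x 3 = 3600s
timeout 3600 ccw cli -p "..." --tool codex --mode write
```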

### Permission Framework

@@ -523,4 +526,3 @@ RULES: $(cat ~/.claude/workflows/cli-templates/protocols/write-protocol.md) $(ca
- [ ] **Tool selected** - `--tool gemini|qwen|codex`
- [ ] **Template applied (REQUIRED)** - Use specific or universal fallback template
- [ ] **Constraints specified** - Scope, requirements
- [ ] **Timeout configured** - Based on complexity
@@ -21,8 +21,11 @@
- Graceful degradation
- Don't expose sensitive info

## Core Principles

**Incremental Progress**:
- Small, testable changes
- Commit working code frequently

@@ -43,11 +46,58 @@
- Maintain established patterns
- Test integration between subtasks

## System Optimization

**Direct Binary Calls**: Always call binaries directly in `functions.shell`, set `workdir`, avoid shell wrappers (`bash -lc`, `cmd /c`, etc.)

**Text Editing Priority**:
1. Use `apply_patch` tool for all routine text edits
2. Fall back to `sed` for single-line substitutions if unavailable
3. Avoid Python editing scripts unless both fail

**apply_patch invocation**:
```json
{
  "command": ["apply_patch", "*** Begin Patch\n*** Update File: path/to/file\n@@\n- old\n+ new\n*** End Patch\n"],
  "workdir": "<workdir>",
  "justification": "Brief reason"
}
```

**Windows UTF-8 Encoding** (before commands):
```powershell
[Console]::InputEncoding = [Text.UTF8Encoding]::new($false)
[Console]::OutputEncoding = [Text.UTF8Encoding]::new($false)
chcp 65001 > $null
```

## Context Acquisition (MCP Tools Priority)

**For task context gathering and analysis, ALWAYS prefer MCP tools**:

1. **smart_search** - First choice for code discovery
   - Use `smart_search(query="...")` for semantic/keyword search
   - Use `smart_search(action="find_files", pattern="*.ts")` for file discovery
   - Supports modes: `auto`, `hybrid`, `exact`, `ripgrep`

2. **read_file** - Batch file reading
   - Read multiple files in parallel: `read_file(path="file1.ts")`, `read_file(path="file2.ts")`
   - Supports glob patterns: `read_file(path="src/**/*.config.ts")`

**Priority Order**:
```
smart_search (discovery) → read_file (batch read) → shell commands (fallback)
```

**NEVER** use shell commands (`cat`, `find`, `grep`) when MCP tools are available.

## Execution Checklist

**Before**:
- [ ] Understand PURPOSE and TASK clearly
- [ ] Review CONTEXT files, find 3+ patterns
- [ ] Use smart_search to discover relevant files
- [ ] Use read_file to batch read context files, find 3+ patterns
- [ ] Check RULES templates and constraints

**During**:
@@ -1,25 +1,62 @@
# Gemini Code Guidelines

## Code Quality Standards

### Code Quality
- Follow project's existing patterns
- Match import style and naming conventions
- Single responsibility per function/class
- DRY (Don't Repeat Yourself)
- YAGNI (You Aren't Gonna Need It)

### Testing
- Test all public functions
- Test edge cases and error conditions
- Mock external dependencies
- Target 80%+ coverage

### Error Handling
- Proper try-catch blocks
- Clear error messages
- Graceful degradation
- Don't expose sensitive info

## Core Principles

**Thoroughness**:
- Analyze ALL CONTEXT files completely
- Check cross-file patterns and dependencies
- Identify edge cases and quantify metrics
**Incremental Progress**:
- Small, testable changes
- Commit working code frequently
- Build on previous work (subtasks)

**Evidence-Based**:
- Quote relevant code with `file:line` references
- Link related patterns across files
- Support all claims with concrete examples
- Study 3+ similar patterns before implementing
- Match project style exactly
- Verify with existing code

**Actionable**:
- Clear, specific recommendations (not vague)
- Prioritized by impact
- Incremental changes over big rewrites
**Pragmatic**:
- Boring solutions over clever code
- Simple over complex
- Adapt to project reality

**Philosophy**:
- **Simple over complex** - Avoid over-engineering
- **Clear over clever** - Prefer obvious solutions
- **Learn from existing** - Reference project patterns
- **Pragmatic over dogmatic** - Adapt to project reality
- **Incremental progress** - Small, testable changes
**Context Continuity** (Multi-Task):
- Leverage resume for consistency
- Maintain established patterns
- Test integration between subtasks

## Execution Checklist

**Before**:
- [ ] Understand PURPOSE and TASK clearly
- [ ] Review CONTEXT files, find 3+ patterns
- [ ] Check RULES templates and constraints

**During**:
- [ ] Follow existing patterns exactly
- [ ] Write tests alongside code
- [ ] Run tests after every change
- [ ] Commit working code incrementally

**After**:
- [ ] All tests pass
- [ ] Coverage meets target
- [ ] Build succeeds
- [ ] All EXPECTED deliverables met
API_SETTINGS_IMPLEMENTATION.md (new file, 196 lines)
@@ -0,0 +1,196 @@
# API Settings Page Implementation Complete

## Files Created

### 1. JavaScript File
**Location**: `ccw/src/templates/dashboard-js/views/api-settings.js` (28KB)

**Main features**:
- ✅ Provider Management
  - Add/edit/delete providers
  - Supports OpenAI, Anthropic, Google, Ollama, Azure, Mistral, DeepSeek, Custom
  - API key management (environment variables supported)
  - Connection testing

- ✅ Endpoint Management
  - Create custom endpoints
  - Associate providers and models
  - Cache strategy configuration
  - Display CLI usage examples

- ✅ Cache Management
  - Global cache toggle
  - Cache statistics display
  - Clear-cache action

### 2. CSS Stylesheet
**Location**: `ccw/src/templates/dashboard-css/31-api-settings.css` (6.8KB)

**Styles include**:
- Card-based layout
- Form styles
- Progress bars
- Responsive design
- Empty-state display

### 3. Internationalization
**Location**: `ccw/src/templates/dashboard-js/i18n.js`

**Translations added**:
- English: 54 translation keys
- Chinese: 54 translation keys
- Covers all UI text, hints, and error messages

### 4. Configuration Updates

#### dashboard-generator.ts
- ✅ Added `31-api-settings.css` to the CSS module list
- ✅ Added `views/api-settings.js` to the JS module list

#### navigation.js
- ✅ Added `api-settings` route handling
- ✅ Added title-update logic

#### dashboard.html
- ✅ Added navigation menu item (Settings icon)

## API Endpoints Used

The page uses the following existing backend APIs:

### Provider APIs
- `GET /api/litellm-api/providers` - List all providers
- `POST /api/litellm-api/providers` - Create a provider
- `PUT /api/litellm-api/providers/:id` - Update a provider
- `DELETE /api/litellm-api/providers/:id` - Delete a provider
- `POST /api/litellm-api/providers/:id/test` - Test connection

### Endpoint APIs
- `GET /api/litellm-api/endpoints` - List all endpoints
- `POST /api/litellm-api/endpoints` - Create an endpoint
- `PUT /api/litellm-api/endpoints/:id` - Update an endpoint
- `DELETE /api/litellm-api/endpoints/:id` - Delete an endpoint

### Model Discovery
- `GET /api/litellm-api/models/:providerType` - List models supported by a provider

### Cache APIs
- `GET /api/litellm-api/cache/stats` - Get cache statistics
- `POST /api/litellm-api/cache/clear` - Clear the cache

### Config APIs
- `GET /api/litellm-api/config` - Get full configuration
- `PUT /api/litellm-api/config/cache` - Update global cache settings

## Page Features

### Provider Management
```
+-- Provider Card ------------------------+
| OpenAI Production          [Edit] [Del] |
| Type: openai                            |
| Key: sk-...abc                          |
| URL: https://api.openai.com/v1          |
| Status: ✓ Enabled                       |
+-----------------------------------------+
```

### Endpoint Management
```
+-- Endpoint Card ------------------------+
| GPT-4o Code Review         [Edit] [Del] |
| ID: my-gpt4o                            |
| Provider: OpenAI Production             |
| Model: gpt-4-turbo                      |
| Cache: Enabled (60 min)                 |
| Usage: ccw cli -p "..." --model my-gpt4o|
+-----------------------------------------+
```

### Form Features
- **Provider Form**:
  - Type selection (8 provider types)
  - API key input (show/hide toggle)
  - Environment variable support
  - Custom base URL
  - Enable/disable switch

- **Endpoint Form**:
  - Endpoint ID (used by the CLI)
  - Display name
  - Provider selection (loaded dynamically)
  - Model selection (loaded per provider)
  - Cache strategy configuration
    - TTL (minutes)
    - Max size (KB)
    - Auto-cache file patterns

## Usage Flow

### 1. Add a Provider
1. Click "Add Provider"
2. Choose a provider type (e.g. OpenAI)
3. Enter a display name
4. Enter an API key (or use an environment variable)
5. Optional: enter a custom API base URL
6. Save

### 2. Create a Custom Endpoint
1. Click "Add Endpoint"
2. Enter an endpoint ID (used by the CLI)
3. Enter a display name
4. Select a provider
5. Select a model (the provider's supported models load automatically)
6. Optional: configure the cache strategy
7. Save

### 3. Use the Endpoint
```bash
ccw cli -p "Analyze this code..." --model my-gpt4o
```

## Code Quality

- ✅ Follows existing code style
- ✅ Uses i18n functions for internationalization
- ✅ Responsive design (mobile-friendly)
- ✅ Complete form validation
- ✅ User-friendly error messages
- ✅ Uses Lucide icons
- ✅ Modals reuse existing styles
- ✅ Fully integrated with the backend API

## Testing Suggestions

1. **Basic functionality**:
   - Add/edit/delete providers
   - Add/edit/delete endpoints
   - Clear cache

2. **Form validation**:
   - Required-field validation
   - API key show/hide
   - Environment variable toggle

3. **Data loading**:
   - Dynamic model list loading
   - Cache statistics display
   - Empty-state display

4. **Internationalization**:
   - Switch languages (English/Chinese)
   - Verify all text renders correctly

## Next Steps

The page is complete and integrated into the project. After starting the CCW Dashboard:
1. The navigation bar shows an "API Settings" item (Settings icon)
2. Click it to access all features
3. All operations sync to the configuration file in real time

## Notes

- The page uses the existing LiteLLM API routes (`litellm-api-routes.ts`)
- Configuration is stored in the project's LiteLLM config file
- Environment variable references use the `${VARIABLE_NAME}` format
- API keys are masked for display (first 4 and last 4 characters shown)
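A minimal sketch of that masking rule (hypothetical helper for illustration; the actual logic lives in `api-settings.js` and may differ):

```typescript
// Sketch of the display rule described above: show the first 4 and
// last 4 characters, mask everything in between.
function maskApiKey(key: string): string {
  if (key.length <= 8) return '*'.repeat(key.length); // too short to reveal safely
  return `${key.slice(0, 4)}...${key.slice(-4)}`;
}

// maskApiKey('sk-abcdef1234567890') => 'sk-a...7890'
```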
ccw/LITELLM_INTEGRATION.md (new file, 308 lines)
@@ -0,0 +1,308 @@
# LiteLLM Integration Guide

## Overview

CCW now supports custom LiteLLM endpoints with integrated context caching. You can configure multiple providers (OpenAI, Anthropic, Ollama, etc.) and create custom endpoints with file-based caching strategies.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        CLI Executor                         │
│                                                             │
│  ┌─────────────┐         ┌──────────────────────────────┐   │
│  │ --model     │────────>│ Route Decision:              │   │
│  │ flag        │         │ - gemini/qwen/codex → CLI    │   │
│  └─────────────┘         │ - custom ID → LiteLLM        │   │
│                          └──────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      LiteLLM Executor                       │
│                                                             │
│  1. Load endpoint config (litellm-api-config.json)          │
│  2. Extract @patterns from prompt                           │
│  3. Pack files via context-cache                            │
│  4. Call LiteLLM client with cached content + prompt        │
│  5. Return result                                           │
└─────────────────────────────────────────────────────────────┘
```

## Configuration

### File Location

Configuration is stored per-project:
```
<project>/.ccw/storage/config/litellm-api-config.json
```

### Configuration Structure

```json
{
  "version": 1,
  "providers": [
    {
      "id": "openai-1234567890",
      "name": "My OpenAI",
      "type": "openai",
      "apiKey": "${OPENAI_API_KEY}",
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "endpoints": [
    {
      "id": "my-gpt4o",
      "name": "GPT-4o with Context Cache",
      "providerId": "openai-1234567890",
      "model": "gpt-4o",
      "description": "GPT-4o with automatic file caching",
      "cacheStrategy": {
        "enabled": true,
        "ttlMinutes": 60,
        "maxSizeKB": 512,
        "filePatterns": ["*.md", "*.ts", "*.js"]
      },
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "defaultEndpoint": "my-gpt4o",
  "globalCacheSettings": {
    "enabled": true,
    "cacheDir": "~/.ccw/cache/context",
    "maxTotalSizeMB": 100
  }
}
```

## Usage

### Via CLI

```bash
# Use custom endpoint with --model flag
ccw cli -p "Analyze authentication flow" --tool litellm --model my-gpt4o

# With context patterns (automatically cached)
ccw cli -p "@src/auth/**/*.ts Review security" --tool litellm --model my-gpt4o

# Disable caching for specific call
ccw cli -p "Quick question" --tool litellm --model my-gpt4o --no-cache
```

### Via Dashboard API

#### Create Provider
```bash
curl -X POST http://localhost:3000/api/litellm-api/providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My OpenAI",
    "type": "openai",
    "apiKey": "${OPENAI_API_KEY}",
    "enabled": true
  }'
```

#### Create Endpoint
```bash
curl -X POST http://localhost:3000/api/litellm-api/endpoints \
  -H "Content-Type: application/json" \
  -d '{
    "id": "my-gpt4o",
    "name": "GPT-4o with Cache",
    "providerId": "openai-1234567890",
    "model": "gpt-4o",
    "cacheStrategy": {
      "enabled": true,
      "ttlMinutes": 60,
      "maxSizeKB": 512,
      "filePatterns": ["*.md", "*.ts"]
    },
    "enabled": true
  }'
```

#### Test Provider Connection
```bash
curl -X POST http://localhost:3000/api/litellm-api/providers/openai-1234567890/test
```

## Context Caching

### How It Works

1. **Pattern Detection**: LiteLLM executor scans prompt for `@patterns`
   ```
   @src/**/*.ts
   @CLAUDE.md
   @../shared/**/*
   ```

2. **File Packing**: Files matching patterns are packed via `context-cache` tool
   - Respects `max_file_size` limit (default: 1MB per file)
   - Applies TTL from endpoint config
   - Generates session ID for retrieval

3. **Cache Integration**: Cached content is prepended to prompt
   ```
   <cached files>
   ---
   <original prompt>
   ```

4. **LLM Call**: Combined prompt sent to LiteLLM with provider credentials
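The API reference below declares `extractPatterns(prompt: string): string[]`, so steps 1 and 3 could look like the following hedged sketch (not the actual executor source; the real pattern grammar may differ):

```typescript
// Sketch of pattern detection: an @pattern is treated here as an
// @-prefixed, whitespace-delimited path or glob.
function extractPatterns(prompt: string): string[] {
  const matches = prompt.match(/@[^\s"']+/g) ?? [];
  return matches.map((m) => m.slice(1)); // strip the leading '@'
}

// Sketch of cache integration: cached content prepended, separated by '---'.
function buildPrompt(cachedFiles: string, originalPrompt: string): string {
  return `${cachedFiles}\n---\n${originalPrompt}`;
}
```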

### Cache Strategy Configuration

```typescript
interface CacheStrategy {
  enabled: boolean;       // Enable/disable caching for this endpoint
  ttlMinutes: number;     // Cache lifetime (default: 60)
  maxSizeKB: number;      // Max cache size (default: 512KB)
  filePatterns: string[]; // Glob patterns to cache
}
```

### Example: Security Audit with Cache

```bash
ccw cli -p "
PURPOSE: OWASP Top 10 security audit of authentication module
TASK: • Check SQL injection • Verify session management • Test XSS vectors
CONTEXT: @src/auth/**/*.ts @src/middleware/auth.ts
EXPECTED: Security report with severity levels and remediation steps
" --tool litellm --model my-security-scanner --mode analysis
```

**What happens:**
1. Executor detects `@src/auth/**/*.ts` and `@src/middleware/auth.ts`
2. Packs matching files into context cache
3. Cache entry valid for 60 minutes (per endpoint config)
4. Subsequent calls reuse cached files (no re-packing)
5. LiteLLM receives full context without manual file specification

## Environment Variables

### Provider API Keys

LiteLLM uses standard environment variable names:

| Provider  | Env Var Name        |
|-----------|---------------------|
| OpenAI    | `OPENAI_API_KEY`    |
| Anthropic | `ANTHROPIC_API_KEY` |
| Google    | `GOOGLE_API_KEY`    |
| Azure     | `AZURE_API_KEY`     |
| Mistral   | `MISTRAL_API_KEY`   |
| DeepSeek  | `DEEPSEEK_API_KEY`  |

### Configuration Syntax

Use `${ENV_VAR}` syntax in config:
```json
{
  "apiKey": "${OPENAI_API_KEY}"
}
```

The executor resolves these at runtime via `resolveEnvVar()`.
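A minimal sketch of such resolution, assuming unset variables are treated as errors (the real `resolveEnvVar()` implementation may behave differently):

```typescript
// Sketch: replace ${VAR_NAME} references with values from process.env.
// Assumption: an unresolved variable throws rather than passing through.
function resolveEnvVar(value: string): string {
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (_, name: string) => {
    const resolved = process.env[name];
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}
```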

## API Reference

### Config Manager (`litellm-api-config-manager.ts`)

#### Provider Management
```typescript
getAllProviders(baseDir: string): ProviderCredential[]
getProvider(baseDir: string, providerId: string): ProviderCredential | null
getProviderWithResolvedEnvVars(baseDir: string, providerId: string): ProviderCredential & { resolvedApiKey: string } | null
addProvider(baseDir: string, providerData): ProviderCredential
updateProvider(baseDir: string, providerId: string, updates): ProviderCredential
deleteProvider(baseDir: string, providerId: string): boolean
```

#### Endpoint Management
```typescript
getAllEndpoints(baseDir: string): CustomEndpoint[]
getEndpoint(baseDir: string, endpointId: string): CustomEndpoint | null
findEndpointById(baseDir: string, endpointId: string): CustomEndpoint | null
addEndpoint(baseDir: string, endpointData): CustomEndpoint
updateEndpoint(baseDir: string, endpointId: string, updates): CustomEndpoint
deleteEndpoint(baseDir: string, endpointId: string): boolean
```

### Executor (`litellm-executor.ts`)

```typescript
interface LiteLLMExecutionOptions {
  prompt: string;
  endpointId: string;
  baseDir: string;
  cwd?: string;
  includeDirs?: string[];
  enableCache?: boolean;
  onOutput?: (data: { type: string; data: string }) => void;
}

interface LiteLLMExecutionResult {
  success: boolean;
  output: string;
  model: string;
  provider: string;
  cacheUsed: boolean;
  cachedFiles?: string[];
  error?: string;
}

executeLiteLLMEndpoint(options: LiteLLMExecutionOptions): Promise<LiteLLMExecutionResult>
extractPatterns(prompt: string): string[]
```
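Based on the signatures above, a call might look like the following sketch; the import path and endpoint ID are placeholders, not confirmed by this commit:

```typescript
import { executeLiteLLMEndpoint } from './src/tools/litellm-executor'; // path is an assumption

// Sketch only: 'my-gpt4o' follows the configuration example earlier.
const result = await executeLiteLLMEndpoint({
  prompt: '@src/auth/**/*.ts Review session handling',
  endpointId: 'my-gpt4o',
  baseDir: process.cwd(),
  enableCache: true,
  onOutput: ({ type, data }) => process.stdout.write(`[${type}] ${data}`),
});

if (result.success) {
  console.log(result.output);
  console.log('Cache used:', result.cacheUsed, result.cachedFiles);
}
```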

## Dashboard Integration

The dashboard provides UI for managing LiteLLM configuration:

- **Providers**: Add/edit/delete provider credentials
- **Endpoints**: Configure custom endpoints with cache strategies
- **Cache Stats**: View cache usage and clear entries
- **Test Connections**: Verify provider API access

Routes are handled by `litellm-api-routes.ts`.

## Limitations

1. **Python Dependency**: Requires the `ccw-litellm` Python package to be installed
2. **Model Support**: Limited to models supported by the LiteLLM library
3. **Cache Scope**: Context cache is in-memory (not persisted across restarts)
4. **Pattern Syntax**: Only supports glob-style `@patterns`, not regex

## Troubleshooting

### Error: "Endpoint not found"
- Verify the endpoint ID matches the config file
- Check that `litellm-api-config.json` exists in `.ccw/storage/config/`

### Error: "API key not configured"
- Ensure the environment variable is set
- Verify the `${ENV_VAR}` syntax in the config
- Test with `echo $OPENAI_API_KEY`

### Error: "Failed to spawn Python process"
- Install ccw-litellm: `pip install ccw-litellm`
- Verify Python is accessible: `python --version`

### Cache Not Applied
- Check that the endpoint has `cacheStrategy.enabled: true`
- Verify the prompt contains `@patterns`
- Check that the cache TTL hasn't expired

## Examples

See `examples/litellm-config.json` for a complete configuration template.
ccw/examples/litellm-usage.ts (new file, 77 lines)
@@ -0,0 +1,77 @@
/**
 * LiteLLM Usage Examples
 * Demonstrates how to use the LiteLLM TypeScript client
 */

import { getLiteLLMClient, getLiteLLMStatus } from '../src/tools/litellm-client';

async function main() {
  console.log('=== LiteLLM TypeScript Bridge Examples ===\n');

  // Example 1: Check availability
  console.log('1. Checking LiteLLM availability...');
  const status = await getLiteLLMStatus();
  console.log('   Status:', status);
  console.log('');

  if (!status.available) {
    console.log('❌ LiteLLM is not available. Please install ccw-litellm:');
    console.log('   pip install ccw-litellm');
    return;
  }

  const client = getLiteLLMClient();

  // Example 2: Get configuration
  console.log('2. Getting configuration...');
  try {
    const config = await client.getConfig();
    console.log('   Config:', config);
  } catch (error) {
    console.log('   Error:', error.message);
  }
  console.log('');

  // Example 3: Generate embeddings
  console.log('3. Generating embeddings...');
  try {
    const texts = ['Hello world', 'Machine learning is amazing'];
    const embedResult = await client.embed(texts, 'default');
    console.log('   Dimensions:', embedResult.dimensions);
    console.log('   Vectors count:', embedResult.vectors.length);
    console.log('   First vector (first 5 dims):', embedResult.vectors[0]?.slice(0, 5));
  } catch (error) {
    console.log('   Error:', error.message);
  }
  console.log('');

  // Example 4: Single message chat
  console.log('4. Single message chat...');
  try {
    const response = await client.chat('What is 2+2?', 'default');
    console.log('   Response:', response);
  } catch (error) {
    console.log('   Error:', error.message);
  }
  console.log('');

  // Example 5: Multi-turn chat
  console.log('5. Multi-turn chat...');
  try {
    const chatResponse = await client.chatMessages([
      { role: 'system', content: 'You are a helpful math tutor.' },
      { role: 'user', content: 'What is the Pythagorean theorem?' }
    ], 'default');
    console.log('   Content:', chatResponse.content);
    console.log('   Model:', chatResponse.model);
    console.log('   Usage:', chatResponse.usage);
  } catch (error) {
    console.log('   Error:', error.message);
  }
  console.log('');

  console.log('=== Examples completed ===');
}

// Run examples
main().catch(console.error);
@@ -855,7 +855,7 @@ export async function cliCommand(
  console.log(chalk.gray('  --model <model>      Model override'));
  console.log(chalk.gray('  --cd <path>          Working directory'));
  console.log(chalk.gray('  --includeDirs <dirs> Additional directories'));
  console.log(chalk.gray('  --timeout <ms>       Timeout (default: 300000)'));
  console.log(chalk.gray('  --timeout <ms>       Timeout (default: 0=disabled)'));
  console.log(chalk.gray('  --resume [id]        Resume previous session'));
  console.log(chalk.gray('  --cache <items>      Cache: comma-separated @patterns and text'));
  console.log(chalk.gray('  --inject-mode <m>    Inject mode: none, full, progressive'));
@@ -6,7 +6,7 @@
import chalk from 'chalk';
import { existsSync, readFileSync, writeFileSync, mkdirSync } from 'fs';
import { join, dirname } from 'path';
import { tmpdir } from 'os';
import { homedir } from 'os';

interface HookOptions {
  stdin?: boolean;

@@ -53,9 +53,10 @@ async function readStdin(): Promise<string> {

/**
 * Get session state file path
 * Uses ~/.claude/.ccw-sessions/ for reliable persistence across sessions
 */
function getSessionStateFile(sessionId: string): string {
  const stateDir = join(tmpdir(), '.ccw-sessions');
  const stateDir = join(homedir(), '.claude', '.ccw-sessions');
  if (!existsSync(stateDir)) {
    mkdirSync(stateDir, { recursive: true });
  }
@@ -0,0 +1,441 @@
/**
 * LiteLLM API Config Manager
 * Manages provider credentials, endpoint configurations, and model discovery
 */

import { join } from 'path';
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs';
import { homedir } from 'os';

// ===========================
// Type Definitions
// ===========================

export type ProviderType =
  | 'openai'
  | 'anthropic'
  | 'google'
  | 'cohere'
  | 'azure'
  | 'bedrock'
  | 'vertexai'
  | 'huggingface'
  | 'ollama'
  | 'custom';

export interface ProviderCredential {
  id: string;
  name: string;
  type: ProviderType;
  apiKey?: string;
  baseUrl?: string;
  apiVersion?: string;
  region?: string;
  projectId?: string;
  organizationId?: string;
  enabled: boolean;
  metadata?: Record<string, any>;
  createdAt: string;
  updatedAt: string;
}

export interface EndpointConfig {
  id: string;
  name: string;
  providerId: string;
  model: string;
  alias?: string;
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  enabled: boolean;
  metadata?: Record<string, any>;
  createdAt: string;
  updatedAt: string;
}

export interface ModelInfo {
  id: string;
  name: string;
  provider: ProviderType;
  contextWindow: number;
  supportsFunctions: boolean;
  supportsStreaming: boolean;
  inputCostPer1k?: number;
  outputCostPer1k?: number;
}

export interface LiteLLMApiConfig {
  version: string;
  providers: ProviderCredential[];
  endpoints: EndpointConfig[];
}

// ===========================
// Model Definitions
// ===========================

export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
  openai: [
    {
      id: 'gpt-4-turbo',
      name: 'GPT-4 Turbo',
      provider: 'openai',
      contextWindow: 128000,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.01,
      outputCostPer1k: 0.03,
    },
    {
      id: 'gpt-4',
      name: 'GPT-4',
      provider: 'openai',
      contextWindow: 8192,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.03,
      outputCostPer1k: 0.06,
    },
    {
      id: 'gpt-3.5-turbo',
      name: 'GPT-3.5 Turbo',
      provider: 'openai',
      contextWindow: 16385,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.0005,
      outputCostPer1k: 0.0015,
    },
  ],
  anthropic: [
    {
      id: 'claude-3-opus-20240229',
      name: 'Claude 3 Opus',
      provider: 'anthropic',
      contextWindow: 200000,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.015,
      outputCostPer1k: 0.075,
    },
    {
      id: 'claude-3-sonnet-20240229',
      name: 'Claude 3 Sonnet',
      provider: 'anthropic',
      contextWindow: 200000,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.003,
      outputCostPer1k: 0.015,
    },
    {
      id: 'claude-3-haiku-20240307',
      name: 'Claude 3 Haiku',
      provider: 'anthropic',
      contextWindow: 200000,
      supportsFunctions: true,
      supportsStreaming: true,
      inputCostPer1k: 0.00025,
      outputCostPer1k: 0.00125,
    },
  ],
  google: [
    {
      id: 'gemini-pro',
      name: 'Gemini Pro',
      provider: 'google',
      contextWindow: 32768,
      supportsFunctions: true,
      supportsStreaming: true,
    },
    {
      id: 'gemini-pro-vision',
      name: 'Gemini Pro Vision',
      provider: 'google',
      contextWindow: 16384,
      supportsFunctions: false,
      supportsStreaming: true,
    },
  ],
  cohere: [
    {
      id: 'command',
      name: 'Command',
      provider: 'cohere',
      contextWindow: 4096,
      supportsFunctions: false,
      supportsStreaming: true,
    },
    {
      id: 'command-light',
      name: 'Command Light',
      provider: 'cohere',
      contextWindow: 4096,
      supportsFunctions: false,
      supportsStreaming: true,
    },
  ],
  azure: [],
  bedrock: [],
  vertexai: [],
  huggingface: [],
  ollama: [],
  custom: [],
};

// ===========================
// Config File Management
// ===========================

const CONFIG_DIR = join(homedir(), '.claude', 'litellm');
const CONFIG_FILE = join(CONFIG_DIR, 'config.json');

function ensureConfigDir(): void {
  if (!existsSync(CONFIG_DIR)) {
    mkdirSync(CONFIG_DIR, { recursive: true });
  }
}

function loadConfig(): LiteLLMApiConfig {
  ensureConfigDir();

  if (!existsSync(CONFIG_FILE)) {
    const defaultConfig: LiteLLMApiConfig = {
      version: '1.0.0',
      providers: [],
      endpoints: [],
    };
    saveConfig(defaultConfig);
    return defaultConfig;
  }

  try {
    const content = readFileSync(CONFIG_FILE, 'utf-8');
    return JSON.parse(content);
  } catch (err) {
    throw new Error(`Failed to load config: ${(err as Error).message}`);
  }
}

function saveConfig(config: LiteLLMApiConfig): void {
  ensureConfigDir();

  try {
    writeFileSync(CONFIG_FILE, JSON.stringify(config, null, 2), 'utf-8');
  } catch (err) {
    throw new Error(`Failed to save config: ${(err as Error).message}`);
  }
}

// ===========================
// Provider Management
// ===========================

export function getAllProviders(): ProviderCredential[] {
  const config = loadConfig();
  return config.providers;
}

export function getProvider(id: string): ProviderCredential | null {
  const config = loadConfig();
  return config.providers.find((p) => p.id === id) || null;
}

export function createProvider(
  data: Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>
): ProviderCredential {
  const config = loadConfig();

  const now = new Date().toISOString();
  const provider: ProviderCredential = {
    ...data,
    id: `provider-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`,
    createdAt: now,
    updatedAt: now,
  };

  config.providers.push(provider);
  saveConfig(config);

  return provider;
}

export function updateProvider(
  id: string,
  updates: Partial<ProviderCredential>
): ProviderCredential | null {
  const config = loadConfig();

  const index = config.providers.findIndex((p) => p.id === id);
  if (index === -1) {
    return null;
  }

  const updated: ProviderCredential = {
    ...config.providers[index],
    ...updates,
    id,
    updatedAt: new Date().toISOString(),
  };

  config.providers[index] = updated;
  saveConfig(config);

  return updated;
}

export function deleteProvider(id: string): { success: boolean } {
  const config = loadConfig();

  const index = config.providers.findIndex((p) => p.id === id);
  if (index === -1) {
    return { success: false };
  }

  config.providers.splice(index, 1);

  // Also delete endpoints using this provider
  config.endpoints = config.endpoints.filter((e) => e.providerId !== id);

  saveConfig(config);

  return { success: true };
}

export async function testProviderConnection(
  providerId: string
): Promise<{ success: boolean; error?: string }> {
  const provider = getProvider(providerId);

  if (!provider) {
    return { success: false, error: 'Provider not found' };
  }

  if (!provider.enabled) {
    return { success: false, error: 'Provider is disabled' };
  }

  // Basic validation
  if (!provider.apiKey && provider.type !== 'ollama' && provider.type !== 'custom') {
    return { success: false, error: 'API key is required for this provider type' };
  }

  // TODO: Implement actual provider connection testing using litellm-client
  // For now, just validate the configuration
  return { success: true };
}

// ===========================
// Endpoint Management
// ===========================

export function getAllEndpoints(): EndpointConfig[] {
  const config = loadConfig();
  return config.endpoints;
}

export function getEndpoint(id: string): EndpointConfig | null {
  const config = loadConfig();
  return config.endpoints.find((e) => e.id === id) || null;
}

export function createEndpoint(
  data: Omit<EndpointConfig, 'id' | 'createdAt' | 'updatedAt'>
): EndpointConfig {
  const config = loadConfig();

  // Validate provider exists
  const provider = config.providers.find((p) => p.id === data.providerId);
  if (!provider) {
    throw new Error('Provider not found');
  }

  const now = new Date().toISOString();
  const endpoint: EndpointConfig = {
    ...data,
    id: `endpoint-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`,
    createdAt: now,
    updatedAt: now,
  };

  config.endpoints.push(endpoint);
  saveConfig(config);

  return endpoint;
}

export function updateEndpoint(
  id: string,
  updates: Partial<EndpointConfig>
): EndpointConfig | null {
  const config = loadConfig();

  const index = config.endpoints.findIndex((e) => e.id === id);
  if (index === -1) {
    return null;
  }

  // Validate provider if being updated
  if (updates.providerId) {
    const provider = config.providers.find((p) => p.id === updates.providerId);
    if (!provider) {
      throw new Error('Provider not found');
    }
  }

  const updated: EndpointConfig = {
    ...config.endpoints[index],
    ...updates,
    id,
    updatedAt: new Date().toISOString(),
  };

  config.endpoints[index] = updated;
  saveConfig(config);

  return updated;
}

export function deleteEndpoint(id: string): { success: boolean } {
  const config = loadConfig();

  const index = config.endpoints.findIndex((e) => e.id === id);
  if (index === -1) {
    return { success: false };
  }

  config.endpoints.splice(index, 1);
  saveConfig(config);

  return { success: true };
}

// ===========================
// Model Discovery
// ===========================

export function getModelsForProviderType(providerType: ProviderType): ModelInfo[] | null {
  return PROVIDER_MODELS[providerType] || null;
}

export function getAllModels(): Record<ProviderType, ModelInfo[]> {
  return PROVIDER_MODELS;
}

// ===========================
// Config Access
// ===========================

export function getFullConfig(): LiteLLMApiConfig {
  return loadConfig();
}

export function resetConfig(): void {
  const defaultConfig: LiteLLMApiConfig = {
    version: '1.0.0',
    providers: [],
    endpoints: [],
  };
  saveConfig(defaultConfig);
}
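A short usage sketch composing the exports above (illustrative only; the relative import path is an assumption):

```typescript
import {
  createProvider,
  createEndpoint,
  getModelsForProviderType,
} from './litellm-api-config-manager';

// Register a provider, then attach an endpoint that references it.
const provider = createProvider({
  name: 'My OpenAI',
  type: 'openai',
  apiKey: '${OPENAI_API_KEY}', // resolved from the environment at runtime
  enabled: true,
});

const endpoint = createEndpoint({
  name: 'GPT-4 Turbo endpoint',
  providerId: provider.id,
  model: getModelsForProviderType('openai')?.[0]?.id ?? 'gpt-4-turbo',
  enabled: true,
});

console.log(`Endpoint ${endpoint.id} created for provider ${provider.id}`);
```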
@@ -25,10 +25,33 @@ export interface ModelInfo {
}

/**
 * Predefined models for each provider
 * Embedding model information metadata
 */
export interface EmbeddingModelInfo {
  /** Model identifier (used in API calls) */
  id: string;

  /** Human-readable display name */
  name: string;

  /** Embedding dimensions */
  dimensions: number;

  /** Maximum input tokens */
  maxTokens: number;

  /** Provider identifier */
  provider: string;
}

/**
 * Predefined models for each API format
 * Used for UI selection and validation
 * Note: Most providers use OpenAI-compatible format
 */
export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
  // OpenAI-compatible format (used by OpenAI, DeepSeek, Ollama, etc.)
  openai: [
    {
      id: 'gpt-4o',
@@ -49,19 +72,32 @@ export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
      supportsCaching: true
    },
    {
      id: 'o1-mini',
      name: 'O1 Mini',
      contextWindow: 128000,
      supportsCaching: true
      id: 'deepseek-chat',
      name: 'DeepSeek Chat',
      contextWindow: 64000,
      supportsCaching: false
    },
    {
      id: 'gpt-4-turbo',
      name: 'GPT-4 Turbo',
      id: 'deepseek-coder',
      name: 'DeepSeek Coder',
      contextWindow: 64000,
      supportsCaching: false
    },
    {
      id: 'llama3.2',
      name: 'Llama 3.2',
      contextWindow: 128000,
      supportsCaching: false
    },
    {
      id: 'qwen2.5-coder',
      name: 'Qwen 2.5 Coder',
      contextWindow: 32000,
      supportsCaching: false
    }
  ],

  // Anthropic format
  anthropic: [
    {
      id: 'claude-sonnet-4-20250514',
@@ -89,135 +125,7 @@ export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
    }
  ],

  ollama: [
    {
      id: 'llama3.2',
      name: 'Llama 3.2',
      contextWindow: 128000,
      supportsCaching: false
    },
    {
      id: 'llama3.1',
      name: 'Llama 3.1',
      contextWindow: 128000,
      supportsCaching: false
    },
    {
      id: 'qwen2.5-coder',
      name: 'Qwen 2.5 Coder',
      contextWindow: 32000,
      supportsCaching: false
    },
    {
      id: 'codellama',
      name: 'Code Llama',
      contextWindow: 16000,
      supportsCaching: false
    },
    {
      id: 'mistral',
      name: 'Mistral',
      contextWindow: 32000,
      supportsCaching: false
    }
  ],

  azure: [
    {
      id: 'gpt-4o',
      name: 'GPT-4o (Azure)',
      contextWindow: 128000,
      supportsCaching: true
    },
    {
      id: 'gpt-4o-mini',
      name: 'GPT-4o Mini (Azure)',
      contextWindow: 128000,
      supportsCaching: true
    },
    {
      id: 'gpt-4-turbo',
      name: 'GPT-4 Turbo (Azure)',
      contextWindow: 128000,
      supportsCaching: false
    },
    {
      id: 'gpt-35-turbo',
      name: 'GPT-3.5 Turbo (Azure)',
      contextWindow: 16000,
      supportsCaching: false
    }
  ],

  google: [
    {
      id: 'gemini-2.0-flash-exp',
      name: 'Gemini 2.0 Flash Experimental',
      contextWindow: 1048576,
      supportsCaching: true
    },
    {
      id: 'gemini-1.5-pro',
      name: 'Gemini 1.5 Pro',
      contextWindow: 2097152,
      supportsCaching: true
    },
    {
      id: 'gemini-1.5-flash',
      name: 'Gemini 1.5 Flash',
      contextWindow: 1048576,
      supportsCaching: true
    },
    {
      id: 'gemini-1.0-pro',
      name: 'Gemini 1.0 Pro',
      contextWindow: 32000,
      supportsCaching: false
    }
  ],

  mistral: [
    {
      id: 'mistral-large-latest',
      name: 'Mistral Large',
      contextWindow: 128000,
      supportsCaching: false
    },
    {
      id: 'mistral-medium-latest',
      name: 'Mistral Medium',
      contextWindow: 32000,
      supportsCaching: false
    },
    {
      id: 'mistral-small-latest',
      name: 'Mistral Small',
      contextWindow: 32000,
      supportsCaching: false
    },
    {
      id: 'codestral-latest',
      name: 'Codestral',
      contextWindow: 32000,
      supportsCaching: false
    }
  ],

  deepseek: [
    {
      id: 'deepseek-chat',
      name: 'DeepSeek Chat',
      contextWindow: 64000,
      supportsCaching: false
    },
    {
      id: 'deepseek-coder',
      name: 'DeepSeek Coder',
      contextWindow: 64000,
      supportsCaching: false
    }
  ],

  // Custom format
  custom: [
    {
      id: 'custom-model',
@@ -237,6 +145,61 @@ export function getModelsForProvider(providerType: ProviderType): ModelInfo[] {
  return PROVIDER_MODELS[providerType] || [];
}

/**
 * Predefined embedding models for each API format
 * Used for UI selection and validation
 */
export const EMBEDDING_MODELS: Record<ProviderType, EmbeddingModelInfo[]> = {
  // OpenAI embedding models
  openai: [
    {
      id: 'text-embedding-3-small',
      name: 'Text Embedding 3 Small',
      dimensions: 1536,
      maxTokens: 8191,
      provider: 'openai'
    },
    {
      id: 'text-embedding-3-large',
      name: 'Text Embedding 3 Large',
      dimensions: 3072,
      maxTokens: 8191,
      provider: 'openai'
    },
    {
      id: 'text-embedding-ada-002',
      name: 'Ada 002',
      dimensions: 1536,
      maxTokens: 8191,
      provider: 'openai'
    }
  ],

  // Anthropic doesn't have embedding models
  anthropic: [],

  // Custom embedding models
  custom: [
    {
      id: 'custom-embedding',
      name: 'Custom Embedding',
      dimensions: 1536,
      maxTokens: 8192,
      provider: 'custom'
    }
  ]
};

/**
 * Get embedding models for a specific provider
 * @param providerType - Provider type to get embedding models for
 * @returns Array of embedding model information
 */
export function getEmbeddingModelsForProvider(providerType: ProviderType): EmbeddingModelInfo[] {
  return EMBEDDING_MODELS[providerType] || [];
}
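As a quick illustration of the new helper (hedged sketch; the output values follow the `EMBEDDING_MODELS` table above):

```typescript
// Sketch: list embedding model IDs and dimensions for the OpenAI format.
for (const m of getEmbeddingModelsForProvider('openai')) {
  console.log(`${m.id}: ${m.dimensions} dims, max ${m.maxTokens} tokens`);
}
// => text-embedding-3-small: 1536 dims, max 8191 tokens
// => text-embedding-3-large: 3072 dims, max 8191 tokens
// => text-embedding-ada-002: 1536 dims, max 8191 tokens
```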

/**
 * Get model information by ID within a provider
 * @param providerType - Provider type
@@ -181,29 +181,13 @@ function deleteHookFromSettings(projectPath, scope, event, hookIndex) {
|
||||
}
|
||||
|
||||
// ========================================
|
||||
// Session State Tracking (for progressive disclosure)
|
||||
// Session State Tracking
|
||||
// ========================================
// Track sessions that have received startup context
// Key: sessionId, Value: timestamp of first context load
const sessionContextState = new Map<string, {
  firstLoad: string;
  loadCount: number;
  lastPrompt?: string;
}>();

// Cleanup old sessions (older than 24 hours)
function cleanupOldSessions() {
  const cutoff = Date.now() - 24 * 60 * 60 * 1000;
  for (const [sessionId, state] of sessionContextState.entries()) {
    if (new Date(state.firstLoad).getTime() < cutoff) {
      sessionContextState.delete(sessionId);
    }
  }
}

// Run cleanup every hour
setInterval(cleanupOldSessions, 60 * 60 * 1000);
// NOTE: Session state is managed by the CLI command (src/commands/hook.ts)
// using file-based persistence (~/.claude/.ccw-sessions/).
// This ensures consistent state tracking across all invocation methods.
// The /api/hook endpoint delegates to SessionClusteringService without
// managing its own state, as the authoritative state lives in the CLI layer.

// ========================================
// Route Handler
@@ -286,7 +270,8 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
  }

  // API: Unified Session Context endpoint (Progressive Disclosure)
  // Automatically detects first prompt vs subsequent prompts
  // DEPRECATED: Use CLI command `ccw hook session-context --stdin` instead.
  // This endpoint now uses file-based state (shared with CLI) for consistency.
  // - First prompt: returns cluster-based session overview
  // - Subsequent prompts: returns intent-matched sessions based on prompt
  if (pathname === '/api/hook/session-context' && req.method === 'POST') {
@@ -306,21 +291,30 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
    const { SessionClusteringService } = await import('../session-clustering-service.js');
    const clusteringService = new SessionClusteringService(projectPath);

    // Check if this is the first prompt for this session
    const existingState = sessionContextState.get(sessionId);
    // Use file-based session state (shared with CLI hook.ts)
    const sessionStateDir = join(homedir(), '.claude', '.ccw-sessions');
    const sessionStateFile = join(sessionStateDir, `session-${sessionId}.json`);

    let existingState: { firstLoad: string; loadCount: number; lastPrompt?: string } | null = null;
    if (existsSync(sessionStateFile)) {
      try {
        existingState = JSON.parse(readFileSync(sessionStateFile, 'utf-8'));
      } catch {
        existingState = null;
      }
    }

    const isFirstPrompt = !existingState;

    // Update session state
    if (isFirstPrompt) {
      sessionContextState.set(sessionId, {
        firstLoad: new Date().toISOString(),
        loadCount: 1,
        lastPrompt: prompt
      });
    } else {
      existingState.loadCount++;
      existingState.lastPrompt = prompt;
    // Update session state (file-based)
    const newState = isFirstPrompt
      ? { firstLoad: new Date().toISOString(), loadCount: 1, lastPrompt: prompt }
      : { ...existingState!, loadCount: existingState!.loadCount + 1, lastPrompt: prompt };

    if (!existsSync(sessionStateDir)) {
      mkdirSync(sessionStateDir, { recursive: true });
    }
    writeFileSync(sessionStateFile, JSON.stringify(newState, null, 2));

    // Determine which type of context to return
    let contextType: 'session-start' | 'context';
@@ -351,7 +345,7 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
      success: true,
      type: contextType,
      isFirstPrompt,
      loadCount: sessionContextState.get(sessionId)?.loadCount || 1,
      loadCount: newState.loadCount,
      content,
      sessionId
    };
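
The deprecation note above points integrations at the CLI path instead of this endpoint. A minimal sketch of that migration, assuming the hook payload carries the `sessionId` and `prompt` fields this handler reads (the authoritative contract lives in src/commands/hook.ts):

```typescript
import { spawn } from 'node:child_process';

// Hypothetical client-side replacement for POST /api/hook/session-context:
// pipe the hook payload to `ccw hook session-context --stdin`, which owns
// the file-based state under ~/.claude/.ccw-sessions/.
function loadSessionContext(sessionId: string, prompt: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn('ccw', ['hook', 'session-context', '--stdin']);
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', (code) =>
      code === 0 ? resolve(output) : reject(new Error(`ccw exited with ${code}`))
    );
    child.stdin.end(JSON.stringify({ sessionId, prompt }));
  });
}
```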
File diff suppressed because it is too large
@@ -23,6 +23,8 @@ const i18n = {
  'common.loading': 'Loading...',
  'common.error': 'Error',
  'common.success': 'Success',
  'common.deleteSuccess': 'Deleted successfully',
  'common.deleteFailed': 'Delete failed',
  'common.retry': 'Retry',
  'common.refresh': 'Refresh',
  'common.minutes': 'minutes',
@@ -1345,17 +1347,64 @@ const i18n = {
  'apiSettings.editEndpoint': 'Edit Endpoint',
  'apiSettings.deleteEndpoint': 'Delete Endpoint',
  'apiSettings.providerType': 'Provider Type',
  'apiSettings.apiFormat': 'API Format',
  'apiSettings.compatible': 'Compatible',
  'apiSettings.customFormat': 'Custom Format',
  'apiSettings.apiFormatHint': 'Most providers (DeepSeek, Ollama, etc.) use OpenAI-compatible format',
  'apiSettings.displayName': 'Display Name',
  'apiSettings.apiKey': 'API Key',
  'apiSettings.apiBaseUrl': 'API Base URL',
  'apiSettings.useEnvVar': 'Use environment variable',
  'apiSettings.enableProvider': 'Enable provider',
  'apiSettings.advancedSettings': 'Advanced Settings',
  'apiSettings.basicInfo': 'Basic Info',
  'apiSettings.endpointSettings': 'Endpoint Settings',
  'apiSettings.timeout': 'Timeout (seconds)',
  'apiSettings.seconds': 'seconds',
  'apiSettings.timeoutHint': 'Request timeout in seconds (default: 300)',
  'apiSettings.maxRetries': 'Max Retries',
  'apiSettings.maxRetriesHint': 'Maximum retry attempts on failure',
  'apiSettings.organization': 'Organization ID',
  'apiSettings.organizationHint': 'OpenAI organization ID (org-...)',
  'apiSettings.apiVersion': 'API Version',
  'apiSettings.apiVersionHint': 'Azure API version (e.g., 2024-02-01)',
  'apiSettings.rpm': 'RPM Limit',
  'apiSettings.tpm': 'TPM Limit',
  'apiSettings.unlimited': 'Unlimited',
  'apiSettings.proxy': 'Proxy Server',
  'apiSettings.proxyHint': 'HTTP proxy server URL',
  'apiSettings.customHeaders': 'Custom Headers',
  'apiSettings.customHeadersHint': 'JSON object with custom HTTP headers',
  'apiSettings.invalidJsonHeaders': 'Invalid JSON in custom headers',
  'apiSettings.searchProviders': 'Search providers...',
  'apiSettings.selectProvider': 'Select a Provider',
  'apiSettings.selectProviderHint': 'Select a provider from the list to view and manage its settings',
  'apiSettings.noProvidersFound': 'No providers found',
  'apiSettings.llmModels': 'LLM Models',
  'apiSettings.embeddingModels': 'Embedding Models',
  'apiSettings.manageModels': 'Manage',
  'apiSettings.addModel': 'Add Model',
  'apiSettings.multiKeySettings': 'Multi-Key Settings',
  'apiSettings.noModels': 'No models configured',
  'apiSettings.previewModel': 'Preview',
  'apiSettings.modelSettings': 'Model Settings',
  'apiSettings.deleteModel': 'Delete Model',
  'apiSettings.providerUpdated': 'Provider updated',
  'apiSettings.preview': 'Preview',
  'apiSettings.used': 'used',
  'apiSettings.total': 'total',
  'apiSettings.testConnection': 'Test Connection',
  'apiSettings.endpointId': 'Endpoint ID',
  'apiSettings.endpointIdHint': 'Usage: ccw cli -p "..." --model <endpoint-id>',
  'apiSettings.endpoints': 'Endpoints',
  'apiSettings.addEndpointHint': 'Create custom endpoint aliases for CLI usage',
  'apiSettings.endpointModel': 'Model',
  'apiSettings.selectEndpoint': 'Select an endpoint',
  'apiSettings.selectEndpointHint': 'Choose an endpoint from the list to view or edit its settings',
  'apiSettings.provider': 'Provider',
  'apiSettings.model': 'Model',
  'apiSettings.selectModel': 'Select model',
  'apiSettings.noModelsConfigured': 'No models configured for this provider',
  'apiSettings.cacheStrategy': 'Cache Strategy',
  'apiSettings.enableContextCaching': 'Enable Context Caching',
  'apiSettings.cacheTTL': 'TTL (minutes)',
@@ -1386,6 +1435,82 @@ const i18n = {
  'apiSettings.addProviderFirst': 'Please add a provider first',
  'apiSettings.failedToLoad': 'Failed to load API settings',
  'apiSettings.toggleVisibility': 'Toggle visibility',
  'apiSettings.noProvidersHint': 'Add an API provider to get started',
  'apiSettings.noEndpointsHint': 'Create custom endpoints for quick access to models',
  'apiSettings.cache': 'Cache',
  'apiSettings.off': 'Off',
  'apiSettings.used': 'used',
  'apiSettings.total': 'total',
  'apiSettings.cacheUsage': 'Usage',
  'apiSettings.cacheSize': 'Size',
  'apiSettings.endpointsDescription': 'Manage custom API endpoints for quick model access',
  'apiSettings.totalEndpoints': 'Total Endpoints',
  'apiSettings.cachedEndpoints': 'Cached Endpoints',
  'apiSettings.cacheTabHint': 'Configure global cache settings and view statistics in the main panel',
  'apiSettings.cacheDescription': 'Manage response caching to improve performance and reduce costs',
  'apiSettings.cachedEntries': 'Cached Entries',
  'apiSettings.storageUsed': 'Storage Used',
  'apiSettings.cacheActions': 'Cache Actions',
  'apiSettings.cacheStatistics': 'Cache Statistics',
  'apiSettings.globalCache': 'Global Cache',

  // Multi-key management
  'apiSettings.apiKeys': 'API Keys',
  'apiSettings.addKey': 'Add Key',
  'apiSettings.keyLabel': 'Label',
  'apiSettings.keyValue': 'API Key',
  'apiSettings.keyWeight': 'Weight',
  'apiSettings.removeKey': 'Remove',
  'apiSettings.noKeys': 'No API keys configured',
  'apiSettings.primaryKey': 'Primary Key',

  // Routing strategy
  'apiSettings.routingStrategy': 'Routing Strategy',
  'apiSettings.simpleShuffleRouting': 'Simple Shuffle (Random)',
  'apiSettings.weightedRouting': 'Weighted Distribution',
  'apiSettings.latencyRouting': 'Latency-Based',
  'apiSettings.costRouting': 'Cost-Based',
  'apiSettings.leastBusyRouting': 'Least Busy',
  'apiSettings.routingHint': 'How to distribute requests across multiple API keys',

  // Health check
  'apiSettings.healthCheck': 'Health Check',
  'apiSettings.enableHealthCheck': 'Enable Health Check',
  'apiSettings.healthInterval': 'Check Interval (seconds)',
  'apiSettings.healthCooldown': 'Cooldown (seconds)',
  'apiSettings.failureThreshold': 'Failure Threshold',
  'apiSettings.healthStatus': 'Status',
  'apiSettings.healthy': 'Healthy',
  'apiSettings.unhealthy': 'Unhealthy',
  'apiSettings.unknown': 'Unknown',
  'apiSettings.lastCheck': 'Last Check',
  'apiSettings.testKey': 'Test Key',
  'apiSettings.testingKey': 'Testing...',
  'apiSettings.keyValid': 'Key is valid',
  'apiSettings.keyInvalid': 'Key is invalid',

  // Embedding models
  'apiSettings.embeddingDimensions': 'Dimensions',
  'apiSettings.embeddingMaxTokens': 'Max Tokens',
  'apiSettings.selectEmbeddingModel': 'Select Embedding Model',

  // Model modal
  'apiSettings.addLlmModel': 'Add LLM Model',
  'apiSettings.addEmbeddingModel': 'Add Embedding Model',
  'apiSettings.modelId': 'Model ID',
  'apiSettings.modelName': 'Display Name',
  'apiSettings.modelSeries': 'Series',
  'apiSettings.selectFromPresets': 'Select from Presets',
  'apiSettings.customModel': 'Custom Model',
  'apiSettings.capabilities': 'Capabilities',
  'apiSettings.streaming': 'Streaming',
  'apiSettings.functionCalling': 'Function Calling',
  'apiSettings.vision': 'Vision',
  'apiSettings.contextWindow': 'Context Window',
  'apiSettings.description': 'Description',
  'apiSettings.optional': 'Optional',
  'apiSettings.modelIdExists': 'Model ID already exists',
  'apiSettings.useModelTreeToManage': 'Use the model tree to manage individual models',

  // Common
  'common.cancel': 'Cancel',
@@ -1410,6 +1535,7 @@ const i18n = {
  'common.saveFailed': 'Failed to save',
  'common.unknownError': 'Unknown error',
  'common.exception': 'Exception',
  'common.status': 'Status',

  // Core Memory
  'title.coreMemory': 'Core Memory',
@@ -1537,6 +1663,8 @@ const i18n = {
  'common.loading': '加载中...',
  'common.error': '错误',
  'common.success': '成功',
  'common.deleteSuccess': '删除成功',
  'common.deleteFailed': '删除失败',
  'common.retry': '重试',
  'common.refresh': '刷新',
  'common.minutes': '分钟',
@@ -2869,17 +2997,64 @@ const i18n = {
  'apiSettings.editEndpoint': '编辑端点',
  'apiSettings.deleteEndpoint': '删除端点',
  'apiSettings.providerType': '提供商类型',
  'apiSettings.apiFormat': 'API 格式',
  'apiSettings.compatible': '兼容',
  'apiSettings.customFormat': '自定义格式',
  'apiSettings.apiFormatHint': '大多数供应商(DeepSeek、Ollama 等)使用 OpenAI 兼容格式',
  'apiSettings.displayName': '显示名称',
  'apiSettings.apiKey': 'API 密钥',
  'apiSettings.apiBaseUrl': 'API 基础 URL',
  'apiSettings.useEnvVar': '使用环境变量',
  'apiSettings.enableProvider': '启用提供商',
  'apiSettings.advancedSettings': '高级设置',
  'apiSettings.basicInfo': '基本信息',
  'apiSettings.endpointSettings': '端点设置',
  'apiSettings.timeout': '超时时间(秒)',
  'apiSettings.seconds': '秒',
  'apiSettings.timeoutHint': '请求超时时间,单位秒(默认:300)',
  'apiSettings.maxRetries': '最大重试次数',
  'apiSettings.maxRetriesHint': '失败后最大重试次数',
  'apiSettings.organization': '组织 ID',
  'apiSettings.organizationHint': 'OpenAI 组织 ID(org-...)',
  'apiSettings.apiVersion': 'API 版本',
  'apiSettings.apiVersionHint': 'Azure API 版本(如 2024-02-01)',
  'apiSettings.rpm': 'RPM 限制',
  'apiSettings.tpm': 'TPM 限制',
  'apiSettings.unlimited': '无限制',
  'apiSettings.proxy': '代理服务器',
  'apiSettings.proxyHint': 'HTTP 代理服务器 URL',
  'apiSettings.customHeaders': '自定义请求头',
  'apiSettings.customHeadersHint': '自定义 HTTP 请求头的 JSON 对象',
  'apiSettings.invalidJsonHeaders': '自定义请求头 JSON 格式无效',
  'apiSettings.searchProviders': '搜索供应商...',
  'apiSettings.selectProvider': '选择供应商',
  'apiSettings.selectProviderHint': '从列表中选择一个供应商来查看和管理其设置',
  'apiSettings.noProvidersFound': '未找到供应商',
  'apiSettings.llmModels': '大语言模型',
  'apiSettings.embeddingModels': '向量模型',
  'apiSettings.manageModels': '管理',
  'apiSettings.addModel': '添加模型',
  'apiSettings.multiKeySettings': '多密钥设置',
  'apiSettings.noModels': '暂无模型配置',
  'apiSettings.previewModel': '预览',
  'apiSettings.modelSettings': '模型设置',
  'apiSettings.deleteModel': '删除模型',
  'apiSettings.providerUpdated': '供应商已更新',
  'apiSettings.preview': '预览',
  'apiSettings.used': '已使用',
  'apiSettings.total': '总计',
  'apiSettings.testConnection': '测试连接',
  'apiSettings.endpointId': '端点 ID',
  'apiSettings.endpointIdHint': '用法: ccw cli -p "..." --model <端点ID>',
  'apiSettings.endpoints': '端点',
  'apiSettings.addEndpointHint': '创建用于 CLI 的自定义端点别名',
  'apiSettings.endpointModel': '模型',
  'apiSettings.selectEndpoint': '选择端点',
  'apiSettings.selectEndpointHint': '从列表中选择一个端点以查看或编辑其设置',
  'apiSettings.provider': '提供商',
  'apiSettings.model': '模型',
  'apiSettings.selectModel': '选择模型',
  'apiSettings.noModelsConfigured': '该供应商未配置模型',
  'apiSettings.cacheStrategy': '缓存策略',
  'apiSettings.enableContextCaching': '启用上下文缓存',
  'apiSettings.cacheTTL': 'TTL (分钟)',
@@ -2910,6 +3085,82 @@ const i18n = {
  'apiSettings.addProviderFirst': '请先添加提供商',
  'apiSettings.failedToLoad': '加载 API 设置失败',
  'apiSettings.toggleVisibility': '切换可见性',
  'apiSettings.noProvidersHint': '添加 API 提供商以开始使用',
  'apiSettings.noEndpointsHint': '创建自定义端点以快速访问模型',
  'apiSettings.cache': '缓存',
  'apiSettings.off': '关闭',
  'apiSettings.used': '已用',
  'apiSettings.total': '总计',
  'apiSettings.cacheUsage': '使用率',
  'apiSettings.cacheSize': '大小',
  'apiSettings.endpointsDescription': '管理自定义 API 端点以快速访问模型',
  'apiSettings.totalEndpoints': '总端点数',
  'apiSettings.cachedEndpoints': '缓存端点数',
  'apiSettings.cacheTabHint': '在主面板中配置全局缓存设置并查看统计信息',
  'apiSettings.cacheDescription': '管理响应缓存以提高性能并降低成本',
  'apiSettings.cachedEntries': '缓存条目',
  'apiSettings.storageUsed': '已用存储',
  'apiSettings.cacheActions': '缓存操作',
  'apiSettings.cacheStatistics': '缓存统计',
  'apiSettings.globalCache': '全局缓存',

  // Multi-key management
  'apiSettings.apiKeys': 'API 密钥',
  'apiSettings.addKey': '添加密钥',
  'apiSettings.keyLabel': '标签',
  'apiSettings.keyValue': 'API 密钥',
  'apiSettings.keyWeight': '权重',
  'apiSettings.removeKey': '移除',
  'apiSettings.noKeys': '未配置 API 密钥',
  'apiSettings.primaryKey': '主密钥',

  // Routing strategy
  'apiSettings.routingStrategy': '路由策略',
  'apiSettings.simpleShuffleRouting': '简单随机',
  'apiSettings.weightedRouting': '权重分配',
  'apiSettings.latencyRouting': '延迟优先',
  'apiSettings.costRouting': '成本优先',
  'apiSettings.leastBusyRouting': '最少并发',
  'apiSettings.routingHint': '如何在多个 API 密钥间分配请求',

  // Health check
  'apiSettings.healthCheck': '健康检查',
  'apiSettings.enableHealthCheck': '启用健康检查',
  'apiSettings.healthInterval': '检查间隔(秒)',
  'apiSettings.healthCooldown': '冷却时间(秒)',
  'apiSettings.failureThreshold': '失败阈值',
  'apiSettings.healthStatus': '状态',
  'apiSettings.healthy': '健康',
  'apiSettings.unhealthy': '异常',
  'apiSettings.unknown': '未知',
  'apiSettings.lastCheck': '最后检查',
  'apiSettings.testKey': '测试密钥',
  'apiSettings.testingKey': '测试中...',
  'apiSettings.keyValid': '密钥有效',
  'apiSettings.keyInvalid': '密钥无效',

  // Embedding models
  'apiSettings.embeddingDimensions': '向量维度',
  'apiSettings.embeddingMaxTokens': '最大 Token',
  'apiSettings.selectEmbeddingModel': '选择嵌入模型',

  // Model modal
  'apiSettings.addLlmModel': '添加 LLM 模型',
  'apiSettings.addEmbeddingModel': '添加嵌入模型',
  'apiSettings.modelId': '模型 ID',
  'apiSettings.modelName': '显示名称',
  'apiSettings.modelSeries': '模型系列',
  'apiSettings.selectFromPresets': '从预设选择',
  'apiSettings.customModel': '自定义模型',
  'apiSettings.capabilities': '能力',
  'apiSettings.streaming': '流式输出',
  'apiSettings.functionCalling': '函数调用',
  'apiSettings.vision': '视觉能力',
  'apiSettings.contextWindow': '上下文窗口',
  'apiSettings.description': '描述',
  'apiSettings.optional': '可选',
  'apiSettings.modelIdExists': '模型 ID 已存在',
  'apiSettings.useModelTreeToManage': '使用模型树管理各个模型',

  // Common
  'common.cancel': '取消',
@@ -2934,6 +3185,7 @@ const i18n = {
  'common.saveFailed': '保存失败',
  'common.unknownError': '未知错误',
  'common.exception': '异常',
  'common.status': '状态',

  // Core Memory
  'title.coreMemory': '核心记忆',
File diff suppressed because it is too large
@@ -810,8 +810,8 @@ function buildManualDownloadGuide() {
  '<i data-lucide="info" class="w-3.5 h-3.5 mt-0.5 flex-shrink-0"></i>' +
  '<div>' +
  '<strong>' + (t('codexlens.cacheLocation') || 'Cache Location') + ':</strong><br>' +
  '<code class="text-xs">Windows: %LOCALAPPDATA%\\Temp\\fastembed_cache</code><br>' +
  '<code class="text-xs">Linux/Mac: ~/.cache/fastembed</code>' +
  '<code class="text-xs">Default: ~/.cache/huggingface</code><br>' +
  '<code class="text-xs text-muted-foreground">(Check HF_HOME env var if set)</code>' +
  '</div>' +
  '</div>' +
  '</div>' +
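
The guide's cache-location hint reduces to a one-line lookup. A sketch, assuming only the behavior the guide states (`HF_HOME` overrides the default HuggingFace cache directory):

```typescript
import { homedir } from 'node:os';
import { join } from 'node:path';

// Resolve the model cache directory the download guide describes:
// HF_HOME wins when set; otherwise fall back to ~/.cache/huggingface.
function huggingfaceCacheDir(): string {
  return process.env.HF_HOME ?? join(homedir(), '.cache', 'huggingface');
}
```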
@@ -67,7 +67,7 @@ const ParamsSchema = z.object({
  model: z.string().optional(),
  cd: z.string().optional(),
  includeDirs: z.string().optional(),
  timeout: z.number().default(300000),
  timeout: z.number().default(0), // 0 = no internal timeout, controlled by external caller (e.g., bash timeout)
  resume: z.union([z.boolean(), z.string()]).optional(), // true = last, string = single ID or comma-separated IDs
  id: z.string().optional(), // Custom execution ID (e.g., IMPL-001-step1)
  noNative: z.boolean().optional(), // Force prompt concatenation instead of native resume
@@ -1058,8 +1058,10 @@ async function executeCliTool(
      reject(new Error(`Failed to spawn ${tool}: ${error.message}`));
    });

    // Timeout handling
    const timeoutId = setTimeout(() => {
    // Timeout handling (timeout=0 disables internal timeout, controlled by external caller)
    let timeoutId: NodeJS.Timeout | null = null;
    if (timeout > 0) {
      timeoutId = setTimeout(() => {
        timedOut = true;
        child.kill('SIGTERM');
        setTimeout(() => {
@@ -1068,9 +1070,12 @@ async function executeCliTool(
        }
      }, 5000);
      }, timeout);
    }

    child.on('close', () => {
      if (timeoutId) {
        clearTimeout(timeoutId);
      }
    });
  });
}
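
Extracted from the hunk above, the guarded-timer pattern stands alone: a zero timeout skips the internal timer entirely and leaves process lifetime to whatever wraps the call. A minimal sketch with illustrative names (not the codebase's actual helper):

```typescript
import { spawn } from 'node:child_process';

// timeoutMs = 0 means "no internal timer"; an external supervisor such as
// the shell's `timeout` command then decides when the child dies.
function runWithOptionalTimeout(cmd: string, args: string[], timeoutMs: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, { stdio: 'inherit' });
    let timer: NodeJS.Timeout | null = null;

    if (timeoutMs > 0) {
      timer = setTimeout(() => child.kill('SIGTERM'), timeoutMs);
    }

    child.on('error', reject);
    child.on('close', (code) => {
      if (timer) clearTimeout(timer); // always clear, even after a timeout kill
      resolve(code ?? 1);
    });
  });
}
```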
@@ -1115,8 +1120,8 @@ Modes:
  },
  timeout: {
    type: 'number',
    description: 'Timeout in milliseconds (default: 300000 = 5 minutes)',
    default: 300000
    description: 'Timeout in milliseconds (default: 0 = disabled, controlled by external caller)',
    default: 0
  }
},
required: ['tool', 'prompt']

@@ -6,17 +6,184 @@
 */

/**
 * Supported LLM provider types
 * API format types (simplified)
 * Most providers use OpenAI-compatible format
 */
export type ProviderType =
  | 'openai'
  | 'anthropic'
  | 'ollama'
  | 'azure'
  | 'google'
  | 'mistral'
  | 'deepseek'
  | 'custom';
  | 'openai'     // OpenAI-compatible format (most providers)
  | 'anthropic'  // Anthropic format
  | 'custom';    // Custom format

/**
 * Advanced provider settings for LiteLLM compatibility
 * Maps to LiteLLM's provider configuration options
 */
export interface ProviderAdvancedSettings {
  /** Request timeout in seconds (default: 300) */
  timeout?: number;

  /** Maximum retry attempts on failure (default: 3) */
  maxRetries?: number;

  /** Organization ID (OpenAI-specific) */
  organization?: string;

  /** API version string (Azure-specific, e.g., "2024-02-01") */
  apiVersion?: string;

  /** Custom HTTP headers as JSON object */
  customHeaders?: Record<string, string>;

  /** Requests per minute rate limit */
  rpm?: number;

  /** Tokens per minute rate limit */
  tpm?: number;

  /** Proxy server URL (e.g., "http://proxy.example.com:8080") */
  proxy?: string;
}

/**
 * Model type classification
 */
export type ModelType = 'llm' | 'embedding';

/**
 * Model capability metadata
 */
export interface ModelCapabilities {
  /** Whether the model supports streaming responses */
  streaming?: boolean;

  /** Whether the model supports function/tool calling */
  functionCalling?: boolean;

  /** Whether the model supports vision/image input */
  vision?: boolean;

  /** Context window size in tokens */
  contextWindow?: number;

  /** Embedding dimension (for embedding models only) */
  embeddingDimension?: number;

  /** Maximum output tokens */
  maxOutputTokens?: number;
}

/**
 * Routing strategy for load balancing across multiple keys
 */
export type RoutingStrategy =
  | 'simple-shuffle'  // Random selection (default, recommended)
  | 'weighted'        // Weight-based distribution
  | 'latency-based'   // Route to lowest latency
  | 'cost-based'      // Route to lowest cost
  | 'least-busy';     // Route to least concurrent

/**
 * Individual API key configuration with optional weight
 */
export interface ApiKeyEntry {
  /** Unique identifier */
  id: string;

  /** API key value or env var reference */
  key: string;

  /** Display label for this key */
  label?: string;

  /** Weight for weighted routing (default: 1) */
  weight?: number;

  /** Whether this key is enabled */
  enabled: boolean;

  /** Last health check status */
  healthStatus?: 'healthy' | 'unhealthy' | 'unknown';

  /** Last health check timestamp */
  lastHealthCheck?: string;

  /** Error message if unhealthy */
  lastError?: string;
}

/**
 * Health check configuration
 */
export interface HealthCheckConfig {
  /** Enable automatic health checks */
  enabled: boolean;

  /** Check interval in seconds (default: 300) */
  intervalSeconds: number;

  /** Cooldown period after failure in seconds (default: 5) */
  cooldownSeconds: number;

  /** Number of failures before marking unhealthy (default: 3) */
  failureThreshold: number;
}
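
To make the weight semantics concrete, here is an illustrative selection function over `ApiKeyEntry` — a sketch only, since in practice multi-key routing is presumably handled by LiteLLM's router rather than hand-rolled code like this:

```typescript
// Weighted random pick across enabled, non-unhealthy keys.
// weight defaults to 1, matching the ApiKeyEntry doc comment above.
function pickWeightedKey(keys: ApiKeyEntry[]): ApiKeyEntry | null {
  const candidates = keys.filter((k) => k.enabled && k.healthStatus !== 'unhealthy');
  if (candidates.length === 0) return null;

  const total = candidates.reduce((sum, k) => sum + (k.weight ?? 1), 0);
  let roll = Math.random() * total;
  for (const key of candidates) {
    roll -= key.weight ?? 1;
    if (roll <= 0) return key;
  }
  return candidates[candidates.length - 1]; // guard against float drift
}
```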

/**
 * Model-specific endpoint settings
 * Allows per-model configuration overrides
 */
export interface ModelEndpointSettings {
  /** Override base URL for this model */
  baseUrl?: string;

  /** Override timeout for this model */
  timeout?: number;

  /** Override max retries for this model */
  maxRetries?: number;

  /** Custom headers for this model */
  customHeaders?: Record<string, string>;

  /** Cache strategy for this model */
  cacheStrategy?: CacheStrategy;
}

/**
 * Model definition with type and grouping
 */
export interface ModelDefinition {
  /** Unique identifier for this model */
  id: string;

  /** Display name for UI */
  name: string;

  /** Model type: LLM or Embedding */
  type: ModelType;

  /** Model series for grouping (e.g., "GPT-4", "Claude-3") */
  series: string;

  /** Whether this model is enabled */
  enabled: boolean;

  /** Model capabilities */
  capabilities?: ModelCapabilities;

  /** Model-specific endpoint settings */
  endpointSettings?: ModelEndpointSettings;

  /** Optional description */
  description?: string;

  /** Creation timestamp (ISO 8601) */
  createdAt: string;

  /** Last update timestamp (ISO 8601) */
  updatedAt: string;
}

/**
 * Provider credential configuration
@@ -41,6 +208,24 @@ export interface ProviderCredential {
  /** Whether this provider is enabled */
  enabled: boolean;

  /** Advanced provider settings (optional) */
  advancedSettings?: ProviderAdvancedSettings;

  /** Multiple API keys for load balancing */
  apiKeys?: ApiKeyEntry[];

  /** Routing strategy for multi-key load balancing */
  routingStrategy?: RoutingStrategy;

  /** Health check configuration */
  healthCheck?: HealthCheckConfig;

  /** LLM models configured for this provider */
  llmModels?: ModelDefinition[];

  /** Embedding models configured for this provider */
  embeddingModels?: ModelDefinition[];

  /** Creation timestamp (ISO 8601) */
  createdAt: string;

@@ -309,7 +309,7 @@ def generate_embeddings(

        # Set/update model configuration for this index
        vector_store.set_model_config(
            model_profile, embedder.model_name, embedder.embedding_dim
            model_profile, embedder.model_name, embedder.embedding_dim, backend=embedding_backend
        )
        # Use bulk insert mode for efficient batch ANN index building
        # This defers ANN updates until end_bulk_insert() is called

@@ -107,8 +107,9 @@ def _get_model_cache_path(cache_dir: Path, info: Dict) -> Path:
        Path to the model cache directory
    """
    # HuggingFace Hub naming: models--{org}--{model}
    model_name = info["model_name"]
    sanitized_name = f"models--{model_name.replace('/', '--')}"
    # Use cache_name if available (for mapped ONNX models), else model_name
    target_name = info.get("cache_name", info["model_name"])
    sanitized_name = f"models--{target_name.replace('/', '--')}"
    return cache_dir / sanitized_name

@@ -260,7 +260,7 @@ class HybridSearchEngine:
            return []

        # Initialize embedder and vector store
        from codexlens.semantic.embedder import get_embedder
        from codexlens.semantic.factory import get_embedder
        from codexlens.semantic.vector_store import VectorStore

        vector_store = VectorStore(index_path)
@@ -277,32 +277,51 @@ class HybridSearchEngine:
        # Get stored model configuration (preferred) or auto-detect from dimension
        model_config = vector_store.get_model_config()
        if model_config:
            profile = model_config["model_profile"]
            backend = model_config.get("backend", "fastembed")
            model_name = model_config["model_name"]
            model_profile = model_config["model_profile"]
            self.logger.debug(
                "Using stored model config: %s (%s, %dd)",
                profile, model_config["model_name"], model_config["embedding_dim"]
                "Using stored model config: %s backend, %s (%s, %dd)",
                backend, model_profile, model_name, model_config["embedding_dim"]
            )

            # Get embedder based on backend
            if backend == "litellm":
                embedder = get_embedder(backend="litellm", model=model_name)
            else:
                embedder = get_embedder(backend="fastembed", profile=model_profile)
        else:
            # Fallback: auto-detect from embedding dimension
            detected_dim = vector_store.dimension
            if detected_dim is None:
                self.logger.info("Vector store dimension unknown, using default profile")
                profile = "code"  # Default fallback
                embedder = get_embedder(backend="fastembed", profile="code")
            elif detected_dim == 384:
                profile = "fast"
                embedder = get_embedder(backend="fastembed", profile="fast")
            elif detected_dim == 768:
                profile = "code"
                embedder = get_embedder(backend="fastembed", profile="code")
            elif detected_dim == 1024:
                profile = "multilingual"  # or balanced, both are 1024
            else:
                profile = "code"  # Default fallback
                self.logger.debug(
                    "No stored model config, auto-detected profile '%s' from dimension %s",
                    profile, detected_dim
                embedder = get_embedder(backend="fastembed", profile="multilingual")
            elif detected_dim == 1536:
                # Likely OpenAI text-embedding-3-small or ada-002
                self.logger.info(
                    "Detected 1536-dim embeddings (likely OpenAI), using litellm backend with text-embedding-3-small"
                )
                embedder = get_embedder(backend="litellm", model="text-embedding-3-small")
            elif detected_dim == 3072:
                # Likely OpenAI text-embedding-3-large
                self.logger.info(
                    "Detected 3072-dim embeddings (likely OpenAI), using litellm backend with text-embedding-3-large"
                )
                embedder = get_embedder(backend="litellm", model="text-embedding-3-large")
            else:
                self.logger.debug(
                    "Unknown dimension %s, using default fastembed profile 'code'",
                    detected_dim
                )
                embedder = get_embedder(backend="fastembed", profile="code")


        # Use cached embedder (singleton) for performance
        embedder = get_embedder(profile=profile)

        # Generate query embedding
        query_embedding = embedder.embed_single(query)

@@ -123,12 +123,34 @@ class VectorStore:
                    model_profile TEXT NOT NULL,
                    model_name TEXT NOT NULL,
                    embedding_dim INTEGER NOT NULL,
                    backend TEXT NOT NULL DEFAULT 'fastembed',
                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                )
            """)

            # Migration: Add backend column to existing tables
            self._migrate_backend_column(conn)

            conn.commit()

    def _migrate_backend_column(self, conn: sqlite3.Connection) -> None:
        """Add backend column to existing embeddings_config table if not present.

        Args:
            conn: Active SQLite connection
        """
        # Check if backend column exists
        cursor = conn.execute("PRAGMA table_info(embeddings_config)")
        columns = [row[1] for row in cursor.fetchall()]

        if 'backend' not in columns:
            logger.info("Migrating embeddings_config table: adding backend column")
            conn.execute("""
                ALTER TABLE embeddings_config
                ADD COLUMN backend TEXT NOT NULL DEFAULT 'fastembed'
            """)

    def _init_ann_index(self) -> None:
        """Initialize ANN index (lazy loading from existing data)."""
        if not HNSWLIB_AVAILABLE:
@@ -947,11 +969,11 @@ class VectorStore:
        """Get the model configuration used for embeddings in this store.

        Returns:
            Dictionary with model_profile, model_name, embedding_dim, or None if not set.
            Dictionary with model_profile, model_name, embedding_dim, backend, or None if not set.
        """
        with sqlite3.connect(self.db_path) as conn:
            row = conn.execute(
                "SELECT model_profile, model_name, embedding_dim, created_at, updated_at "
                "SELECT model_profile, model_name, embedding_dim, backend, created_at, updated_at "
                "FROM embeddings_config WHERE id = 1"
            ).fetchone()
            if row:
@@ -959,13 +981,14 @@ class VectorStore:
                    "model_profile": row[0],
                    "model_name": row[1],
                    "embedding_dim": row[2],
                    "created_at": row[3],
                    "updated_at": row[4],
                    "backend": row[3],
                    "created_at": row[4],
                    "updated_at": row[5],
                }
        return None

    def set_model_config(
        self, model_profile: str, model_name: str, embedding_dim: int
        self, model_profile: str, model_name: str, embedding_dim: int, backend: str = 'fastembed'
    ) -> None:
        """Set the model configuration for embeddings in this store.

@@ -976,19 +999,21 @@ class VectorStore:
            model_profile: Model profile name (fast, code, minilm, etc.)
            model_name: Full model name (e.g., jinaai/jina-embeddings-v2-base-code)
            embedding_dim: Embedding dimension (e.g., 768)
            backend: Backend used for embeddings (fastembed or litellm, default: fastembed)
        """
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                """
                INSERT INTO embeddings_config (id, model_profile, model_name, embedding_dim)
                VALUES (1, ?, ?, ?)
                INSERT INTO embeddings_config (id, model_profile, model_name, embedding_dim, backend)
                VALUES (1, ?, ?, ?, ?)
                ON CONFLICT(id) DO UPDATE SET
                    model_profile = excluded.model_profile,
                    model_name = excluded.model_name,
                    embedding_dim = excluded.embedding_dim,
                    backend = excluded.backend,
                    updated_at = CURRENT_TIMESTAMP
                """,
                (model_profile, model_name, embedding_dim, backend)
            )
            conn.commit()