feat: Enhance embedding management and model configuration

- Updated embedding_manager.py to include backend parameter in model configuration.
- Modified model_manager.py to utilize cache_name for ONNX models.
- Refactored hybrid_search.py to improve embedder initialization based on backend type.
- Added backend column to vector_store.py for better model configuration management.
- Implemented migration for existing database to include backend information.
- Enhanced API settings implementation with comprehensive provider and endpoint management.
- Introduced LiteLLM integration guide detailing configuration and usage.
- Added examples for LiteLLM usage in TypeScript.
catlog22
2025-12-24 14:03:59 +08:00
parent 9b926d1a1e
commit b00113d212
22 changed files with 5507 additions and 706 deletions

ccw/LITELLM_INTEGRATION.md (new file, 308 lines)

@@ -0,0 +1,308 @@
# LiteLLM Integration Guide
## Overview
CCW now supports custom LiteLLM endpoints with integrated context caching. You can configure multiple providers (OpenAI, Anthropic, Ollama, etc.) and create custom endpoints with file-based caching strategies.
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ CLI Executor │
│ │
│ ┌─────────────┐ ┌──────────────────────────────┐ │
│ │ --model │────────>│ Route Decision: │ │
│ │ flag │ │ - gemini/qwen/codex → CLI │ │
│ └─────────────┘ │ - custom ID → LiteLLM │ │
│ └──────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ LiteLLM Executor │
│ │
│ 1. Load endpoint config (litellm-api-config.json) │
│ 2. Extract @patterns from prompt │
│ 3. Pack files via context-cache │
│ 4. Call LiteLLM client with cached content + prompt │
│ 5. Return result │
└─────────────────────────────────────────────────────────────┘
```
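In code, the route decision in the first box might look like the following sketch. The built-in tool names are taken from the diagram; the function name and shape are assumptions, not the actual executor code:

```typescript
// Sketch of the CLI routing decision shown in the architecture diagram.
// Built-in tools route to their native CLI; anything else is treated as a
// custom LiteLLM endpoint ID. Names here are illustrative.
const BUILTIN_CLI_TOOLS = new Set(['gemini', 'qwen', 'codex']);

function routeModel(model: string): 'cli' | 'litellm' {
  return BUILTIN_CLI_TOOLS.has(model) ? 'cli' : 'litellm';
}
```

So `--model gemini` stays on the native CLI path, while `--model my-gpt4o` falls through to the LiteLLM executor.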
## Configuration
### File Location
Configuration is stored per-project:
```
<project>/.ccw/storage/config/litellm-api-config.json
```
### Configuration Structure
```json
{
  "version": 1,
  "providers": [
    {
      "id": "openai-1234567890",
      "name": "My OpenAI",
      "type": "openai",
      "apiKey": "${OPENAI_API_KEY}",
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "endpoints": [
    {
      "id": "my-gpt4o",
      "name": "GPT-4o with Context Cache",
      "providerId": "openai-1234567890",
      "model": "gpt-4o",
      "description": "GPT-4o with automatic file caching",
      "cacheStrategy": {
        "enabled": true,
        "ttlMinutes": 60,
        "maxSizeKB": 512,
        "filePatterns": ["*.md", "*.ts", "*.js"]
      },
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "defaultEndpoint": "my-gpt4o",
  "globalCacheSettings": {
    "enabled": true,
    "cacheDir": "~/.ccw/cache/context",
    "maxTotalSizeMB": 100
  }
}
```
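For illustration, a minimal type guard for this shape, trimmed to the required top-level keys; the name `isLiteLLMConfig` is not from the codebase:

```typescript
// Sketch: validate the essential top-level shape of litellm-api-config.json
// as documented above (version, providers, endpoints). Illustrative only.
function isLiteLLMConfig(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const cfg = value as Record<string, unknown>;
  return (
    typeof cfg.version === 'number' &&
    Array.isArray(cfg.providers) &&
    Array.isArray(cfg.endpoints)
  );
}
```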
## Usage
### Via CLI
```bash
# Use custom endpoint with --model flag
ccw cli -p "Analyze authentication flow" --tool litellm --model my-gpt4o
# With context patterns (automatically cached)
ccw cli -p "@src/auth/**/*.ts Review security" --tool litellm --model my-gpt4o
# Disable caching for specific call
ccw cli -p "Quick question" --tool litellm --model my-gpt4o --no-cache
```
### Via Dashboard API
#### Create Provider
```bash
curl -X POST http://localhost:3000/api/litellm-api/providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My OpenAI",
    "type": "openai",
    "apiKey": "${OPENAI_API_KEY}",
    "enabled": true
  }'
```
#### Create Endpoint
```bash
curl -X POST http://localhost:3000/api/litellm-api/endpoints \
  -H "Content-Type: application/json" \
  -d '{
    "id": "my-gpt4o",
    "name": "GPT-4o with Cache",
    "providerId": "openai-1234567890",
    "model": "gpt-4o",
    "cacheStrategy": {
      "enabled": true,
      "ttlMinutes": 60,
      "maxSizeKB": 512,
      "filePatterns": ["*.md", "*.ts"]
    },
    "enabled": true
  }'
```
#### Test Provider Connection
```bash
curl -X POST http://localhost:3000/api/litellm-api/providers/openai-1234567890/test
```
## Context Caching
### How It Works
1. **Pattern Detection**: LiteLLM executor scans the prompt for `@patterns`
   ```
   @src/**/*.ts
   @CLAUDE.md
   @../shared/**/*
   ```
2. **File Packing**: Files matching the patterns are packed via the `context-cache` tool
   - Respects the `max_file_size` limit (default: 1MB per file)
   - Applies TTL from the endpoint config
   - Generates a session ID for retrieval
3. **Cache Integration**: Cached content is prepended to the prompt
   ```
   <cached files>
   ---
   <original prompt>
   ```
4. **LLM Call**: The combined prompt is sent to LiteLLM with provider credentials
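The pattern-detection step can be sketched as follows; this is a minimal illustration of the assumed behavior, and the real `extractPatterns()` in `litellm-executor.ts` may handle more edge cases:

```typescript
// Minimal sketch of @pattern extraction (assumed behavior; the real
// extractPatterns() in litellm-executor.ts may differ).
function extractPatterns(prompt: string): string[] {
  // Match tokens that start with '@' and run until the next whitespace.
  const matches = prompt.match(/@[^\s]+/g) ?? [];
  // Strip the leading '@' so each result is a plain glob pattern.
  return matches.map((token) => token.slice(1));
}
```

For example, `extractPatterns('@src/auth/**/*.ts Review security')` yields `['src/auth/**/*.ts']`.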
### Cache Strategy Configuration
```typescript
interface CacheStrategy {
  enabled: boolean;        // Enable/disable caching for this endpoint
  ttlMinutes: number;      // Cache lifetime (default: 60)
  maxSizeKB: number;       // Max cache size (default: 512KB)
  filePatterns: string[];  // Glob patterns to cache
}
```
### Example: Security Audit with Cache
```bash
ccw cli -p "
PURPOSE: OWASP Top 10 security audit of authentication module
TASK: • Check SQL injection • Verify session management • Test XSS vectors
CONTEXT: @src/auth/**/*.ts @src/middleware/auth.ts
EXPECTED: Security report with severity levels and remediation steps
" --tool litellm --model my-security-scanner --mode analysis
```
**What happens:**
1. Executor detects `@src/auth/**/*.ts` and `@src/middleware/auth.ts`
2. Packs matching files into context cache
3. Cache entry valid for 60 minutes (per endpoint config)
4. Subsequent calls reuse cached files (no re-packing)
5. LiteLLM receives full context without manual file specification
## Environment Variables
### Provider API Keys
LiteLLM uses standard environment variable names:
| Provider | Env Var Name |
|------------|-----------------------|
| OpenAI | `OPENAI_API_KEY` |
| Anthropic | `ANTHROPIC_API_KEY` |
| Google | `GOOGLE_API_KEY` |
| Azure | `AZURE_API_KEY` |
| Mistral | `MISTRAL_API_KEY` |
| DeepSeek | `DEEPSEEK_API_KEY` |
### Configuration Syntax
Use `${ENV_VAR}` syntax in config:
```json
{
"apiKey": "${OPENAI_API_KEY}"
}
```
The executor resolves these at runtime via `resolveEnvVar()`.
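Assuming `resolveEnvVar()` simply substitutes `${NAME}` placeholders from `process.env`, a minimal sketch looks like this (the actual implementation may differ in error handling):

```typescript
// Hedged sketch of ${ENV_VAR} resolution; the real resolveEnvVar() in the
// executor may differ (e.g. fallbacks instead of throwing).
function resolveEnvVar(value: string): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name) => {
    const resolved = process.env[name];
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}
```

Plain strings without a `${...}` placeholder pass through unchanged, so literal API keys in the config also work.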
## API Reference
### Config Manager (`litellm-api-config-manager.ts`)
#### Provider Management
```typescript
getAllProviders(baseDir: string): ProviderCredential[]
getProvider(baseDir: string, providerId: string): ProviderCredential | null
getProviderWithResolvedEnvVars(baseDir: string, providerId: string): ProviderCredential & { resolvedApiKey: string } | null
addProvider(baseDir: string, providerData): ProviderCredential
updateProvider(baseDir: string, providerId: string, updates): ProviderCredential
deleteProvider(baseDir: string, providerId: string): boolean
```
#### Endpoint Management
```typescript
getAllEndpoints(baseDir: string): CustomEndpoint[]
getEndpoint(baseDir: string, endpointId: string): CustomEndpoint | null
findEndpointById(baseDir: string, endpointId: string): CustomEndpoint | null
addEndpoint(baseDir: string, endpointData): CustomEndpoint
updateEndpoint(baseDir: string, endpointId: string, updates): CustomEndpoint
deleteEndpoint(baseDir: string, endpointId: string): boolean
```
### Executor (`litellm-executor.ts`)
```typescript
interface LiteLLMExecutionOptions {
  prompt: string;
  endpointId: string;
  baseDir: string;
  cwd?: string;
  includeDirs?: string[];
  enableCache?: boolean;
  onOutput?: (data: { type: string; data: string }) => void;
}

interface LiteLLMExecutionResult {
  success: boolean;
  output: string;
  model: string;
  provider: string;
  cacheUsed: boolean;
  cachedFiles?: string[];
  error?: string;
}

executeLiteLLMEndpoint(options: LiteLLMExecutionOptions): Promise<LiteLLMExecutionResult>
extractPatterns(prompt: string): string[]
```
## Dashboard Integration
The dashboard provides UI for managing LiteLLM configuration:
- **Providers**: Add/edit/delete provider credentials
- **Endpoints**: Configure custom endpoints with cache strategies
- **Cache Stats**: View cache usage and clear entries
- **Test Connections**: Verify provider API access
Routes are handled by `litellm-api-routes.ts`.
## Limitations
1. **Python Dependency**: Requires the `ccw-litellm` Python package to be installed
2. **Model Support**: Limited to models supported by the LiteLLM library
3. **Cache Scope**: Context cache is in-memory (not persisted across restarts)
4. **Pattern Syntax**: Only glob-style `@patterns` are supported, not regex
## Troubleshooting
### Error: "Endpoint not found"
- Verify endpoint ID matches config file
- Check `litellm-api-config.json` exists in `.ccw/storage/config/`
### Error: "API key not configured"
- Ensure environment variable is set
- Verify `${ENV_VAR}` syntax in config
- Test with `echo $OPENAI_API_KEY`
### Error: "Failed to spawn Python process"
- Install ccw-litellm: `pip install ccw-litellm`
- Verify Python is accessible: `python --version`
### Cache Not Applied
- Check endpoint has `cacheStrategy.enabled: true`
- Verify prompt contains `@patterns`
- Check cache TTL hasn't expired
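The TTL check behind the last point amounts to a simple timestamp comparison; the names here are illustrative, not taken from the cache implementation:

```typescript
// Illustrative TTL expiry check: an entry created at createdAtMs with a
// ttlMinutes lifetime is stale once more than that interval has elapsed.
function isCacheEntryExpired(
  createdAtMs: number,
  ttlMinutes: number,
  nowMs: number = Date.now()
): boolean {
  return nowMs - createdAtMs > ttlMinutes * 60_000;
}
```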
## Examples
See `examples/litellm-config.json` for complete configuration template.


@@ -0,0 +1,77 @@
/**
 * LiteLLM Usage Examples
 * Demonstrates how to use the LiteLLM TypeScript client
 */
import { getLiteLLMClient, getLiteLLMStatus } from '../src/tools/litellm-client';

async function main() {
  console.log('=== LiteLLM TypeScript Bridge Examples ===\n');

  // Example 1: Check availability
  console.log('1. Checking LiteLLM availability...');
  const status = await getLiteLLMStatus();
  console.log('   Status:', status);
  console.log('');

  if (!status.available) {
    console.log('❌ LiteLLM is not available. Please install ccw-litellm:');
    console.log('   pip install ccw-litellm');
    return;
  }

  const client = getLiteLLMClient();

  // Example 2: Get configuration
  console.log('2. Getting configuration...');
  try {
    const config = await client.getConfig();
    console.log('   Config:', config);
  } catch (error) {
    console.log('   Error:', (error as Error).message);
  }
  console.log('');

  // Example 3: Generate embeddings
  console.log('3. Generating embeddings...');
  try {
    const texts = ['Hello world', 'Machine learning is amazing'];
    const embedResult = await client.embed(texts, 'default');
    console.log('   Dimensions:', embedResult.dimensions);
    console.log('   Vectors count:', embedResult.vectors.length);
    console.log('   First vector (first 5 dims):', embedResult.vectors[0]?.slice(0, 5));
  } catch (error) {
    console.log('   Error:', (error as Error).message);
  }
  console.log('');

  // Example 4: Single message chat
  console.log('4. Single message chat...');
  try {
    const response = await client.chat('What is 2+2?', 'default');
    console.log('   Response:', response);
  } catch (error) {
    console.log('   Error:', (error as Error).message);
  }
  console.log('');

  // Example 5: Multi-turn chat
  console.log('5. Multi-turn chat...');
  try {
    const chatResponse = await client.chatMessages([
      { role: 'system', content: 'You are a helpful math tutor.' },
      { role: 'user', content: 'What is the Pythagorean theorem?' }
    ], 'default');
    console.log('   Content:', chatResponse.content);
    console.log('   Model:', chatResponse.model);
    console.log('   Usage:', chatResponse.usage);
  } catch (error) {
    console.log('   Error:', (error as Error).message);
  }
  console.log('');

  console.log('=== Examples completed ===');
}

// Run examples
main().catch(console.error);


@@ -855,7 +855,7 @@ export async function cliCommand(
console.log(chalk.gray(' --model <model> Model override'));
console.log(chalk.gray(' --cd <path> Working directory'));
console.log(chalk.gray(' --includeDirs <dirs> Additional directories'));
console.log(chalk.gray(' --timeout <ms> Timeout (default: 300000)'));
console.log(chalk.gray(' --timeout <ms> Timeout (default: 0=disabled)'));
console.log(chalk.gray(' --resume [id] Resume previous session'));
console.log(chalk.gray(' --cache <items> Cache: comma-separated @patterns and text'));
console.log(chalk.gray(' --inject-mode <m> Inject mode: none, full, progressive'));


@@ -6,7 +6,7 @@
import chalk from 'chalk';
import { existsSync, readFileSync, writeFileSync, mkdirSync } from 'fs';
import { join, dirname } from 'path';
import { tmpdir } from 'os';
import { homedir } from 'os';
interface HookOptions {
stdin?: boolean;
@@ -53,9 +53,10 @@ async function readStdin(): Promise<string> {
/**
* Get session state file path
* Uses ~/.claude/.ccw-sessions/ for reliable persistence across sessions
*/
function getSessionStateFile(sessionId: string): string {
const stateDir = join(tmpdir(), '.ccw-sessions');
const stateDir = join(homedir(), '.claude', '.ccw-sessions');
if (!existsSync(stateDir)) {
mkdirSync(stateDir, { recursive: true });
}


@@ -0,0 +1,441 @@
/**
* LiteLLM API Config Manager
* Manages provider credentials, endpoint configurations, and model discovery
*/
import { join } from 'path';
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs';
import { homedir } from 'os';
// ===========================
// Type Definitions
// ===========================
export type ProviderType =
| 'openai'
| 'anthropic'
| 'google'
| 'cohere'
| 'azure'
| 'bedrock'
| 'vertexai'
| 'huggingface'
| 'ollama'
| 'custom';
export interface ProviderCredential {
id: string;
name: string;
type: ProviderType;
apiKey?: string;
baseUrl?: string;
apiVersion?: string;
region?: string;
projectId?: string;
organizationId?: string;
enabled: boolean;
metadata?: Record<string, any>;
createdAt: string;
updatedAt: string;
}
export interface EndpointConfig {
id: string;
name: string;
providerId: string;
model: string;
alias?: string;
temperature?: number;
maxTokens?: number;
topP?: number;
enabled: boolean;
metadata?: Record<string, any>;
createdAt: string;
updatedAt: string;
}
export interface ModelInfo {
id: string;
name: string;
provider: ProviderType;
contextWindow: number;
supportsFunctions: boolean;
supportsStreaming: boolean;
inputCostPer1k?: number;
outputCostPer1k?: number;
}
export interface LiteLLMApiConfig {
version: string;
providers: ProviderCredential[];
endpoints: EndpointConfig[];
}
// ===========================
// Model Definitions
// ===========================
export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
openai: [
{
id: 'gpt-4-turbo',
name: 'GPT-4 Turbo',
provider: 'openai',
contextWindow: 128000,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.01,
outputCostPer1k: 0.03,
},
{
id: 'gpt-4',
name: 'GPT-4',
provider: 'openai',
contextWindow: 8192,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.03,
outputCostPer1k: 0.06,
},
{
id: 'gpt-3.5-turbo',
name: 'GPT-3.5 Turbo',
provider: 'openai',
contextWindow: 16385,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.0005,
outputCostPer1k: 0.0015,
},
],
anthropic: [
{
id: 'claude-3-opus-20240229',
name: 'Claude 3 Opus',
provider: 'anthropic',
contextWindow: 200000,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.015,
outputCostPer1k: 0.075,
},
{
id: 'claude-3-sonnet-20240229',
name: 'Claude 3 Sonnet',
provider: 'anthropic',
contextWindow: 200000,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.003,
outputCostPer1k: 0.015,
},
{
id: 'claude-3-haiku-20240307',
name: 'Claude 3 Haiku',
provider: 'anthropic',
contextWindow: 200000,
supportsFunctions: true,
supportsStreaming: true,
inputCostPer1k: 0.00025,
outputCostPer1k: 0.00125,
},
],
google: [
{
id: 'gemini-pro',
name: 'Gemini Pro',
provider: 'google',
contextWindow: 32768,
supportsFunctions: true,
supportsStreaming: true,
},
{
id: 'gemini-pro-vision',
name: 'Gemini Pro Vision',
provider: 'google',
contextWindow: 16384,
supportsFunctions: false,
supportsStreaming: true,
},
],
cohere: [
{
id: 'command',
name: 'Command',
provider: 'cohere',
contextWindow: 4096,
supportsFunctions: false,
supportsStreaming: true,
},
{
id: 'command-light',
name: 'Command Light',
provider: 'cohere',
contextWindow: 4096,
supportsFunctions: false,
supportsStreaming: true,
},
],
azure: [],
bedrock: [],
vertexai: [],
huggingface: [],
ollama: [],
custom: [],
};
// ===========================
// Config File Management
// ===========================
const CONFIG_DIR = join(homedir(), '.claude', 'litellm');
const CONFIG_FILE = join(CONFIG_DIR, 'config.json');
function ensureConfigDir(): void {
if (!existsSync(CONFIG_DIR)) {
mkdirSync(CONFIG_DIR, { recursive: true });
}
}
function loadConfig(): LiteLLMApiConfig {
ensureConfigDir();
if (!existsSync(CONFIG_FILE)) {
const defaultConfig: LiteLLMApiConfig = {
version: '1.0.0',
providers: [],
endpoints: [],
};
saveConfig(defaultConfig);
return defaultConfig;
}
try {
const content = readFileSync(CONFIG_FILE, 'utf-8');
return JSON.parse(content);
} catch (err) {
throw new Error(`Failed to load config: ${(err as Error).message}`);
}
}
function saveConfig(config: LiteLLMApiConfig): void {
ensureConfigDir();
try {
writeFileSync(CONFIG_FILE, JSON.stringify(config, null, 2), 'utf-8');
} catch (err) {
throw new Error(`Failed to save config: ${(err as Error).message}`);
}
}
// ===========================
// Provider Management
// ===========================
export function getAllProviders(): ProviderCredential[] {
const config = loadConfig();
return config.providers;
}
export function getProvider(id: string): ProviderCredential | null {
const config = loadConfig();
return config.providers.find((p) => p.id === id) || null;
}
export function createProvider(
data: Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>
): ProviderCredential {
const config = loadConfig();
const now = new Date().toISOString();
const provider: ProviderCredential = {
...data,
id: `provider-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`,
createdAt: now,
updatedAt: now,
};
config.providers.push(provider);
saveConfig(config);
return provider;
}
export function updateProvider(
id: string,
updates: Partial<ProviderCredential>
): ProviderCredential | null {
const config = loadConfig();
const index = config.providers.findIndex((p) => p.id === id);
if (index === -1) {
return null;
}
const updated: ProviderCredential = {
...config.providers[index],
...updates,
id,
updatedAt: new Date().toISOString(),
};
config.providers[index] = updated;
saveConfig(config);
return updated;
}
export function deleteProvider(id: string): { success: boolean } {
const config = loadConfig();
const index = config.providers.findIndex((p) => p.id === id);
if (index === -1) {
return { success: false };
}
config.providers.splice(index, 1);
// Also delete endpoints using this provider
config.endpoints = config.endpoints.filter((e) => e.providerId !== id);
saveConfig(config);
return { success: true };
}
export async function testProviderConnection(
providerId: string
): Promise<{ success: boolean; error?: string }> {
const provider = getProvider(providerId);
if (!provider) {
return { success: false, error: 'Provider not found' };
}
if (!provider.enabled) {
return { success: false, error: 'Provider is disabled' };
}
// Basic validation
if (!provider.apiKey && provider.type !== 'ollama' && provider.type !== 'custom') {
return { success: false, error: 'API key is required for this provider type' };
}
// TODO: Implement actual provider connection testing using litellm-client
// For now, just validate the configuration
return { success: true };
}
// ===========================
// Endpoint Management
// ===========================
export function getAllEndpoints(): EndpointConfig[] {
const config = loadConfig();
return config.endpoints;
}
export function getEndpoint(id: string): EndpointConfig | null {
const config = loadConfig();
return config.endpoints.find((e) => e.id === id) || null;
}
export function createEndpoint(
data: Omit<EndpointConfig, 'id' | 'createdAt' | 'updatedAt'>
): EndpointConfig {
const config = loadConfig();
// Validate provider exists
const provider = config.providers.find((p) => p.id === data.providerId);
if (!provider) {
throw new Error('Provider not found');
}
const now = new Date().toISOString();
const endpoint: EndpointConfig = {
...data,
id: `endpoint-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`,
createdAt: now,
updatedAt: now,
};
config.endpoints.push(endpoint);
saveConfig(config);
return endpoint;
}
export function updateEndpoint(
id: string,
updates: Partial<EndpointConfig>
): EndpointConfig | null {
const config = loadConfig();
const index = config.endpoints.findIndex((e) => e.id === id);
if (index === -1) {
return null;
}
// Validate provider if being updated
if (updates.providerId) {
const provider = config.providers.find((p) => p.id === updates.providerId);
if (!provider) {
throw new Error('Provider not found');
}
}
const updated: EndpointConfig = {
...config.endpoints[index],
...updates,
id,
updatedAt: new Date().toISOString(),
};
config.endpoints[index] = updated;
saveConfig(config);
return updated;
}
export function deleteEndpoint(id: string): { success: boolean } {
const config = loadConfig();
const index = config.endpoints.findIndex((e) => e.id === id);
if (index === -1) {
return { success: false };
}
config.endpoints.splice(index, 1);
saveConfig(config);
return { success: true };
}
// ===========================
// Model Discovery
// ===========================
export function getModelsForProviderType(providerType: ProviderType): ModelInfo[] | null {
return PROVIDER_MODELS[providerType] || null;
}
export function getAllModels(): Record<ProviderType, ModelInfo[]> {
return PROVIDER_MODELS;
}
// ===========================
// Config Access
// ===========================
export function getFullConfig(): LiteLLMApiConfig {
return loadConfig();
}
export function resetConfig(): void {
const defaultConfig: LiteLLMApiConfig = {
version: '1.0.0',
providers: [],
endpoints: [],
};
saveConfig(defaultConfig);
}


@@ -25,10 +25,33 @@ export interface ModelInfo {
}
/**
* Predefined models for each provider
* Embedding model information metadata
*/
export interface EmbeddingModelInfo {
/** Model identifier (used in API calls) */
id: string;
/** Human-readable display name */
name: string;
/** Embedding dimensions */
dimensions: number;
/** Maximum input tokens */
maxTokens: number;
/** Provider identifier */
provider: string;
}
/**
* Predefined models for each API format
* Used for UI selection and validation
* Note: Most providers use OpenAI-compatible format
*/
export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
// OpenAI-compatible format (used by OpenAI, DeepSeek, Ollama, etc.)
openai: [
{
id: 'gpt-4o',
@@ -49,19 +72,32 @@ export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
supportsCaching: true
},
{
id: 'o1-mini',
name: 'O1 Mini',
contextWindow: 128000,
supportsCaching: true
id: 'deepseek-chat',
name: 'DeepSeek Chat',
contextWindow: 64000,
supportsCaching: false
},
{
id: 'gpt-4-turbo',
name: 'GPT-4 Turbo',
id: 'deepseek-coder',
name: 'DeepSeek Coder',
contextWindow: 64000,
supportsCaching: false
},
{
id: 'llama3.2',
name: 'Llama 3.2',
contextWindow: 128000,
supportsCaching: false
},
{
id: 'qwen2.5-coder',
name: 'Qwen 2.5 Coder',
contextWindow: 32000,
supportsCaching: false
}
],
// Anthropic format
anthropic: [
{
id: 'claude-sonnet-4-20250514',
@@ -89,135 +125,7 @@ export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
}
],
ollama: [
{
id: 'llama3.2',
name: 'Llama 3.2',
contextWindow: 128000,
supportsCaching: false
},
{
id: 'llama3.1',
name: 'Llama 3.1',
contextWindow: 128000,
supportsCaching: false
},
{
id: 'qwen2.5-coder',
name: 'Qwen 2.5 Coder',
contextWindow: 32000,
supportsCaching: false
},
{
id: 'codellama',
name: 'Code Llama',
contextWindow: 16000,
supportsCaching: false
},
{
id: 'mistral',
name: 'Mistral',
contextWindow: 32000,
supportsCaching: false
}
],
azure: [
{
id: 'gpt-4o',
name: 'GPT-4o (Azure)',
contextWindow: 128000,
supportsCaching: true
},
{
id: 'gpt-4o-mini',
name: 'GPT-4o Mini (Azure)',
contextWindow: 128000,
supportsCaching: true
},
{
id: 'gpt-4-turbo',
name: 'GPT-4 Turbo (Azure)',
contextWindow: 128000,
supportsCaching: false
},
{
id: 'gpt-35-turbo',
name: 'GPT-3.5 Turbo (Azure)',
contextWindow: 16000,
supportsCaching: false
}
],
google: [
{
id: 'gemini-2.0-flash-exp',
name: 'Gemini 2.0 Flash Experimental',
contextWindow: 1048576,
supportsCaching: true
},
{
id: 'gemini-1.5-pro',
name: 'Gemini 1.5 Pro',
contextWindow: 2097152,
supportsCaching: true
},
{
id: 'gemini-1.5-flash',
name: 'Gemini 1.5 Flash',
contextWindow: 1048576,
supportsCaching: true
},
{
id: 'gemini-1.0-pro',
name: 'Gemini 1.0 Pro',
contextWindow: 32000,
supportsCaching: false
}
],
mistral: [
{
id: 'mistral-large-latest',
name: 'Mistral Large',
contextWindow: 128000,
supportsCaching: false
},
{
id: 'mistral-medium-latest',
name: 'Mistral Medium',
contextWindow: 32000,
supportsCaching: false
},
{
id: 'mistral-small-latest',
name: 'Mistral Small',
contextWindow: 32000,
supportsCaching: false
},
{
id: 'codestral-latest',
name: 'Codestral',
contextWindow: 32000,
supportsCaching: false
}
],
deepseek: [
{
id: 'deepseek-chat',
name: 'DeepSeek Chat',
contextWindow: 64000,
supportsCaching: false
},
{
id: 'deepseek-coder',
name: 'DeepSeek Coder',
contextWindow: 64000,
supportsCaching: false
}
],
// Custom format
custom: [
{
id: 'custom-model',
@@ -237,6 +145,61 @@ export function getModelsForProvider(providerType: ProviderType): ModelInfo[] {
return PROVIDER_MODELS[providerType] || [];
}
/**
* Predefined embedding models for each API format
* Used for UI selection and validation
*/
export const EMBEDDING_MODELS: Record<ProviderType, EmbeddingModelInfo[]> = {
// OpenAI embedding models
openai: [
{
id: 'text-embedding-3-small',
name: 'Text Embedding 3 Small',
dimensions: 1536,
maxTokens: 8191,
provider: 'openai'
},
{
id: 'text-embedding-3-large',
name: 'Text Embedding 3 Large',
dimensions: 3072,
maxTokens: 8191,
provider: 'openai'
},
{
id: 'text-embedding-ada-002',
name: 'Ada 002',
dimensions: 1536,
maxTokens: 8191,
provider: 'openai'
}
],
// Anthropic doesn't have embedding models
anthropic: [],
// Custom embedding models
custom: [
{
id: 'custom-embedding',
name: 'Custom Embedding',
dimensions: 1536,
maxTokens: 8192,
provider: 'custom'
}
]
};
/**
* Get embedding models for a specific provider
* @param providerType - Provider type to get embedding models for
* @returns Array of embedding model information
*/
export function getEmbeddingModelsForProvider(providerType: ProviderType): EmbeddingModelInfo[] {
return EMBEDDING_MODELS[providerType] || [];
}
/**
* Get model information by ID within a provider
* @param providerType - Provider type


@@ -181,29 +181,13 @@ function deleteHookFromSettings(projectPath, scope, event, hookIndex) {
}
// ========================================
// Session State Tracking (for progressive disclosure)
// Session State Tracking
// ========================================
// Track sessions that have received startup context
// Key: sessionId, Value: timestamp of first context load
const sessionContextState = new Map<string, {
firstLoad: string;
loadCount: number;
lastPrompt?: string;
}>();
// Cleanup old sessions (older than 24 hours)
function cleanupOldSessions() {
const cutoff = Date.now() - 24 * 60 * 60 * 1000;
for (const [sessionId, state] of sessionContextState.entries()) {
if (new Date(state.firstLoad).getTime() < cutoff) {
sessionContextState.delete(sessionId);
}
}
}
// Run cleanup every hour
setInterval(cleanupOldSessions, 60 * 60 * 1000);
// NOTE: Session state is managed by the CLI command (src/commands/hook.ts)
// using file-based persistence (~/.claude/.ccw-sessions/).
// This ensures consistent state tracking across all invocation methods.
// The /api/hook endpoint delegates to SessionClusteringService without
// managing its own state, as the authoritative state lives in the CLI layer.
// ========================================
// Route Handler
@@ -286,7 +270,8 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
}
// API: Unified Session Context endpoint (Progressive Disclosure)
// Automatically detects first prompt vs subsequent prompts
// DEPRECATED: Use CLI command `ccw hook session-context --stdin` instead.
// This endpoint now uses file-based state (shared with CLI) for consistency.
// - First prompt: returns cluster-based session overview
// - Subsequent prompts: returns intent-matched sessions based on prompt
if (pathname === '/api/hook/session-context' && req.method === 'POST') {
@@ -306,21 +291,30 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
const { SessionClusteringService } = await import('../session-clustering-service.js');
const clusteringService = new SessionClusteringService(projectPath);
// Check if this is the first prompt for this session
const existingState = sessionContextState.get(sessionId);
// Use file-based session state (shared with CLI hook.ts)
const sessionStateDir = join(homedir(), '.claude', '.ccw-sessions');
const sessionStateFile = join(sessionStateDir, `session-${sessionId}.json`);
let existingState: { firstLoad: string; loadCount: number; lastPrompt?: string } | null = null;
if (existsSync(sessionStateFile)) {
try {
existingState = JSON.parse(readFileSync(sessionStateFile, 'utf-8'));
} catch {
existingState = null;
}
}
const isFirstPrompt = !existingState;
// Update session state
if (isFirstPrompt) {
sessionContextState.set(sessionId, {
firstLoad: new Date().toISOString(),
loadCount: 1,
lastPrompt: prompt
});
} else {
existingState.loadCount++;
existingState.lastPrompt = prompt;
// Update session state (file-based)
const newState = isFirstPrompt
? { firstLoad: new Date().toISOString(), loadCount: 1, lastPrompt: prompt }
: { ...existingState!, loadCount: existingState!.loadCount + 1, lastPrompt: prompt };
if (!existsSync(sessionStateDir)) {
mkdirSync(sessionStateDir, { recursive: true });
}
writeFileSync(sessionStateFile, JSON.stringify(newState, null, 2));
// Determine which type of context to return
let contextType: 'session-start' | 'context';
@@ -351,7 +345,7 @@ export async function handleHooksRoutes(ctx: RouteContext): Promise<boolean> {
success: true,
type: contextType,
isFirstPrompt,
loadCount: sessionContextState.get(sessionId)?.loadCount || 1,
loadCount: newState.loadCount,
content,
sessionId
};

File diff suppressed because it is too large


@@ -23,6 +23,8 @@ const i18n = {
'common.loading': 'Loading...',
'common.error': 'Error',
'common.success': 'Success',
'common.deleteSuccess': 'Deleted successfully',
'common.deleteFailed': 'Delete failed',
'common.retry': 'Retry',
'common.refresh': 'Refresh',
'common.minutes': 'minutes',
@@ -1345,17 +1347,64 @@ const i18n = {
'apiSettings.editEndpoint': 'Edit Endpoint',
'apiSettings.deleteEndpoint': 'Delete Endpoint',
'apiSettings.providerType': 'Provider Type',
'apiSettings.apiFormat': 'API Format',
'apiSettings.compatible': 'Compatible',
'apiSettings.customFormat': 'Custom Format',
'apiSettings.apiFormatHint': 'Most providers (DeepSeek, Ollama, etc.) use OpenAI-compatible format',
'apiSettings.displayName': 'Display Name',
'apiSettings.apiKey': 'API Key',
'apiSettings.apiBaseUrl': 'API Base URL',
'apiSettings.useEnvVar': 'Use environment variable',
'apiSettings.enableProvider': 'Enable provider',
'apiSettings.advancedSettings': 'Advanced Settings',
'apiSettings.basicInfo': 'Basic Info',
'apiSettings.endpointSettings': 'Endpoint Settings',
'apiSettings.timeout': 'Timeout (seconds)',
'apiSettings.seconds': 'seconds',
'apiSettings.timeoutHint': 'Request timeout in seconds (default: 300)',
'apiSettings.maxRetries': 'Max Retries',
'apiSettings.maxRetriesHint': 'Maximum retry attempts on failure',
'apiSettings.organization': 'Organization ID',
'apiSettings.organizationHint': 'OpenAI organization ID (org-...)',
'apiSettings.apiVersion': 'API Version',
'apiSettings.apiVersionHint': 'Azure API version (e.g., 2024-02-01)',
'apiSettings.rpm': 'RPM Limit',
'apiSettings.tpm': 'TPM Limit',
'apiSettings.unlimited': 'Unlimited',
'apiSettings.proxy': 'Proxy Server',
'apiSettings.proxyHint': 'HTTP proxy server URL',
'apiSettings.customHeaders': 'Custom Headers',
'apiSettings.customHeadersHint': 'JSON object with custom HTTP headers',
'apiSettings.invalidJsonHeaders': 'Invalid JSON in custom headers',
'apiSettings.searchProviders': 'Search providers...',
'apiSettings.selectProvider': 'Select a Provider',
'apiSettings.selectProviderHint': 'Select a provider from the list to view and manage its settings',
'apiSettings.noProvidersFound': 'No providers found',
'apiSettings.llmModels': 'LLM Models',
'apiSettings.embeddingModels': 'Embedding Models',
'apiSettings.manageModels': 'Manage',
'apiSettings.addModel': 'Add Model',
'apiSettings.multiKeySettings': 'Multi-Key Settings',
'apiSettings.noModels': 'No models configured',
'apiSettings.previewModel': 'Preview',
'apiSettings.modelSettings': 'Model Settings',
'apiSettings.deleteModel': 'Delete Model',
'apiSettings.providerUpdated': 'Provider updated',
'apiSettings.preview': 'Preview',
'apiSettings.used': 'used',
'apiSettings.total': 'total',
'apiSettings.testConnection': 'Test Connection',
'apiSettings.endpointId': 'Endpoint ID',
'apiSettings.endpointIdHint': 'Usage: ccw cli -p "..." --model <endpoint-id>',
'apiSettings.endpoints': 'Endpoints',
'apiSettings.addEndpointHint': 'Create custom endpoint aliases for CLI usage',
'apiSettings.endpointModel': 'Model',
'apiSettings.selectEndpoint': 'Select an endpoint',
'apiSettings.selectEndpointHint': 'Choose an endpoint from the list to view or edit its settings',
'apiSettings.provider': 'Provider',
'apiSettings.model': 'Model',
'apiSettings.selectModel': 'Select model',
'apiSettings.noModelsConfigured': 'No models configured for this provider',
'apiSettings.cacheStrategy': 'Cache Strategy',
'apiSettings.enableContextCaching': 'Enable Context Caching',
'apiSettings.cacheTTL': 'TTL (minutes)',
@@ -1386,6 +1435,82 @@ const i18n = {
'apiSettings.addProviderFirst': 'Please add a provider first',
'apiSettings.failedToLoad': 'Failed to load API settings',
'apiSettings.toggleVisibility': 'Toggle visibility',
'apiSettings.noProvidersHint': 'Add an API provider to get started',
'apiSettings.noEndpointsHint': 'Create custom endpoints for quick access to models',
'apiSettings.cache': 'Cache',
'apiSettings.off': 'Off',
'apiSettings.cacheUsage': 'Usage',
'apiSettings.cacheSize': 'Size',
'apiSettings.endpointsDescription': 'Manage custom API endpoints for quick model access',
'apiSettings.totalEndpoints': 'Total Endpoints',
'apiSettings.cachedEndpoints': 'Cached Endpoints',
'apiSettings.cacheTabHint': 'Configure global cache settings and view statistics in the main panel',
'apiSettings.cacheDescription': 'Manage response caching to improve performance and reduce costs',
'apiSettings.cachedEntries': 'Cached Entries',
'apiSettings.storageUsed': 'Storage Used',
'apiSettings.cacheActions': 'Cache Actions',
'apiSettings.cacheStatistics': 'Cache Statistics',
'apiSettings.globalCache': 'Global Cache',
// Multi-key management
'apiSettings.apiKeys': 'API Keys',
'apiSettings.addKey': 'Add Key',
'apiSettings.keyLabel': 'Label',
'apiSettings.keyValue': 'API Key',
'apiSettings.keyWeight': 'Weight',
'apiSettings.removeKey': 'Remove',
'apiSettings.noKeys': 'No API keys configured',
'apiSettings.primaryKey': 'Primary Key',
// Routing strategy
'apiSettings.routingStrategy': 'Routing Strategy',
'apiSettings.simpleShuffleRouting': 'Simple Shuffle (Random)',
'apiSettings.weightedRouting': 'Weighted Distribution',
'apiSettings.latencyRouting': 'Latency-Based',
'apiSettings.costRouting': 'Cost-Based',
'apiSettings.leastBusyRouting': 'Least Busy',
'apiSettings.routingHint': 'How to distribute requests across multiple API keys',
// Health check
'apiSettings.healthCheck': 'Health Check',
'apiSettings.enableHealthCheck': 'Enable Health Check',
'apiSettings.healthInterval': 'Check Interval (seconds)',
'apiSettings.healthCooldown': 'Cooldown (seconds)',
'apiSettings.failureThreshold': 'Failure Threshold',
'apiSettings.healthStatus': 'Status',
'apiSettings.healthy': 'Healthy',
'apiSettings.unhealthy': 'Unhealthy',
'apiSettings.unknown': 'Unknown',
'apiSettings.lastCheck': 'Last Check',
'apiSettings.testKey': 'Test Key',
'apiSettings.testingKey': 'Testing...',
'apiSettings.keyValid': 'Key is valid',
'apiSettings.keyInvalid': 'Key is invalid',
// Embedding models
'apiSettings.embeddingDimensions': 'Dimensions',
'apiSettings.embeddingMaxTokens': 'Max Tokens',
'apiSettings.selectEmbeddingModel': 'Select Embedding Model',
// Model modal
'apiSettings.addLlmModel': 'Add LLM Model',
'apiSettings.addEmbeddingModel': 'Add Embedding Model',
'apiSettings.modelId': 'Model ID',
'apiSettings.modelName': 'Display Name',
'apiSettings.modelSeries': 'Series',
'apiSettings.selectFromPresets': 'Select from Presets',
'apiSettings.customModel': 'Custom Model',
'apiSettings.capabilities': 'Capabilities',
'apiSettings.streaming': 'Streaming',
'apiSettings.functionCalling': 'Function Calling',
'apiSettings.vision': 'Vision',
'apiSettings.contextWindow': 'Context Window',
'apiSettings.description': 'Description',
'apiSettings.optional': 'Optional',
'apiSettings.modelIdExists': 'Model ID already exists',
'apiSettings.useModelTreeToManage': 'Use the model tree to manage individual models',
// Common
'common.cancel': 'Cancel',
@@ -1410,6 +1535,7 @@ const i18n = {
'common.saveFailed': 'Failed to save',
'common.unknownError': 'Unknown error',
'common.exception': 'Exception',
'common.status': 'Status',
// Core Memory
'title.coreMemory': 'Core Memory',
@@ -1537,6 +1663,8 @@ const i18n = {
'common.loading': '加载中...',
'common.error': '错误',
'common.success': '成功',
'common.deleteSuccess': '删除成功',
'common.deleteFailed': '删除失败',
'common.retry': '重试',
'common.refresh': '刷新',
'common.minutes': '分钟',
@@ -2869,17 +2997,64 @@ const i18n = {
'apiSettings.editEndpoint': '编辑端点',
'apiSettings.deleteEndpoint': '删除端点',
'apiSettings.providerType': '提供商类型',
'apiSettings.apiFormat': 'API 格式',
'apiSettings.compatible': '兼容',
'apiSettings.customFormat': '自定义格式',
'apiSettings.apiFormatHint': '大多数供应商(DeepSeek、Ollama 等)使用 OpenAI 兼容格式',
'apiSettings.displayName': '显示名称',
'apiSettings.apiKey': 'API 密钥',
'apiSettings.apiBaseUrl': 'API 基础 URL',
'apiSettings.useEnvVar': '使用环境变量',
'apiSettings.enableProvider': '启用提供商',
'apiSettings.advancedSettings': '高级设置',
'apiSettings.basicInfo': '基本信息',
'apiSettings.endpointSettings': '端点设置',
'apiSettings.timeout': '超时时间(秒)',
'apiSettings.seconds': '秒',
'apiSettings.timeoutHint': '请求超时时间(单位:秒,默认 300)',
'apiSettings.maxRetries': '最大重试次数',
'apiSettings.maxRetriesHint': '失败后最大重试次数',
'apiSettings.organization': '组织 ID',
'apiSettings.organizationHint': 'OpenAI 组织 ID(org-...)',
'apiSettings.apiVersion': 'API 版本',
'apiSettings.apiVersionHint': 'Azure API 版本(如 2024-02-01)',
'apiSettings.rpm': 'RPM 限制',
'apiSettings.tpm': 'TPM 限制',
'apiSettings.unlimited': '无限制',
'apiSettings.proxy': '代理服务器',
'apiSettings.proxyHint': 'HTTP 代理服务器 URL',
'apiSettings.customHeaders': '自定义请求头',
'apiSettings.customHeadersHint': '自定义 HTTP 请求头的 JSON 对象',
'apiSettings.invalidJsonHeaders': '自定义请求头 JSON 格式无效',
'apiSettings.searchProviders': '搜索供应商...',
'apiSettings.selectProvider': '选择供应商',
'apiSettings.selectProviderHint': '从列表中选择一个供应商来查看和管理其设置',
'apiSettings.noProvidersFound': '未找到供应商',
'apiSettings.llmModels': '大语言模型',
'apiSettings.embeddingModels': '向量模型',
'apiSettings.manageModels': '管理',
'apiSettings.addModel': '添加模型',
'apiSettings.multiKeySettings': '多密钥设置',
'apiSettings.noModels': '暂无模型配置',
'apiSettings.previewModel': '预览',
'apiSettings.modelSettings': '模型设置',
'apiSettings.deleteModel': '删除模型',
'apiSettings.providerUpdated': '供应商已更新',
'apiSettings.preview': '预览',
'apiSettings.used': '已使用',
'apiSettings.total': '总计',
'apiSettings.testConnection': '测试连接',
'apiSettings.endpointId': '端点 ID',
'apiSettings.endpointIdHint': '用法: ccw cli -p "..." --model <端点ID>',
'apiSettings.endpoints': '端点',
'apiSettings.addEndpointHint': '创建用于 CLI 的自定义端点别名',
'apiSettings.endpointModel': '模型',
'apiSettings.selectEndpoint': '选择端点',
'apiSettings.selectEndpointHint': '从列表中选择一个端点以查看或编辑其设置',
'apiSettings.provider': '提供商',
'apiSettings.model': '模型',
'apiSettings.selectModel': '选择模型',
'apiSettings.noModelsConfigured': '该供应商未配置模型',
'apiSettings.cacheStrategy': '缓存策略',
'apiSettings.enableContextCaching': '启用上下文缓存',
'apiSettings.cacheTTL': 'TTL (分钟)',
@@ -2910,6 +3085,82 @@ const i18n = {
'apiSettings.addProviderFirst': '请先添加提供商',
'apiSettings.failedToLoad': '加载 API 设置失败',
'apiSettings.toggleVisibility': '切换可见性',
'apiSettings.noProvidersHint': '添加 API 提供商以开始使用',
'apiSettings.noEndpointsHint': '创建自定义端点以快速访问模型',
'apiSettings.cache': '缓存',
'apiSettings.off': '关闭',
'apiSettings.cacheUsage': '使用率',
'apiSettings.cacheSize': '大小',
'apiSettings.endpointsDescription': '管理自定义 API 端点以快速访问模型',
'apiSettings.totalEndpoints': '总端点数',
'apiSettings.cachedEndpoints': '缓存端点数',
'apiSettings.cacheTabHint': '在主面板中配置全局缓存设置并查看统计信息',
'apiSettings.cacheDescription': '管理响应缓存以提高性能并降低成本',
'apiSettings.cachedEntries': '缓存条目',
'apiSettings.storageUsed': '已用存储',
'apiSettings.cacheActions': '缓存操作',
'apiSettings.cacheStatistics': '缓存统计',
'apiSettings.globalCache': '全局缓存',
// Multi-key management
'apiSettings.apiKeys': 'API 密钥',
'apiSettings.addKey': '添加密钥',
'apiSettings.keyLabel': '标签',
'apiSettings.keyValue': 'API 密钥',
'apiSettings.keyWeight': '权重',
'apiSettings.removeKey': '移除',
'apiSettings.noKeys': '未配置 API 密钥',
'apiSettings.primaryKey': '主密钥',
// Routing strategy
'apiSettings.routingStrategy': '路由策略',
'apiSettings.simpleShuffleRouting': '简单随机',
'apiSettings.weightedRouting': '权重分配',
'apiSettings.latencyRouting': '延迟优先',
'apiSettings.costRouting': '成本优先',
'apiSettings.leastBusyRouting': '最少并发',
'apiSettings.routingHint': '如何在多个 API 密钥间分配请求',
// Health check
'apiSettings.healthCheck': '健康检查',
'apiSettings.enableHealthCheck': '启用健康检查',
'apiSettings.healthInterval': '检查间隔(秒)',
'apiSettings.healthCooldown': '冷却时间(秒)',
'apiSettings.failureThreshold': '失败阈值',
'apiSettings.healthStatus': '状态',
'apiSettings.healthy': '健康',
'apiSettings.unhealthy': '异常',
'apiSettings.unknown': '未知',
'apiSettings.lastCheck': '最后检查',
'apiSettings.testKey': '测试密钥',
'apiSettings.testingKey': '测试中...',
'apiSettings.keyValid': '密钥有效',
'apiSettings.keyInvalid': '密钥无效',
// Embedding models
'apiSettings.embeddingDimensions': '向量维度',
'apiSettings.embeddingMaxTokens': '最大 Token',
'apiSettings.selectEmbeddingModel': '选择嵌入模型',
// Model modal
'apiSettings.addLlmModel': '添加 LLM 模型',
'apiSettings.addEmbeddingModel': '添加嵌入模型',
'apiSettings.modelId': '模型 ID',
'apiSettings.modelName': '显示名称',
'apiSettings.modelSeries': '模型系列',
'apiSettings.selectFromPresets': '从预设选择',
'apiSettings.customModel': '自定义模型',
'apiSettings.capabilities': '能力',
'apiSettings.streaming': '流式输出',
'apiSettings.functionCalling': '函数调用',
'apiSettings.vision': '视觉能力',
'apiSettings.contextWindow': '上下文窗口',
'apiSettings.description': '描述',
'apiSettings.optional': '可选',
'apiSettings.modelIdExists': '模型 ID 已存在',
'apiSettings.useModelTreeToManage': '使用模型树管理各个模型',
// Common
'common.cancel': '取消',
@@ -2934,6 +3185,7 @@ const i18n = {
'common.saveFailed': '保存失败',
'common.unknownError': '未知错误',
'common.exception': '异常',
'common.status': '状态',
// Core Memory
'title.coreMemory': '核心记忆',

File diff suppressed because it is too large


@@ -810,8 +810,8 @@ function buildManualDownloadGuide() {
'<i data-lucide="info" class="w-3.5 h-3.5 mt-0.5 flex-shrink-0"></i>' +
'<div>' +
'<strong>' + (t('codexlens.cacheLocation') || 'Cache Location') + ':</strong><br>' +
'<code class="text-xs">Windows: %LOCALAPPDATA%\\Temp\\fastembed_cache</code><br>' +
'<code class="text-xs">Linux/Mac: ~/.cache/fastembed</code>' +
'<code class="text-xs">Default: ~/.cache/huggingface</code><br>' +
'<code class="text-xs text-muted-foreground">(Check HF_HOME env var if set)</code>' +
'</div>' +
'</div>' +
'</div>' +
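
The corrected hint above points at the Hugging Face cache default. As a minimal sketch of how that location could be resolved in code (the helper name `resolveHfCacheDir` is illustrative, not part of CCW):

```typescript
import * as os from 'os';
import * as path from 'path';

// HF_HOME, when set, overrides the default ~/.cache/huggingface directory.
function resolveHfCacheDir(env: NodeJS.ProcessEnv = process.env): string {
  return env.HF_HOME ?? path.join(os.homedir(), '.cache', 'huggingface');
}
```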


@@ -67,7 +67,7 @@ const ParamsSchema = z.object({
model: z.string().optional(),
cd: z.string().optional(),
includeDirs: z.string().optional(),
timeout: z.number().default(300000),
timeout: z.number().default(0), // 0 = no internal timeout, controlled by external caller (e.g., bash timeout)
resume: z.union([z.boolean(), z.string()]).optional(), // true = last, string = single ID or comma-separated IDs
id: z.string().optional(), // Custom execution ID (e.g., IMPL-001-step1)
noNative: z.boolean().optional(), // Force prompt concatenation instead of native resume
@@ -1058,19 +1058,24 @@ async function executeCliTool(
reject(new Error(`Failed to spawn ${tool}: ${error.message}`));
});
// Timeout handling
const timeoutId = setTimeout(() => {
timedOut = true;
child.kill('SIGTERM');
setTimeout(() => {
if (!child.killed) {
child.kill('SIGKILL');
}
}, 5000);
}, timeout);
// Timeout handling (timeout=0 disables internal timeout, controlled by external caller)
let timeoutId: NodeJS.Timeout | null = null;
if (timeout > 0) {
timeoutId = setTimeout(() => {
timedOut = true;
child.kill('SIGTERM');
setTimeout(() => {
if (!child.killed) {
child.kill('SIGKILL');
}
}, 5000);
}, timeout);
}
child.on('close', () => {
clearTimeout(timeoutId);
if (timeoutId) {
clearTimeout(timeoutId);
}
});
});
}
@@ -1115,8 +1120,8 @@ Modes:
},
timeout: {
type: 'number',
description: 'Timeout in milliseconds (default: 300000 = 5 minutes)',
default: 300000
description: 'Timeout in milliseconds (default: 0 = disabled, controlled by external caller)',
default: 0
}
},
required: ['tool', 'prompt']
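
The timeout change above reduces to a small guard: `timeout <= 0` means no internal timer is armed, so an external caller (e.g. `timeout 600 ccw cli ...` in bash) owns the deadline. A sketch under that assumption (`scheduleKill`/`cancelKill` are illustrative names, not the executor's actual helpers):

```typescript
// timeout <= 0 disables the internal timer entirely; the external
// caller is then responsible for killing a runaway process.
function scheduleKill(timeoutMs: number, kill: () => void): NodeJS.Timeout | null {
  if (timeoutMs <= 0) return null;
  return setTimeout(kill, timeoutMs);
}

// Cleanup mirrors the diff: clearTimeout only when a timer was armed.
function cancelKill(id: NodeJS.Timeout | null): void {
  if (id) clearTimeout(id);
}
```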


@@ -6,17 +6,184 @@
*/
/**
* Supported LLM provider types
* API format types (simplified)
* Most providers use OpenAI-compatible format
*/
export type ProviderType =
| 'openai'
| 'anthropic'
| 'ollama'
| 'azure'
| 'google'
| 'mistral'
| 'deepseek'
| 'custom';
| 'openai' // OpenAI-compatible format (most providers)
| 'anthropic' // Anthropic format
| 'custom'; // Custom format
/**
* Advanced provider settings for LiteLLM compatibility
* Maps to LiteLLM's provider configuration options
*/
export interface ProviderAdvancedSettings {
/** Request timeout in seconds (default: 300) */
timeout?: number;
/** Maximum retry attempts on failure (default: 3) */
maxRetries?: number;
/** Organization ID (OpenAI-specific) */
organization?: string;
/** API version string (Azure-specific, e.g., "2024-02-01") */
apiVersion?: string;
/** Custom HTTP headers as JSON object */
customHeaders?: Record<string, string>;
/** Requests per minute rate limit */
rpm?: number;
/** Tokens per minute rate limit */
tpm?: number;
/** Proxy server URL (e.g., "http://proxy.example.com:8080") */
proxy?: string;
}
/**
* Model type classification
*/
export type ModelType = 'llm' | 'embedding';
/**
* Model capability metadata
*/
export interface ModelCapabilities {
/** Whether the model supports streaming responses */
streaming?: boolean;
/** Whether the model supports function/tool calling */
functionCalling?: boolean;
/** Whether the model supports vision/image input */
vision?: boolean;
/** Context window size in tokens */
contextWindow?: number;
/** Embedding dimension (for embedding models only) */
embeddingDimension?: number;
/** Maximum output tokens */
maxOutputTokens?: number;
}
/**
* Routing strategy for load balancing across multiple keys
*/
export type RoutingStrategy =
| 'simple-shuffle' // Random selection (default, recommended)
| 'weighted' // Weight-based distribution
| 'latency-based' // Route to lowest latency
| 'cost-based' // Route to lowest cost
| 'least-busy'; // Route to least concurrent
/**
* Individual API key configuration with optional weight
*/
export interface ApiKeyEntry {
/** Unique identifier */
id: string;
/** API key value or env var reference */
key: string;
/** Display label for this key */
label?: string;
/** Weight for weighted routing (default: 1) */
weight?: number;
/** Whether this key is enabled */
enabled: boolean;
/** Last health check status */
healthStatus?: 'healthy' | 'unhealthy' | 'unknown';
/** Last health check timestamp */
lastHealthCheck?: string;
/** Error message if unhealthy */
lastError?: string;
}
/**
* Health check configuration
*/
export interface HealthCheckConfig {
/** Enable automatic health checks */
enabled: boolean;
/** Check interval in seconds (default: 300) */
intervalSeconds: number;
/** Cooldown period after failure in seconds (default: 5) */
cooldownSeconds: number;
/** Number of failures before marking unhealthy (default: 3) */
failureThreshold: number;
}
/**
* Model-specific endpoint settings
* Allows per-model configuration overrides
*/
export interface ModelEndpointSettings {
/** Override base URL for this model */
baseUrl?: string;
/** Override timeout for this model */
timeout?: number;
/** Override max retries for this model */
maxRetries?: number;
/** Custom headers for this model */
customHeaders?: Record<string, string>;
/** Cache strategy for this model */
cacheStrategy?: CacheStrategy;
}
/**
* Model definition with type and grouping
*/
export interface ModelDefinition {
/** Unique identifier for this model */
id: string;
/** Display name for UI */
name: string;
/** Model type: LLM or Embedding */
type: ModelType;
/** Model series for grouping (e.g., "GPT-4", "Claude-3") */
series: string;
/** Whether this model is enabled */
enabled: boolean;
/** Model capabilities */
capabilities?: ModelCapabilities;
/** Model-specific endpoint settings */
endpointSettings?: ModelEndpointSettings;
/** Optional description */
description?: string;
/** Creation timestamp (ISO 8601) */
createdAt: string;
/** Last update timestamp (ISO 8601) */
updatedAt: string;
}
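
For illustration, a value conforming to `ModelDefinition` might look like the following. The interface is re-declared here in trimmed form so the sketch stands alone, and the model id and capability numbers are hypothetical, not shipped presets:

```typescript
// Trimmed copy of the ModelDefinition shape above, for a self-contained sketch.
interface ModelDefinitionSketch {
  id: string;
  name: string;
  type: 'llm' | 'embedding';
  series: string;
  enabled: boolean;
  capabilities?: { streaming?: boolean; functionCalling?: boolean; contextWindow?: number };
  createdAt: string;
  updatedAt: string;
}

const exampleModel: ModelDefinitionSketch = {
  id: 'gpt-4o-mini',   // hypothetical model id
  name: 'GPT-4o Mini',
  type: 'llm',
  series: 'GPT-4o',    // series groups models in the UI model tree
  enabled: true,
  capabilities: { streaming: true, functionCalling: true, contextWindow: 128000 },
  createdAt: new Date().toISOString(),
  updatedAt: new Date().toISOString(),
};
```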
/**
* Provider credential configuration
@@ -41,6 +208,24 @@ export interface ProviderCredential {
/** Whether this provider is enabled */
enabled: boolean;
/** Advanced provider settings (optional) */
advancedSettings?: ProviderAdvancedSettings;
/** Multiple API keys for load balancing */
apiKeys?: ApiKeyEntry[];
/** Routing strategy for multi-key load balancing */
routingStrategy?: RoutingStrategy;
/** Health check configuration */
healthCheck?: HealthCheckConfig;
/** LLM models configured for this provider */
llmModels?: ModelDefinition[];
/** Embedding models configured for this provider */
embeddingModels?: ModelDefinition[];
/** Creation timestamp (ISO 8601) */
createdAt: string;