# LiteLLM Integration Guide

## Overview

CCW now supports custom LiteLLM endpoints with integrated context caching. You can configure multiple providers (OpenAI, Anthropic, Ollama, etc.) and create custom endpoints with file-based caching strategies.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        CLI Executor                         │
│                                                             │
│  ┌─────────────┐         ┌──────────────────────────────┐   │
│  │  --model    │────────>│ Route Decision:              │   │
│  │  flag       │         │ - gemini/qwen/codex → CLI    │   │
│  └─────────────┘         │ - custom ID → LiteLLM        │   │
│                          └──────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      LiteLLM Executor                       │
│                                                             │
│  1. Load endpoint config (litellm-api-config.json)          │
│  2. Extract @patterns from prompt                           │
│  3. Pack files via context-cache                            │
│  4. Call LiteLLM client with cached content + prompt        │
│  5. Return result                                           │
└─────────────────────────────────────────────────────────────┘
```
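The route decision can be summarized in code. The following TypeScript sketch is illustrative only, not CCW's actual implementation; the `Route` type and `routeModelFlag` function are invented names for demonstration.

```typescript
// Illustrative sketch only: CCW's real routing code is not shown in this guide.
// Built-in CLI tools are assumed to be identified by well-known names;
// any other value is treated as a custom LiteLLM endpoint ID.
type Route = { kind: 'cli'; tool: string } | { kind: 'litellm'; endpointId: string };

const CLI_TOOLS = new Set(['gemini', 'qwen', 'codex']);

function routeModelFlag(model: string): Route {
  if (CLI_TOOLS.has(model)) {
    return { kind: 'cli', tool: model };
  }
  // Anything else is looked up as an endpoint ID in
  // litellm-api-config.json (e.g. "my-gpt4o").
  return { kind: 'litellm', endpointId: model };
}
```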
## Configuration

### File Location

Configuration is stored per-project:

```
<project>/.ccw/storage/config/litellm-api-config.json
```

### Configuration Structure

```json
{
  "version": 1,
  "providers": [
    {
      "id": "openai-1234567890",
      "name": "My OpenAI",
      "type": "openai",
      "apiKey": "${OPENAI_API_KEY}",
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "endpoints": [
    {
      "id": "my-gpt4o",
      "name": "GPT-4o with Context Cache",
      "providerId": "openai-1234567890",
      "model": "gpt-4o",
      "description": "GPT-4o with automatic file caching",
      "cacheStrategy": {
        "enabled": true,
        "ttlMinutes": 60,
        "maxSizeKB": 512,
        "filePatterns": ["*.md", "*.ts", "*.js"]
      },
      "enabled": true,
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "defaultEndpoint": "my-gpt4o",
  "globalCacheSettings": {
    "enabled": true,
    "cacheDir": "~/.ccw/cache/context",
    "maxTotalSizeMB": 100
  }
}
```
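For reference, the shape of this file can be expressed as TypeScript types. The following is inferred from the sample above; `ProviderCredential` and `CustomEndpoint` are the names used in the API Reference below, but the exact definitions shipped with CCW may differ in detail.

```typescript
// Shape inferred from the sample config above; CCW's actual type
// definitions may differ.
interface ProviderCredential {
  id: string;
  name: string;
  type: string;        // e.g. "openai", "anthropic", "ollama"
  apiKey: string;      // supports "${ENV_VAR}" references
  enabled: boolean;
  createdAt: string;   // ISO 8601 timestamp
  updatedAt: string;
}

interface CustomEndpoint {
  id: string;
  name: string;
  providerId: string;            // must match a provider "id"
  model: string;                 // LiteLLM model name, e.g. "gpt-4o"
  description?: string;
  cacheStrategy: CacheStrategy;  // defined under "Context Caching" below
  enabled: boolean;
  createdAt: string;
  updatedAt: string;
}

interface LiteLLMApiConfig {
  version: number;
  providers: ProviderCredential[];
  endpoints: CustomEndpoint[];
  defaultEndpoint: string;       // endpoint "id" used when none is specified
  globalCacheSettings: {
    enabled: boolean;
    cacheDir: string;
    maxTotalSizeMB: number;
  };
}
```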
## Usage

### Via CLI

```bash
# Use custom endpoint with --model flag
ccw cli -p "Analyze authentication flow" --tool litellm --model my-gpt4o

# With context patterns (automatically cached)
ccw cli -p "@src/auth/**/*.ts Review security" --tool litellm --model my-gpt4o

# Disable caching for a specific call
ccw cli -p "Quick question" --tool litellm --model my-gpt4o --no-cache
```

### Via Dashboard API

#### Create Provider

```bash
curl -X POST http://localhost:3000/api/litellm-api/providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My OpenAI",
    "type": "openai",
    "apiKey": "${OPENAI_API_KEY}",
    "enabled": true
  }'
```

#### Create Endpoint

```bash
curl -X POST http://localhost:3000/api/litellm-api/endpoints \
  -H "Content-Type: application/json" \
  -d '{
    "id": "my-gpt4o",
    "name": "GPT-4o with Cache",
    "providerId": "openai-1234567890",
    "model": "gpt-4o",
    "cacheStrategy": {
      "enabled": true,
      "ttlMinutes": 60,
      "maxSizeKB": 512,
      "filePatterns": ["*.md", "*.ts"]
    },
    "enabled": true
  }'
```

#### Test Provider Connection

```bash
curl -X POST http://localhost:3000/api/litellm-api/providers/openai-1234567890/test
```
## Context Caching

### How It Works

1. **Pattern Detection**: The LiteLLM executor scans the prompt for `@patterns` (a sketch of this step follows the list):
   ```
   @src/**/*.ts
   @CLAUDE.md
   @../shared/**/*
   ```

2. **File Packing**: Files matching the patterns are packed via the `context-cache` tool
   - Respects the `max_file_size` limit (default: 1MB per file)
   - Applies the TTL from the endpoint config
   - Generates a session ID for retrieval

3. **Cache Integration**: Cached content is prepended to the prompt
   ```
   <cached files>
   ---
   <original prompt>
   ```

4. **LLM Call**: The combined prompt is sent to LiteLLM with the provider credentials
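To make steps 1 and 3 concrete, here is a minimal TypeScript sketch of pattern extraction and prompt assembly. `extractPatterns` is listed in the API Reference below, but its body is not shown in this guide, so the matching rule used here is an assumption; `assemblePrompt` is a hypothetical helper.

```typescript
// Sketch only: the real extractPatterns() in litellm-executor.ts may use
// different matching rules. This regex is an assumption for demonstration.
function extractPatterns(prompt: string): string[] {
  // Match "@" followed by a path-like token without whitespace,
  // e.g. @src/**/*.ts, @CLAUDE.md, @../shared/**/*
  const matches = prompt.match(/@[^\s]+/g) ?? [];
  return matches.map((m) => m.slice(1)); // strip the leading "@" (assumed)
}

// Hypothetical helper: cached file content is prepended to the original
// prompt, separated by "---" as shown in step 3 above.
function assemblePrompt(cachedContent: string, prompt: string): string {
  return `${cachedContent}\n---\n${prompt}`;
}
```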
### Cache Strategy Configuration

```typescript
interface CacheStrategy {
  enabled: boolean;       // Enable/disable caching for this endpoint
  ttlMinutes: number;     // Cache lifetime (default: 60)
  maxSizeKB: number;      // Max cache size (default: 512KB)
  filePatterns: string[]; // Glob patterns to cache
}
```
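If an endpoint omits some of these fields, the defaults noted in the comments could be applied as below. This is a sketch under that assumption, not CCW's actual default handling; `withCacheDefaults` is a hypothetical helper using the `CacheStrategy` interface above.

```typescript
// Hypothetical: fill in the documented defaults for a partially
// specified cache strategy (ttlMinutes: 60, maxSizeKB: 512).
function withCacheDefaults(partial: Partial<CacheStrategy>): CacheStrategy {
  return {
    enabled: partial.enabled ?? true,
    ttlMinutes: partial.ttlMinutes ?? 60,
    maxSizeKB: partial.maxSizeKB ?? 512,
    filePatterns: partial.filePatterns ?? [],
  };
}
```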
### Example: Security Audit with Cache

```bash
ccw cli -p "
PURPOSE: OWASP Top 10 security audit of authentication module
TASK: • Check SQL injection • Verify session management • Test XSS vectors
CONTEXT: @src/auth/**/*.ts @src/middleware/auth.ts
EXPECTED: Security report with severity levels and remediation steps
" --tool litellm --model my-security-scanner --mode analysis
```

**What happens:**

1. The executor detects `@src/auth/**/*.ts` and `@src/middleware/auth.ts`
2. Packs the matching files into the context cache
3. The cache entry stays valid for 60 minutes (per the endpoint config)
4. Subsequent calls reuse the cached files (no re-packing)
5. LiteLLM receives the full context without manual file specification
## Environment Variables

### Provider API Keys

LiteLLM uses standard environment variable names:

| Provider  | Env Var Name        |
|-----------|---------------------|
| OpenAI    | `OPENAI_API_KEY`    |
| Anthropic | `ANTHROPIC_API_KEY` |
| Google    | `GOOGLE_API_KEY`    |
| Azure     | `AZURE_API_KEY`     |
| Mistral   | `MISTRAL_API_KEY`   |
| DeepSeek  | `DEEPSEEK_API_KEY`  |

### Configuration Syntax

Use `${ENV_VAR}` syntax in the config:

```json
{
  "apiKey": "${OPENAI_API_KEY}"
}
```

The executor resolves these at runtime via `resolveEnvVar()`.
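A minimal sketch of what `${ENV_VAR}` substitution might look like; the actual `resolveEnvVar()` implementation is not shown in this guide and may handle additional cases (defaults, escaping, validation).

```typescript
// Sketch only, assuming a simple "${NAME}" -> process.env.NAME substitution.
function resolveEnvVar(value: string): string {
  return value.replace(/\$\{([A-Za-z0-9_]+)\}/g, (_, name) => {
    const resolved = process.env[name];
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}

// resolveEnvVar("${OPENAI_API_KEY}") -> "sk-..." (when the variable is set)
```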
## API Reference

### Config Manager (`litellm-api-config-manager.ts`)

#### Provider Management

```typescript
getAllProviders(baseDir: string): ProviderCredential[]
getProvider(baseDir: string, providerId: string): ProviderCredential | null
getProviderWithResolvedEnvVars(baseDir: string, providerId: string): (ProviderCredential & { resolvedApiKey: string }) | null
addProvider(baseDir: string, providerData): ProviderCredential
updateProvider(baseDir: string, providerId: string, updates): ProviderCredential
deleteProvider(baseDir: string, providerId: string): boolean
```

#### Endpoint Management

```typescript
getAllEndpoints(baseDir: string): CustomEndpoint[]
getEndpoint(baseDir: string, endpointId: string): CustomEndpoint | null
findEndpointById(baseDir: string, endpointId: string): CustomEndpoint | null
addEndpoint(baseDir: string, endpointData): CustomEndpoint
updateEndpoint(baseDir: string, endpointId: string, updates): CustomEndpoint
deleteEndpoint(baseDir: string, endpointId: string): boolean
```
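A hypothetical usage sketch of the config manager functions. The `providerData`/`endpointData` parameters are untyped in the signatures above, so the fields passed here are assumed to mirror the configuration sample; the import path is also an assumption.

```typescript
import {
  addProvider,
  addEndpoint,
  getAllEndpoints,
} from './litellm-api-config-manager'; // import path assumed

const baseDir = '/path/to/project';

// Fields assumed to match the configuration sample shown earlier.
const provider = addProvider(baseDir, {
  name: 'My OpenAI',
  type: 'openai',
  apiKey: '${OPENAI_API_KEY}',
  enabled: true,
});

addEndpoint(baseDir, {
  id: 'my-gpt4o',
  name: 'GPT-4o with Cache',
  providerId: provider.id,
  model: 'gpt-4o',
  cacheStrategy: { enabled: true, ttlMinutes: 60, maxSizeKB: 512, filePatterns: ['*.md', '*.ts'] },
  enabled: true,
});

console.log(getAllEndpoints(baseDir).map((e) => e.id)); // ["my-gpt4o", ...]
```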
### Executor (`litellm-executor.ts`)

```typescript
interface LiteLLMExecutionOptions {
  prompt: string;
  endpointId: string;
  baseDir: string;
  cwd?: string;
  includeDirs?: string[];
  enableCache?: boolean;
  onOutput?: (data: { type: string; data: string }) => void;
}

interface LiteLLMExecutionResult {
  success: boolean;
  output: string;
  model: string;
  provider: string;
  cacheUsed: boolean;
  cachedFiles?: string[];
  error?: string;
}

executeLiteLLMEndpoint(options: LiteLLMExecutionOptions): Promise<LiteLLMExecutionResult>
extractPatterns(prompt: string): string[]
```
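Putting the interfaces together, a usage sketch based on the signatures above (the import path and prompt are illustrative):

```typescript
import { executeLiteLLMEndpoint } from './litellm-executor'; // path assumed

async function runReview(): Promise<void> {
  const result = await executeLiteLLMEndpoint({
    prompt: '@src/auth/**/*.ts Review session handling',
    endpointId: 'my-gpt4o',
    baseDir: '/path/to/project',
    enableCache: true,
    // Stream intermediate output as it arrives.
    onOutput: ({ type, data }) => process.stdout.write(`[${type}] ${data}`),
  });

  if (result.success) {
    console.log(result.output);                        // model response
    console.log('cached files:', result.cachedFiles ?? []);
  } else {
    console.error(`${result.provider}/${result.model} failed: ${result.error}`);
  }
}
```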
## Dashboard Integration

The dashboard provides UI for managing LiteLLM configuration:

- **Providers**: Add/edit/delete provider credentials
- **Endpoints**: Configure custom endpoints with cache strategies
- **Cache Stats**: View cache usage and clear entries
- **Test Connections**: Verify provider API access

Routes are handled by `litellm-api-routes.ts`.
## Limitations

1. **Python Dependency**: Requires the `ccw-litellm` Python package to be installed
2. **Model Support**: Limited to models supported by the LiteLLM library
3. **Cache Scope**: The context cache is in-memory (not persisted across restarts)
4. **Pattern Syntax**: Only glob-style `@patterns` are supported, not regex
## Troubleshooting

### Error: "Endpoint not found"

- Verify the endpoint ID matches an entry in the config file
- Check that `litellm-api-config.json` exists in `.ccw/storage/config/`

### Error: "API key not configured"

- Ensure the environment variable is set
- Verify the `${ENV_VAR}` syntax in the config
- Test with `echo $OPENAI_API_KEY`

### Error: "Failed to spawn Python process"

- Install ccw-litellm: `pip install ccw-litellm`
- Verify Python is accessible: `python --version`

### Cache Not Applied

- Check that the endpoint has `cacheStrategy.enabled: true`
- Verify the prompt contains `@patterns`
- Check that the cache TTL hasn't expired
## Examples

See `examples/litellm-config.json` for a complete configuration template.