Mirror of https://github.com/catlog22/Claude-Code-Workflow.git (synced 2026-02-12 02:37:45 +08:00)
feat: Add unified LiteLLM API management with dashboard UI and CLI integration

- Create ccw-litellm Python package with AbstractEmbedder and AbstractLLMClient interfaces
- Add BaseEmbedder abstraction and factory pattern to codex-lens for pluggable backends
- Implement API Settings dashboard page for provider credentials and custom endpoints
- Add REST API routes for CRUD operations on providers and endpoints
- Extend CLI with --model parameter for custom endpoint routing
- Integrate existing context-cache for @pattern file resolution
- Add provider model registry with predefined models per provider type
- Include i18n translations (en/zh) for all new UI elements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ccw-litellm/README.md (new file, 180 lines)
@@ -0,0 +1,180 @@
# ccw-litellm

Unified LiteLLM interface layer shared by ccw and codex-lens projects.

## Features

- **Unified LLM Interface**: Abstract interface for LLM operations (chat, completion)
- **Unified Embedding Interface**: Abstract interface for text embeddings
- **Multi-Provider Support**: OpenAI, Anthropic, Azure, and more via LiteLLM
- **Configuration Management**: YAML-based configuration with environment variable substitution
- **Type Safety**: Full type annotations with Pydantic models

## Installation

```bash
pip install -e .
```

## Quick Start

### Configuration

Create a configuration file at `~/.ccw/config/litellm-config.yaml`:

```yaml
version: 1
default_provider: openai

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1

llm_models:
  default:
    provider: openai
    model: gpt-4

embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
```

### Usage

#### LLM Client

```python
from ccw_litellm import LiteLLMClient, ChatMessage

# Initialize client with default model
client = LiteLLMClient(model="default")

# Chat completion
messages = [
    ChatMessage(role="user", content="Hello, how are you?")
]
response = client.chat(messages)
print(response.content)

# Text completion
response = client.complete("Once upon a time")
print(response.content)
```

#### Embedder

```python
from ccw_litellm import LiteLLMEmbedder

# Initialize embedder with default model
embedder = LiteLLMEmbedder(model="default")

# Embed single text
vector = embedder.embed("Hello world")
print(vector.shape)  # (1, 1536)

# Embed multiple texts
vectors = embedder.embed(["Text 1", "Text 2", "Text 3"])
print(vectors.shape)  # (3, 1536)
```
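
Two edge cases are handled in `litellm_embedder.py`: a single string is treated as a batch of one (the result is always 2-D), and an empty input short-circuits without calling the API. A small sketch:

```python
import numpy as np
from ccw_litellm import LiteLLMEmbedder

embedder = LiteLLMEmbedder(model="default")

# Empty input returns immediately with shape (0, dimensions); no API call is made.
empty = embedder.embed([])
print(empty.shape)                 # (0, 1536)
print(empty.dtype == np.float32)   # True
```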

#### Custom Configuration

```python
from ccw_litellm import LiteLLMClient, load_config

# Load custom configuration
config = load_config("/path/to/custom-config.yaml")

# Use custom configuration
client = LiteLLMClient(model="fast", config=config)
```

## Configuration Reference

### Provider Configuration

```yaml
providers:
  <provider_name>:
    api_key: <api_key_or_${ENV_VAR}>
    api_base: <base_url>
```

Supported providers: `openai`, `anthropic`, `azure`, `vertex_ai`, `bedrock`, etc.

### LLM Model Configuration

```yaml
llm_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
```

### Embedding Model Configuration

```yaml
embedding_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
    dimensions: <embedding_dimensions>
```
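
These sections can also be built programmatically with the exported Pydantic models; a minimal sketch with illustrative values:

```python
from ccw_litellm import (
    EmbeddingModelConfig,
    LiteLLMConfig,
    LLMModelConfig,
    ProviderConfig,
)

config = LiteLLMConfig(
    version=1,
    default_provider="openai",
    providers={
        "openai": ProviderConfig(api_key="${OPENAI_API_KEY}"),
    },
    llm_models={
        "default": LLMModelConfig(provider="openai", model="gpt-4o"),
    },
    embedding_models={
        "default": EmbeddingModelConfig(
            provider="openai", model="text-embedding-3-small", dimensions=1536
        ),
    },
)
```

Note that `${...}` placeholders are only expanded by `load_config`; values passed directly to the models are used as-is.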

## Environment Variables

The configuration supports environment variable substitution using the `${VAR}` or `${VAR:-default}` syntax:

```yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}                               # Required
    api_base: ${OPENAI_API_BASE:-https://api.openai.com/v1}  # With default
```
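
Substitution is implemented by the private helper `_substitute_env_vars` in `config/loader.py`. The sketch below replicates its regex so you can check how a value resolves; it is illustrative only, not public API:

```python
import os
import re

# Same pattern as config/loader.py: ${VAR} or ${VAR:-default}
PATTERN = r"\$\{([^:}]+)(?::-(.*?))?\}"

def substitute(text: str) -> str:
    def replace_var(match: re.Match) -> str:
        default = match.group(2) if match.group(2) is not None else ""
        return os.environ.get(match.group(1), default)
    return re.sub(PATTERN, replace_var, text)

os.environ["OPENAI_API_KEY"] = "sk-test"
print(substitute("${OPENAI_API_KEY}"))                              # sk-test
print(substitute("${OPENAI_API_BASE:-https://api.openai.com/v1}"))  # https://api.openai.com/v1
```

Unset variables without a default resolve to an empty string rather than raising.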

## API Reference

### Interfaces

- `AbstractLLMClient`: Abstract base class for LLM clients
- `AbstractEmbedder`: Abstract base class for embedders
- `ChatMessage`: Message data class (role, content)
- `LLMResponse`: Response data class (content, raw)
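
These are plain ABCs: a custom backend only needs to implement the synchronous methods, and the async wrappers (`achat`, `acomplete`, `aembed`) are inherited for free. A minimal sketch, adapted from the package's own tests:

```python
from typing import Any, Sequence

from ccw_litellm import AbstractLLMClient, ChatMessage, LLMResponse


class EchoClient(AbstractLLMClient):
    """Toy backend that echoes the concatenated message contents."""

    def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
        return LLMResponse(content="".join(m.content for m in messages))

    def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
        return LLMResponse(content=prompt)


client = EchoClient()
print(client.chat([ChatMessage(role="user", content="hi")]).content)  # hi
```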

### Implementations

- `LiteLLMClient`: LiteLLM implementation of AbstractLLMClient
- `LiteLLMEmbedder`: LiteLLM implementation of AbstractEmbedder

### Configuration

- `LiteLLMConfig`: Root configuration model
- `ProviderConfig`: Provider configuration model
- `LLMModelConfig`: LLM model configuration model
- `EmbeddingModelConfig`: Embedding model configuration model
- `load_config(path)`: Load configuration from YAML file
- `get_config(path, reload)`: Get global configuration singleton
- `reset_config()`: Reset global configuration (for testing)
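
The singleton helpers are the intended pattern for long-lived processes and tests; a short sketch:

```python
from ccw_litellm import get_config, reset_config

cfg = get_config()              # loads the YAML config (or built-in defaults) on first call
assert get_config() is cfg      # subsequent calls return the cached instance

reset_config()                  # drop the cached instance (e.g. between tests)
cfg = get_config(reload=True)   # or force a reload from disk explicitly
```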

## Development

### Running Tests

```bash
pytest tests/ -v
```

### Type Checking

```bash
mypy src/ccw_litellm
```

## License

MIT

ccw-litellm/litellm-config.yaml.example (new file, 53 lines)
@@ -0,0 +1,53 @@
# LiteLLM Unified Configuration
# Copy to ~/.ccw/config/litellm-config.yaml

version: 1

# Default provider for LLM calls
default_provider: openai

# Provider configurations
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1

  anthropic:
    api_key: ${ANTHROPIC_API_KEY}

  ollama:
    api_base: http://localhost:11434

  azure:
    api_key: ${AZURE_API_KEY}
    api_base: ${AZURE_API_BASE}

# LLM model configurations
llm_models:
  default:
    provider: openai
    model: gpt-4o
  fast:
    provider: openai
    model: gpt-4o-mini
  claude:
    provider: anthropic
    model: claude-sonnet-4-20250514
  local:
    provider: ollama
    model: llama3.2

# Embedding model configurations
embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
  large:
    provider: openai
    model: text-embedding-3-large
    dimensions: 3072
  ada:
    provider: openai
    model: text-embedding-ada-002
    dimensions: 1536

ccw-litellm/pyproject.toml (new file, 35 lines)
@@ -0,0 +1,35 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "ccw-litellm"
version = "0.1.0"
description = "Unified LiteLLM interface layer shared by ccw and codex-lens"
requires-python = ">=3.10"
authors = [{ name = "ccw-litellm contributors" }]
dependencies = [
    "litellm>=1.0.0",
    "pyyaml",
    "numpy",
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0",
]

[project.scripts]
ccw-litellm = "ccw_litellm.cli:main"

[tool.setuptools]
package-dir = { "" = "src" }

[tool.setuptools.packages.find]
where = ["src"]
include = ["ccw_litellm*"]

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q"

ccw-litellm/src/ccw_litellm.egg-info/PKG-INFO (new file, 12 lines)
@@ -0,0 +1,12 @@
Metadata-Version: 2.4
Name: ccw-litellm
Version: 0.1.0
Summary: Unified LiteLLM interface layer shared by ccw and codex-lens
Author: ccw-litellm contributors
Requires-Python: >=3.10
Requires-Dist: litellm>=1.0.0
Requires-Dist: pyyaml
Requires-Dist: numpy
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"

ccw-litellm/src/ccw_litellm.egg-info/SOURCES.txt (new file, 17 lines)
@@ -0,0 +1,17 @@
pyproject.toml
src/ccw_litellm/__init__.py
src/ccw_litellm.egg-info/PKG-INFO
src/ccw_litellm.egg-info/SOURCES.txt
src/ccw_litellm.egg-info/dependency_links.txt
src/ccw_litellm.egg-info/requires.txt
src/ccw_litellm.egg-info/top_level.txt
src/ccw_litellm/clients/__init__.py
src/ccw_litellm/clients/litellm_embedder.py
src/ccw_litellm/clients/litellm_llm.py
src/ccw_litellm/config/__init__.py
src/ccw_litellm/config/loader.py
src/ccw_litellm/config/models.py
src/ccw_litellm/interfaces/__init__.py
src/ccw_litellm/interfaces/embedder.py
src/ccw_litellm/interfaces/llm.py
tests/test_interfaces.py

ccw-litellm/src/ccw_litellm.egg-info/dependency_links.txt (new file, 1 line; header inferred from SOURCES.txt and the one-line blank hunk below)
@@ -0,0 +1 @@


ccw-litellm/src/ccw_litellm.egg-info/requires.txt (new file, 7 lines)
@@ -0,0 +1,7 @@
litellm>=1.0.0
pyyaml
numpy
pydantic>=2.0

[dev]
pytest>=7.0

ccw-litellm/src/ccw_litellm.egg-info/top_level.txt (new file, 1 line)
@@ -0,0 +1 @@
ccw_litellm

ccw-litellm/src/ccw_litellm/__init__.py (new file, 47 lines)
@@ -0,0 +1,47 @@
"""ccw-litellm package.

This package provides a small, stable interface layer around LiteLLM to share
between the ccw and codex-lens projects.
"""

from __future__ import annotations

from .clients import LiteLLMClient, LiteLLMEmbedder
from .config import (
    EmbeddingModelConfig,
    LiteLLMConfig,
    LLMModelConfig,
    ProviderConfig,
    get_config,
    load_config,
    reset_config,
)
from .interfaces import (
    AbstractEmbedder,
    AbstractLLMClient,
    ChatMessage,
    LLMResponse,
)

__version__ = "0.1.0"

__all__ = [
    "__version__",
    # Abstract interfaces
    "AbstractEmbedder",
    "AbstractLLMClient",
    "ChatMessage",
    "LLMResponse",
    # Client implementations
    "LiteLLMClient",
    "LiteLLMEmbedder",
    # Configuration
    "LiteLLMConfig",
    "ProviderConfig",
    "LLMModelConfig",
    "EmbeddingModelConfig",
    "load_config",
    "get_config",
    "reset_config",
]

ccw-litellm/src/ccw_litellm/cli.py (new file, 108 lines)
@@ -0,0 +1,108 @@
"""CLI entry point for ccw-litellm."""

from __future__ import annotations

import argparse
import json
import sys
from pathlib import Path


def main() -> int:
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(
        prog="ccw-litellm",
        description="Unified LiteLLM interface for ccw and codex-lens",
    )
    subparsers = parser.add_subparsers(dest="command", help="Available commands")

    # config command
    config_parser = subparsers.add_parser("config", help="Show configuration")
    config_parser.add_argument(
        "--path",
        type=Path,
        help="Configuration file path",
    )

    # embed command
    embed_parser = subparsers.add_parser("embed", help="Generate embeddings")
    embed_parser.add_argument("texts", nargs="+", help="Texts to embed")
    embed_parser.add_argument(
        "--model",
        default="default",
        help="Embedding model name (default: default)",
    )
    embed_parser.add_argument(
        "--output",
        choices=["json", "shape"],
        default="shape",
        help="Output format (default: shape)",
    )

    # chat command
    chat_parser = subparsers.add_parser("chat", help="Chat with LLM")
    chat_parser.add_argument("message", help="Message to send")
    chat_parser.add_argument(
        "--model",
        default="default",
        help="LLM model name (default: default)",
    )

    # version command
    subparsers.add_parser("version", help="Show version")

    args = parser.parse_args()

    if args.command == "version":
        from . import __version__

        print(f"ccw-litellm {__version__}")
        return 0

    if args.command == "config":
        from .config import get_config

        try:
            config = get_config(config_path=args.path if hasattr(args, "path") else None)
            print(config.model_dump_json(indent=2))
        except Exception as e:
            print(f"Error loading config: {e}", file=sys.stderr)
            return 1
        return 0

    if args.command == "embed":
        from .clients import LiteLLMEmbedder

        try:
            embedder = LiteLLMEmbedder(model=args.model)
            vectors = embedder.embed(args.texts)

            if args.output == "json":
                print(json.dumps(vectors.tolist()))
            else:
                print(f"Shape: {vectors.shape}")
                print(f"Dimensions: {embedder.dimensions}")
        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)
            return 1
        return 0

    if args.command == "chat":
        from .clients import LiteLLMClient
        from .interfaces import ChatMessage

        try:
            client = LiteLLMClient(model=args.model)
            response = client.chat([ChatMessage(role="user", content=args.message)])
            print(response.content)
        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)
            return 1
        return 0

    parser.print_help()
    return 0


if __name__ == "__main__":
    sys.exit(main())

ccw-litellm/src/ccw_litellm/clients/__init__.py (new file, 12 lines)
@@ -0,0 +1,12 @@
"""Client implementations for ccw-litellm."""

from __future__ import annotations

from .litellm_embedder import LiteLLMEmbedder
from .litellm_llm import LiteLLMClient

__all__ = [
    "LiteLLMClient",
    "LiteLLMEmbedder",
]

ccw-litellm/src/ccw_litellm/clients/litellm_embedder.py (new file, 170 lines)
@@ -0,0 +1,170 @@
"""LiteLLM embedder implementation for text embeddings."""

from __future__ import annotations

import logging
from typing import Any, Sequence

import litellm
import numpy as np
from numpy.typing import NDArray

from ..config import LiteLLMConfig, get_config
from ..interfaces.embedder import AbstractEmbedder

logger = logging.getLogger(__name__)


class LiteLLMEmbedder(AbstractEmbedder):
    """LiteLLM embedder implementation.

    Supports multiple embedding providers (OpenAI, etc.) through LiteLLM's unified interface.

    Example:
        embedder = LiteLLMEmbedder(model="default")
        vectors = embedder.embed(["Hello world", "Another text"])
        print(vectors.shape)  # (2, 1536)
    """

    def __init__(
        self,
        model: str = "default",
        config: LiteLLMConfig | None = None,
        **litellm_kwargs: Any,
    ) -> None:
        """Initialize LiteLLM embedder.

        Args:
            model: Model name from configuration (default: "default")
            config: Configuration instance (default: use global config)
            **litellm_kwargs: Additional arguments to pass to litellm.embedding()
        """
        self._config = config or get_config()
        self._model_name = model
        self._litellm_kwargs = litellm_kwargs

        # Get embedding model configuration
        try:
            self._model_config = self._config.get_embedding_model(model)
        except ValueError as e:
            logger.error(f"Failed to get embedding model configuration: {e}")
            raise

        # Get provider configuration
        try:
            self._provider_config = self._config.get_provider(self._model_config.provider)
        except ValueError as e:
            logger.error(f"Failed to get provider configuration: {e}")
            raise

        # Set up LiteLLM environment
        self._setup_litellm()

    def _setup_litellm(self) -> None:
        """Configure LiteLLM with provider settings."""
        provider = self._model_config.provider

        # Set API key
        if self._provider_config.api_key:
            litellm.api_key = self._provider_config.api_key
            # Also set provider-specific keys
            if provider == "openai":
                litellm.openai_key = self._provider_config.api_key
            elif provider == "anthropic":
                litellm.anthropic_key = self._provider_config.api_key

        # Set API base
        if self._provider_config.api_base:
            litellm.api_base = self._provider_config.api_base

    def _format_model_name(self) -> str:
        """Format model name for LiteLLM.

        Returns:
            Formatted model name (e.g., "text-embedding-3-small")
        """
        provider = self._model_config.provider
        model = self._model_config.model

        # For some providers, LiteLLM expects an explicit prefix
        if provider in ["azure", "vertex_ai", "bedrock"]:
            return f"{provider}/{model}"

        return model

    @property
    def dimensions(self) -> int:
        """Embedding vector size."""
        return self._model_config.dimensions

    def embed(
        self,
        texts: str | Sequence[str],
        *,
        batch_size: int | None = None,
        **kwargs: Any,
    ) -> NDArray[np.floating]:
        """Embed one or more texts.

        Args:
            texts: Single text or sequence of texts
            batch_size: Batch size for processing (currently unused, LiteLLM handles batching)
            **kwargs: Additional arguments for litellm.embedding()

        Returns:
            A numpy array of shape (n_texts, dimensions).

        Raises:
            Exception: If LiteLLM embedding fails
        """
        # Normalize input to a list; the result is always a 2-D array,
        # so a single string is treated as a batch of one.
        if isinstance(texts, str):
            text_list = [texts]
        else:
            text_list = list(texts)

        if not text_list:
            # Return empty array with correct shape
            return np.empty((0, self.dimensions), dtype=np.float32)

        # Merge kwargs
        embedding_kwargs = {**self._litellm_kwargs, **kwargs}

        try:
            # Call LiteLLM embedding
            response = litellm.embedding(
                model=self._format_model_name(),
                input=text_list,
                **embedding_kwargs,
            )

            # Extract embeddings
            embeddings = [item["embedding"] for item in response.data]

            # Convert to numpy array
            result = np.array(embeddings, dtype=np.float32)

            # Validate dimensions
            if result.shape[1] != self.dimensions:
                logger.warning(
                    f"Expected {self.dimensions} dimensions, got {result.shape[1]}. "
                    f"Configuration may be incorrect."
                )

            return result

        except Exception as e:
            logger.error(f"LiteLLM embedding failed: {e}")
            raise

    @property
    def model_name(self) -> str:
        """Get configured model name."""
        return self._model_name

    @property
    def provider(self) -> str:
        """Get configured provider name."""
        return self._model_config.provider

ccw-litellm/src/ccw_litellm/clients/litellm_llm.py (new file, 165 lines)
@@ -0,0 +1,165 @@
"""LiteLLM client implementation for LLM operations."""

from __future__ import annotations

import logging
from typing import Any, Sequence

import litellm

from ..config import LiteLLMConfig, get_config
from ..interfaces.llm import AbstractLLMClient, ChatMessage, LLMResponse

logger = logging.getLogger(__name__)


class LiteLLMClient(AbstractLLMClient):
    """LiteLLM client implementation.

    Supports multiple providers (OpenAI, Anthropic, etc.) through LiteLLM's unified interface.

    Example:
        client = LiteLLMClient(model="default")
        response = client.chat([
            ChatMessage(role="user", content="Hello!")
        ])
        print(response.content)
    """

    def __init__(
        self,
        model: str = "default",
        config: LiteLLMConfig | None = None,
        **litellm_kwargs: Any,
    ) -> None:
        """Initialize LiteLLM client.

        Args:
            model: Model name from configuration (default: "default")
            config: Configuration instance (default: use global config)
            **litellm_kwargs: Additional arguments to pass to litellm.completion()
        """
        self._config = config or get_config()
        self._model_name = model
        self._litellm_kwargs = litellm_kwargs

        # Get model configuration
        try:
            self._model_config = self._config.get_llm_model(model)
        except ValueError as e:
            logger.error(f"Failed to get model configuration: {e}")
            raise

        # Get provider configuration
        try:
            self._provider_config = self._config.get_provider(self._model_config.provider)
        except ValueError as e:
            logger.error(f"Failed to get provider configuration: {e}")
            raise

        # Set up LiteLLM environment
        self._setup_litellm()

    def _setup_litellm(self) -> None:
        """Configure LiteLLM with provider settings."""
        provider = self._model_config.provider

        # Set API key
        if self._provider_config.api_key:
            litellm.api_key = self._provider_config.api_key
            # Also set provider-specific keys
            if provider == "openai":
                litellm.openai_key = self._provider_config.api_key
            elif provider == "anthropic":
                litellm.anthropic_key = self._provider_config.api_key

        # Set API base
        if self._provider_config.api_base:
            litellm.api_base = self._provider_config.api_base

    def _format_model_name(self) -> str:
        """Format model name for LiteLLM.

        Returns:
            Formatted model name (e.g., "gpt-4", "anthropic/claude-3-opus-20240229")
        """
        # LiteLLM expects model names in format: "provider/model" or just "model"
        provider = self._model_config.provider
        model = self._model_config.model

        # For some providers, LiteLLM expects an explicit prefix
        if provider in ["anthropic", "azure", "vertex_ai", "bedrock"]:
            return f"{provider}/{model}"

        return model

    def chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> LLMResponse:
        """Chat completion for a sequence of messages.

        Args:
            messages: Sequence of chat messages
            **kwargs: Additional arguments for litellm.completion()

        Returns:
            LLM response with content and raw response

        Raises:
            Exception: If LiteLLM completion fails
        """
        # Convert messages to LiteLLM format
        litellm_messages = [
            {"role": msg.role, "content": msg.content} for msg in messages
        ]

        # Merge kwargs
        completion_kwargs = {**self._litellm_kwargs, **kwargs}

        try:
            # Call LiteLLM
            response = litellm.completion(
                model=self._format_model_name(),
                messages=litellm_messages,
                **completion_kwargs,
            )

            # Extract content
            content = response.choices[0].message.content or ""

            return LLMResponse(content=content, raw=response)

        except Exception as e:
            logger.error(f"LiteLLM completion failed: {e}")
            raise

    def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
        """Text completion for a prompt.

        Args:
            prompt: Input prompt
            **kwargs: Additional arguments for litellm.completion()

        Returns:
            LLM response with content and raw response

        Raises:
            Exception: If LiteLLM completion fails
        """
        # Convert to chat format (most modern models use the chat interface)
        messages = [ChatMessage(role="user", content=prompt)]
        return self.chat(messages, **kwargs)

    @property
    def model_name(self) -> str:
        """Get configured model name."""
        return self._model_name

    @property
    def provider(self) -> str:
        """Get configured provider name."""
        return self._model_config.provider

ccw-litellm/src/ccw_litellm/config/__init__.py (new file, 22 lines)
@@ -0,0 +1,22 @@
"""Configuration management for LiteLLM integration."""

from __future__ import annotations

from .loader import get_config, load_config, reset_config
from .models import (
    EmbeddingModelConfig,
    LiteLLMConfig,
    LLMModelConfig,
    ProviderConfig,
)

__all__ = [
    "LiteLLMConfig",
    "ProviderConfig",
    "LLMModelConfig",
    "EmbeddingModelConfig",
    "load_config",
    "get_config",
    "reset_config",
]

ccw-litellm/src/ccw_litellm/config/loader.py (new file, 150 lines)
@@ -0,0 +1,150 @@
"""Configuration loader with environment variable substitution."""

from __future__ import annotations

import os
import re
from pathlib import Path
from typing import Any

import yaml

from .models import LiteLLMConfig

# Default configuration path
DEFAULT_CONFIG_PATH = Path.home() / ".ccw" / "config" / "litellm-config.yaml"

# Global configuration singleton
_config_instance: LiteLLMConfig | None = None


def _substitute_env_vars(value: Any) -> Any:
    """Recursively substitute environment variables in configuration values.

    Supports ${ENV_VAR} and ${ENV_VAR:-default} syntax.

    Args:
        value: Configuration value (str, dict, list, or primitive)

    Returns:
        Value with environment variables substituted
    """
    if isinstance(value, str):
        # Pattern: ${VAR} or ${VAR:-default}
        pattern = r"\$\{([^:}]+)(?::-(.*?))?\}"

        def replace_var(match: re.Match) -> str:
            var_name = match.group(1)
            default_value = match.group(2) if match.group(2) is not None else ""
            return os.environ.get(var_name, default_value)

        return re.sub(pattern, replace_var, value)

    if isinstance(value, dict):
        return {k: _substitute_env_vars(v) for k, v in value.items()}

    if isinstance(value, list):
        return [_substitute_env_vars(item) for item in value]

    return value


def _get_default_config() -> dict[str, Any]:
    """Get default configuration when no config file exists.

    Returns:
        Default configuration dictionary
    """
    return {
        "version": 1,
        "default_provider": "openai",
        "providers": {
            "openai": {
                "api_key": "${OPENAI_API_KEY}",
                "api_base": "https://api.openai.com/v1",
            },
        },
        "llm_models": {
            "default": {
                "provider": "openai",
                "model": "gpt-4",
            },
            "fast": {
                "provider": "openai",
                "model": "gpt-3.5-turbo",
            },
        },
        "embedding_models": {
            "default": {
                "provider": "openai",
                "model": "text-embedding-3-small",
                "dimensions": 1536,
            },
        },
    }


def load_config(config_path: Path | str | None = None) -> LiteLLMConfig:
    """Load LiteLLM configuration from YAML file.

    Falls back to the built-in default configuration when the file does not exist.

    Args:
        config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)

    Returns:
        Parsed and validated configuration

    Raises:
        ValueError: If the configuration file cannot be read or the configuration is invalid
    """
    if config_path is None:
        config_path = DEFAULT_CONFIG_PATH
    else:
        config_path = Path(config_path)

    # Load configuration
    if config_path.exists():
        try:
            with open(config_path, "r", encoding="utf-8") as f:
                raw_config = yaml.safe_load(f)
        except Exception as e:
            raise ValueError(f"Failed to load configuration from {config_path}: {e}") from e
    else:
        # Use default configuration
        raw_config = _get_default_config()

    # Substitute environment variables
    config_data = _substitute_env_vars(raw_config)

    # Validate and parse with Pydantic
    try:
        return LiteLLMConfig.model_validate(config_data)
    except Exception as e:
        raise ValueError(f"Invalid configuration: {e}") from e


def get_config(config_path: Path | str | None = None, reload: bool = False) -> LiteLLMConfig:
    """Get global configuration singleton.

    Args:
        config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)
        reload: Force reload configuration from disk

    Returns:
        Global configuration instance
    """
    global _config_instance

    if _config_instance is None or reload:
        _config_instance = load_config(config_path)

    return _config_instance


def reset_config() -> None:
    """Reset global configuration singleton.

    Useful for testing.
    """
    global _config_instance
    _config_instance = None

ccw-litellm/src/ccw_litellm/config/models.py (new file, 130 lines)
@@ -0,0 +1,130 @@
"""Pydantic configuration models for LiteLLM integration."""

from __future__ import annotations

from typing import Any

from pydantic import BaseModel, Field


class ProviderConfig(BaseModel):
    """Provider API configuration.

    Supports environment variable substitution in the format ${ENV_VAR}.
    """

    api_key: str | None = None
    api_base: str | None = None

    model_config = {"extra": "allow"}


class LLMModelConfig(BaseModel):
    """LLM model configuration."""

    provider: str
    model: str

    model_config = {"extra": "allow"}


class EmbeddingModelConfig(BaseModel):
    """Embedding model configuration."""

    provider: str  # "openai", "fastembed", "ollama", etc.
    model: str
    dimensions: int

    model_config = {"extra": "allow"}


class LiteLLMConfig(BaseModel):
    """Root configuration for LiteLLM integration.

    Example YAML:
        version: 1
        default_provider: openai
        providers:
          openai:
            api_key: ${OPENAI_API_KEY}
            api_base: https://api.openai.com/v1
          anthropic:
            api_key: ${ANTHROPIC_API_KEY}
        llm_models:
          default:
            provider: openai
            model: gpt-4
          fast:
            provider: openai
            model: gpt-3.5-turbo
        embedding_models:
          default:
            provider: openai
            model: text-embedding-3-small
            dimensions: 1536
    """

    version: int = 1
    default_provider: str = "openai"
    providers: dict[str, ProviderConfig] = Field(default_factory=dict)
    llm_models: dict[str, LLMModelConfig] = Field(default_factory=dict)
    embedding_models: dict[str, EmbeddingModelConfig] = Field(default_factory=dict)

    model_config = {"extra": "allow"}

    def get_llm_model(self, model: str = "default") -> LLMModelConfig:
        """Get LLM model configuration by name.

        Args:
            model: Model name or "default"

        Returns:
            LLM model configuration

        Raises:
            ValueError: If model not found
        """
        if model not in self.llm_models:
            raise ValueError(
                f"LLM model '{model}' not found in configuration. "
                f"Available models: {list(self.llm_models.keys())}"
            )
        return self.llm_models[model]

    def get_embedding_model(self, model: str = "default") -> EmbeddingModelConfig:
        """Get embedding model configuration by name.

        Args:
            model: Model name or "default"

        Returns:
            Embedding model configuration

        Raises:
            ValueError: If model not found
        """
        if model not in self.embedding_models:
            raise ValueError(
                f"Embedding model '{model}' not found in configuration. "
                f"Available models: {list(self.embedding_models.keys())}"
            )
        return self.embedding_models[model]

    def get_provider(self, provider: str) -> ProviderConfig:
        """Get provider configuration by name.

        Args:
            provider: Provider name

        Returns:
            Provider configuration

        Raises:
            ValueError: If provider not found
        """
        if provider not in self.providers:
            raise ValueError(
                f"Provider '{provider}' not found in configuration. "
                f"Available providers: {list(self.providers.keys())}"
            )
        return self.providers[provider]

ccw-litellm/src/ccw_litellm/interfaces/__init__.py (new file, 14 lines)
@@ -0,0 +1,14 @@
"""Abstract interfaces for ccw-litellm."""

from __future__ import annotations

from .embedder import AbstractEmbedder
from .llm import AbstractLLMClient, ChatMessage, LLMResponse

__all__ = [
    "AbstractEmbedder",
    "AbstractLLMClient",
    "ChatMessage",
    "LLMResponse",
]

ccw-litellm/src/ccw_litellm/interfaces/embedder.py (new file, 52 lines)
@@ -0,0 +1,52 @@
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from typing import Any, Sequence

import numpy as np
from numpy.typing import NDArray


class AbstractEmbedder(ABC):
    """Embedding interface compatible with fastembed-style embedders.

    Implementers only need to provide the synchronous `embed` method; an
    asynchronous `aembed` wrapper is provided for convenience.
    """

    @property
    @abstractmethod
    def dimensions(self) -> int:
        """Embedding vector size."""

    @abstractmethod
    def embed(
        self,
        texts: str | Sequence[str],
        *,
        batch_size: int | None = None,
        **kwargs: Any,
    ) -> NDArray[np.floating]:
        """Embed one or more texts.

        Returns:
            A numpy array of shape (n_texts, dimensions).
        """

    async def aembed(
        self,
        texts: str | Sequence[str],
        *,
        batch_size: int | None = None,
        **kwargs: Any,
    ) -> NDArray[np.floating]:
        """Async wrapper around `embed` using a worker thread by default."""
        return await asyncio.to_thread(
            self.embed,
            texts,
            batch_size=batch_size,
            **kwargs,
        )

ccw-litellm/src/ccw_litellm/interfaces/llm.py (new file, 45 lines)
@@ -0,0 +1,45 @@
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Literal, Sequence


@dataclass(frozen=True, slots=True)
class ChatMessage:
    role: Literal["system", "user", "assistant", "tool"]
    content: str


@dataclass(frozen=True, slots=True)
class LLMResponse:
    content: str
    raw: Any | None = None


class AbstractLLMClient(ABC):
    """LiteLLM-like client interface.

    Implementers only need to provide synchronous methods; async wrappers are
    provided via `asyncio.to_thread`.
    """

    @abstractmethod
    def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
        """Chat completion for a sequence of messages."""

    @abstractmethod
    def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
        """Text completion for a prompt."""

    async def achat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
        """Async wrapper around `chat` using a worker thread by default."""
        return await asyncio.to_thread(self.chat, messages, **kwargs)

    async def acomplete(self, prompt: str, **kwargs: Any) -> LLMResponse:
        """Async wrapper around `complete` using a worker thread by default."""
        return await asyncio.to_thread(self.complete, prompt, **kwargs)

ccw-litellm/tests/conftest.py (new file, 11 lines)
@@ -0,0 +1,11 @@
from __future__ import annotations

import sys
from pathlib import Path


def pytest_configure() -> None:
    project_root = Path(__file__).resolve().parents[1]
    src_dir = project_root / "src"
    sys.path.insert(0, str(src_dir))

ccw-litellm/tests/test_interfaces.py (new file, 64 lines)
@@ -0,0 +1,64 @@
from __future__ import annotations

import asyncio
from typing import Any, Sequence

import numpy as np

from ccw_litellm.interfaces import AbstractEmbedder, AbstractLLMClient, ChatMessage, LLMResponse


class _DummyEmbedder(AbstractEmbedder):
    @property
    def dimensions(self) -> int:
        return 3

    def embed(
        self,
        texts: str | Sequence[str],
        *,
        batch_size: int | None = None,
        **kwargs: Any,
    ) -> np.ndarray:
        if isinstance(texts, str):
            texts = [texts]
        _ = batch_size
        _ = kwargs
        return np.zeros((len(texts), self.dimensions), dtype=np.float32)


class _DummyLLM(AbstractLLMClient):
    def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
        _ = kwargs
        return LLMResponse(content="".join(m.content for m in messages))

    def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
        _ = kwargs
        return LLMResponse(content=prompt)


def test_embed_sync_shape_and_dtype() -> None:
    emb = _DummyEmbedder()
    out = emb.embed(["a", "b"])
    assert out.shape == (2, 3)
    assert out.dtype == np.float32


def test_embed_async_wrapper() -> None:
    emb = _DummyEmbedder()
    out = asyncio.run(emb.aembed("x"))
    assert out.shape == (1, 3)


def test_llm_sync() -> None:
    llm = _DummyLLM()
    out = llm.chat([ChatMessage(role="user", content="hi")])
    assert out == LLMResponse(content="hi")


def test_llm_async_wrappers() -> None:
    llm = _DummyLLM()
    out1 = asyncio.run(llm.achat([ChatMessage(role="user", content="a")]))
    out2 = asyncio.run(llm.acomplete("b"))
    assert out1.content == "a"
    assert out2.content == "b"

ccw/src/config/litellm-api-config-manager.ts (new file, 360 lines)
@@ -0,0 +1,360 @@
|
|||||||
|
/**
|
||||||
|
* LiteLLM API Configuration Manager
|
||||||
|
* Manages provider credentials, custom endpoints, and cache settings
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { existsSync, readFileSync, writeFileSync } from 'fs';
|
||||||
|
import { join } from 'path';
|
||||||
|
import { StoragePaths, ensureStorageDir } from './storage-paths.js';
|
||||||
|
import type {
|
||||||
|
LiteLLMApiConfig,
|
||||||
|
ProviderCredential,
|
||||||
|
CustomEndpoint,
|
||||||
|
GlobalCacheSettings,
|
||||||
|
ProviderType,
|
||||||
|
CacheStrategy,
|
||||||
|
} from '../types/litellm-api-config.js';
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default configuration
|
||||||
|
*/
|
||||||
|
function getDefaultConfig(): LiteLLMApiConfig {
|
||||||
|
return {
|
||||||
|
version: 1,
|
||||||
|
providers: [],
|
||||||
|
endpoints: [],
|
||||||
|
globalCacheSettings: {
|
||||||
|
enabled: true,
|
||||||
|
cacheDir: '~/.ccw/cache/context',
|
||||||
|
maxTotalSizeMB: 100,
|
||||||
|
},
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get config file path for a project
|
||||||
|
*/
|
||||||
|
function getConfigPath(baseDir: string): string {
|
||||||
|
const paths = StoragePaths.project(baseDir);
|
||||||
|
ensureStorageDir(paths.config);
|
||||||
|
return join(paths.config, 'litellm-api-config.json');
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load configuration from file
|
||||||
|
*/
|
||||||
|
export function loadLiteLLMApiConfig(baseDir: string): LiteLLMApiConfig {
|
||||||
|
const configPath = getConfigPath(baseDir);
|
||||||
|
|
||||||
|
if (!existsSync(configPath)) {
|
||||||
|
return getDefaultConfig();
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
const content = readFileSync(configPath, 'utf-8');
|
||||||
|
return JSON.parse(content) as LiteLLMApiConfig;
|
||||||
|
} catch (error) {
|
||||||
|
console.error('[LiteLLM Config] Failed to load config:', error);
|
||||||
|
return getDefaultConfig();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Save configuration to file
|
||||||
|
*/
|
||||||
|
function saveConfig(baseDir: string, config: LiteLLMApiConfig): void {
|
||||||
|
const configPath = getConfigPath(baseDir);
|
||||||
|
writeFileSync(configPath, JSON.stringify(config, null, 2), 'utf-8');
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resolve environment variables in API key
|
||||||
|
* Supports ${ENV_VAR} syntax
|
||||||
|
*/
|
||||||
|
export function resolveEnvVar(value: string): string {
|
||||||
|
if (!value) return value;
|
||||||
|
|
||||||
|
const envVarMatch = value.match(/^\$\{(.+)\}$/);
|
||||||
|
if (envVarMatch) {
|
||||||
|
const envVarName = envVarMatch[1];
|
||||||
|
return process.env[envVarName] || '';
|
||||||
|
}
|
||||||
|
|
||||||
|
return value;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ===========================
|
||||||
|
// Provider Management
|
||||||
|
// ===========================
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get all providers
|
||||||
|
*/
|
||||||
|
export function getAllProviders(baseDir: string): ProviderCredential[] {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
return config.providers;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get provider by ID
|
||||||
|
*/
|
||||||
|
export function getProvider(baseDir: string, providerId: string): ProviderCredential | null {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
return config.providers.find((p) => p.id === providerId) || null;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get provider with resolved environment variables
|
||||||
|
*/
|
||||||
|
export function getProviderWithResolvedEnvVars(
|
||||||
|
baseDir: string,
|
||||||
|
providerId: string
|
||||||
|
): (ProviderCredential & { resolvedApiKey: string }) | null {
|
||||||
|
const provider = getProvider(baseDir, providerId);
|
||||||
|
if (!provider) return null;
|
||||||
|
|
||||||
|
return {
|
||||||
|
...provider,
|
||||||
|
resolvedApiKey: resolveEnvVar(provider.apiKey),
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Add new provider
|
||||||
|
*/
|
||||||
|
export function addProvider(
|
||||||
|
baseDir: string,
|
||||||
|
providerData: Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>
|
||||||
|
): ProviderCredential {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
|
||||||
|
const provider: ProviderCredential = {
|
||||||
|
...providerData,
|
||||||
|
id: `${providerData.type}-${Date.now()}`,
|
||||||
|
createdAt: new Date().toISOString(),
|
||||||
|
updatedAt: new Date().toISOString(),
|
||||||
|
};
|
||||||
|
|
||||||
|
config.providers.push(provider);
|
||||||
|
saveConfig(baseDir, config);
|
||||||
|
|
||||||
|
return provider;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Update provider
|
||||||
|
*/
|
||||||
|
export function updateProvider(
|
||||||
|
baseDir: string,
|
||||||
|
providerId: string,
|
||||||
|
updates: Partial<Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>>
|
||||||
|
): ProviderCredential {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
const providerIndex = config.providers.findIndex((p) => p.id === providerId);
|
||||||
|
|
||||||
|
if (providerIndex === -1) {
|
||||||
|
throw new Error(`Provider not found: ${providerId}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
config.providers[providerIndex] = {
|
||||||
|
...config.providers[providerIndex],
|
||||||
|
...updates,
|
||||||
|
updatedAt: new Date().toISOString(),
|
||||||
|
};
|
||||||
|
|
||||||
|
saveConfig(baseDir, config);
|
||||||
|
return config.providers[providerIndex];
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Delete provider
|
||||||
|
*/
|
||||||
|
export function deleteProvider(baseDir: string, providerId: string): boolean {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
const initialLength = config.providers.length;
|
||||||
|
|
||||||
|
config.providers = config.providers.filter((p) => p.id !== providerId);
|
||||||
|
|
||||||
|
if (config.providers.length === initialLength) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Also remove endpoints using this provider
|
||||||
|
config.endpoints = config.endpoints.filter((e) => e.providerId !== providerId);
|
||||||
|
|
||||||
|
saveConfig(baseDir, config);
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ===========================
|
||||||
|
// Endpoint Management
|
||||||
|
// ===========================
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get all endpoints
|
||||||
|
*/
|
||||||
|
export function getAllEndpoints(baseDir: string): CustomEndpoint[] {
|
||||||
|
const config = loadLiteLLMApiConfig(baseDir);
|
||||||
|
return config.endpoints;
|
||||||
|
}
|
||||||
|
|
||||||
/**
 * Get endpoint by ID
 */
export function getEndpoint(baseDir: string, endpointId: string): CustomEndpoint | null {
  const config = loadLiteLLMApiConfig(baseDir);
  return config.endpoints.find((e) => e.id === endpointId) || null;
}

/**
 * Find endpoint by ID (alias for getEndpoint)
 */
export function findEndpointById(baseDir: string, endpointId: string): CustomEndpoint | null {
  return getEndpoint(baseDir, endpointId);
}

/**
 * Add new endpoint
 */
export function addEndpoint(
  baseDir: string,
  endpointData: Omit<CustomEndpoint, 'createdAt' | 'updatedAt'>
): CustomEndpoint {
  const config = loadLiteLLMApiConfig(baseDir);

  // Check if ID already exists
  if (config.endpoints.some((e) => e.id === endpointData.id)) {
    throw new Error(`Endpoint ID already exists: ${endpointData.id}`);
  }

  // Verify provider exists
  if (!config.providers.find((p) => p.id === endpointData.providerId)) {
    throw new Error(`Provider not found: ${endpointData.providerId}`);
  }

  const endpoint: CustomEndpoint = {
    ...endpointData,
    createdAt: new Date().toISOString(),
    updatedAt: new Date().toISOString(),
  };

  config.endpoints.push(endpoint);
  saveConfig(baseDir, config);

  return endpoint;
}

/**
 * Update endpoint
 */
export function updateEndpoint(
  baseDir: string,
  endpointId: string,
  updates: Partial<Omit<CustomEndpoint, 'id' | 'createdAt' | 'updatedAt'>>
): CustomEndpoint {
  const config = loadLiteLLMApiConfig(baseDir);
  const endpointIndex = config.endpoints.findIndex((e) => e.id === endpointId);

  if (endpointIndex === -1) {
    throw new Error(`Endpoint not found: ${endpointId}`);
  }

  // Verify provider exists if updating providerId
  if (updates.providerId && !config.providers.find((p) => p.id === updates.providerId)) {
    throw new Error(`Provider not found: ${updates.providerId}`);
  }

  config.endpoints[endpointIndex] = {
    ...config.endpoints[endpointIndex],
    ...updates,
    updatedAt: new Date().toISOString(),
  };

  saveConfig(baseDir, config);
  return config.endpoints[endpointIndex];
}

/**
 * Delete endpoint
 */
export function deleteEndpoint(baseDir: string, endpointId: string): boolean {
  const config = loadLiteLLMApiConfig(baseDir);
  const initialLength = config.endpoints.length;

  config.endpoints = config.endpoints.filter((e) => e.id !== endpointId);

  if (config.endpoints.length === initialLength) {
    return false;
  }

  // Clear default endpoint if deleted
  if (config.defaultEndpoint === endpointId) {
    delete config.defaultEndpoint;
  }

  saveConfig(baseDir, config);
  return true;
}

// ===========================
// Default Endpoint Management
// ===========================

/**
 * Get default endpoint
 */
export function getDefaultEndpoint(baseDir: string): string | undefined {
  const config = loadLiteLLMApiConfig(baseDir);
  return config.defaultEndpoint;
}

/**
 * Set default endpoint
 */
export function setDefaultEndpoint(baseDir: string, endpointId?: string): void {
  const config = loadLiteLLMApiConfig(baseDir);

  if (endpointId) {
    // Verify endpoint exists
    if (!config.endpoints.find((e) => e.id === endpointId)) {
      throw new Error(`Endpoint not found: ${endpointId}`);
    }
    config.defaultEndpoint = endpointId;
  } else {
    delete config.defaultEndpoint;
  }

  saveConfig(baseDir, config);
}

// ===========================
// Cache Settings Management
// ===========================

/**
 * Get global cache settings
 */
export function getGlobalCacheSettings(baseDir: string): GlobalCacheSettings {
  const config = loadLiteLLMApiConfig(baseDir);
  return config.globalCacheSettings;
}

/**
 * Update global cache settings
 */
export function updateGlobalCacheSettings(
  baseDir: string,
  settings: Partial<GlobalCacheSettings>
): void {
  const config = loadLiteLLMApiConfig(baseDir);

  config.globalCacheSettings = {
    ...config.globalCacheSettings,
    ...settings,
  };

  saveConfig(baseDir, config);
}

// Re-export types
export type { ProviderCredential, CustomEndpoint, ProviderType, CacheStrategy };
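// Usage sketch (illustrative, not part of the committed file): wiring the CRUD
// helpers above together. The baseDir value and any endpoint fields beyond the
// required id/name/providerId/model are assumptions for illustration.
//
//   const endpoint = addEndpoint('/home/user/.ccw', {
//     id: 'fast-gpt',
//     name: 'Fast GPT',
//     providerId: 'openai-main',
//     model: 'gpt-4o-mini',
//   });
//   setDefaultEndpoint('/home/user/.ccw', endpoint.id);
//   updateEndpoint('/home/user/.ccw', 'fast-gpt', { model: 'gpt-4o' });
//   deleteEndpoint('/home/user/.ccw', 'fast-gpt');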
259 ccw/src/config/provider-models.ts Normal file
@@ -0,0 +1,259 @@
/**
 * Provider Model Presets
 *
 * Predefined model information for each supported LLM provider.
 * Used for UI dropdowns and validation.
 */

import type { ProviderType } from '../types/litellm-api-config.js';

/**
 * Model information metadata
 */
export interface ModelInfo {
  /** Model identifier (used in API calls) */
  id: string;

  /** Human-readable display name */
  name: string;

  /** Context window size in tokens */
  contextWindow: number;

  /** Whether this model supports prompt caching */
  supportsCaching: boolean;
}

/**
 * Predefined models for each provider
 * Used for UI selection and validation
 */
export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
  openai: [
    { id: 'gpt-4o', name: 'GPT-4o', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4o-mini', name: 'GPT-4o Mini', contextWindow: 128000, supportsCaching: true },
    { id: 'o1', name: 'O1', contextWindow: 200000, supportsCaching: true },
    { id: 'o1-mini', name: 'O1 Mini', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4-turbo', name: 'GPT-4 Turbo', contextWindow: 128000, supportsCaching: false }
  ],

  anthropic: [
    { id: 'claude-sonnet-4-20250514', name: 'Claude Sonnet 4', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-5-sonnet-20241022', name: 'Claude 3.5 Sonnet', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-5-haiku-20241022', name: 'Claude 3.5 Haiku', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-opus-20240229', name: 'Claude 3 Opus', contextWindow: 200000, supportsCaching: false }
  ],

  ollama: [
    { id: 'llama3.2', name: 'Llama 3.2', contextWindow: 128000, supportsCaching: false },
    { id: 'llama3.1', name: 'Llama 3.1', contextWindow: 128000, supportsCaching: false },
    { id: 'qwen2.5-coder', name: 'Qwen 2.5 Coder', contextWindow: 32000, supportsCaching: false },
    { id: 'codellama', name: 'Code Llama', contextWindow: 16000, supportsCaching: false },
    { id: 'mistral', name: 'Mistral', contextWindow: 32000, supportsCaching: false }
  ],

  azure: [
    { id: 'gpt-4o', name: 'GPT-4o (Azure)', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4o-mini', name: 'GPT-4o Mini (Azure)', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4-turbo', name: 'GPT-4 Turbo (Azure)', contextWindow: 128000, supportsCaching: false },
    { id: 'gpt-35-turbo', name: 'GPT-3.5 Turbo (Azure)', contextWindow: 16000, supportsCaching: false }
  ],

  google: [
    { id: 'gemini-2.0-flash-exp', name: 'Gemini 2.0 Flash Experimental', contextWindow: 1048576, supportsCaching: true },
    { id: 'gemini-1.5-pro', name: 'Gemini 1.5 Pro', contextWindow: 2097152, supportsCaching: true },
    { id: 'gemini-1.5-flash', name: 'Gemini 1.5 Flash', contextWindow: 1048576, supportsCaching: true },
    { id: 'gemini-1.0-pro', name: 'Gemini 1.0 Pro', contextWindow: 32000, supportsCaching: false }
  ],

  mistral: [
    { id: 'mistral-large-latest', name: 'Mistral Large', contextWindow: 128000, supportsCaching: false },
    { id: 'mistral-medium-latest', name: 'Mistral Medium', contextWindow: 32000, supportsCaching: false },
    { id: 'mistral-small-latest', name: 'Mistral Small', contextWindow: 32000, supportsCaching: false },
    { id: 'codestral-latest', name: 'Codestral', contextWindow: 32000, supportsCaching: false }
  ],

  deepseek: [
    { id: 'deepseek-chat', name: 'DeepSeek Chat', contextWindow: 64000, supportsCaching: false },
    { id: 'deepseek-coder', name: 'DeepSeek Coder', contextWindow: 64000, supportsCaching: false }
  ],

  custom: [
    { id: 'custom-model', name: 'Custom Model', contextWindow: 128000, supportsCaching: false }
  ]
};

/**
 * Get models for a specific provider
 * @param providerType - Provider type to get models for
 * @returns Array of model information
 */
export function getModelsForProvider(providerType: ProviderType): ModelInfo[] {
  return PROVIDER_MODELS[providerType] || [];
}

/**
 * Get model information by ID within a provider
 * @param providerType - Provider type
 * @param modelId - Model identifier
 * @returns Model information or undefined if not found
 */
export function getModelInfo(providerType: ProviderType, modelId: string): ModelInfo | undefined {
  const models = PROVIDER_MODELS[providerType] || [];
  return models.find(m => m.id === modelId);
}

/**
 * Validate if a model ID is supported by a provider
 * @param providerType - Provider type
 * @param modelId - Model identifier to validate
 * @returns true if model is valid for provider
 */
export function isValidModel(providerType: ProviderType, modelId: string): boolean {
  return getModelInfo(providerType, modelId) !== undefined;
}
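// Usage sketch (illustrative): looking up and validating models from the
// registry above.
//
//   getModelsForProvider('anthropic').map((m) => m.name);
//   // -> ['Claude Sonnet 4', 'Claude 3.5 Sonnet', 'Claude 3.5 Haiku', 'Claude 3 Opus']
//   getModelInfo('openai', 'gpt-4o')?.contextWindow;  // 128000
//   isValidModel('ollama', 'llama3.2');               // true
//   isValidModel('ollama', 'gpt-4o');                 // false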
@@ -46,7 +46,8 @@ const MODULE_CSS_FILES = [
   '27-graph-explorer.css',
   '28-mcp-manager.css',
   '29-help.css',
-  '30-core-memory.css'
+  '30-core-memory.css',
+  '31-api-settings.css'
 ];

 const MODULE_FILES = [
@@ -95,6 +96,7 @@ const MODULE_FILES = [
   'views/skills-manager.js',
   'views/rules-manager.js',
   'views/claude-manager.js',
+  'views/api-settings.js',
   'views/help.js',
   'main.js'
 ];
485 ccw/src/core/routes/litellm-api-routes.ts Normal file
@@ -0,0 +1,485 @@
// @ts-nocheck
/**
 * LiteLLM API Routes Module
 * Handles LiteLLM provider management, endpoint configuration, and cache management
 */
import type { IncomingMessage, ServerResponse } from 'http';
import {
  getAllProviders,
  getProvider,
  addProvider,
  updateProvider,
  deleteProvider,
  getAllEndpoints,
  getEndpoint,
  addEndpoint,
  updateEndpoint,
  deleteEndpoint,
  getDefaultEndpoint,
  setDefaultEndpoint,
  getGlobalCacheSettings,
  updateGlobalCacheSettings,
  loadLiteLLMApiConfig,
  type ProviderCredential,
  type CustomEndpoint,
  type ProviderType,
} from '../../config/litellm-api-config-manager.js';
import { getContextCacheStore } from '../../tools/context-cache-store.js';
import { getLiteLLMClient } from '../../tools/litellm-client.js';

export interface RouteContext {
  pathname: string;
  url: URL;
  req: IncomingMessage;
  res: ServerResponse;
  initialPath: string;
  handlePostRequest: (req: IncomingMessage, res: ServerResponse, handler: (body: unknown) => Promise<any>) => void;
  broadcastToClients: (data: unknown) => void;
}

// ===========================
// Model Information
// ===========================

interface ModelInfo {
  id: string;
  name: string;
  provider: ProviderType;
  description?: string;
}

const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
  openai: [
    { id: 'gpt-4-turbo', name: 'GPT-4 Turbo', provider: 'openai', description: '128K context' },
    { id: 'gpt-4', name: 'GPT-4', provider: 'openai', description: '8K context' },
    { id: 'gpt-3.5-turbo', name: 'GPT-3.5 Turbo', provider: 'openai', description: '16K context' },
  ],
  anthropic: [
    { id: 'claude-3-opus-20240229', name: 'Claude 3 Opus', provider: 'anthropic', description: '200K context' },
    { id: 'claude-3-sonnet-20240229', name: 'Claude 3 Sonnet', provider: 'anthropic', description: '200K context' },
    { id: 'claude-3-haiku-20240307', name: 'Claude 3 Haiku', provider: 'anthropic', description: '200K context' },
  ],
  google: [
    { id: 'gemini-pro', name: 'Gemini Pro', provider: 'google', description: '32K context' },
    { id: 'gemini-pro-vision', name: 'Gemini Pro Vision', provider: 'google', description: '16K context' },
  ],
  ollama: [
    { id: 'llama2', name: 'Llama 2', provider: 'ollama', description: 'Local model' },
    { id: 'mistral', name: 'Mistral', provider: 'ollama', description: 'Local model' },
  ],
  azure: [],
  mistral: [
    { id: 'mistral-large-latest', name: 'Mistral Large', provider: 'mistral', description: '32K context' },
    { id: 'mistral-medium-latest', name: 'Mistral Medium', provider: 'mistral', description: '32K context' },
  ],
  deepseek: [
    { id: 'deepseek-chat', name: 'DeepSeek Chat', provider: 'deepseek', description: '64K context' },
    { id: 'deepseek-coder', name: 'DeepSeek Coder', provider: 'deepseek', description: '64K context' },
  ],
  custom: [],
};

/**
 * Handle LiteLLM API routes
 * @returns true if route was handled, false otherwise
 */
export async function handleLiteLLMApiRoutes(ctx: RouteContext): Promise<boolean> {
  const { pathname, url, req, res, initialPath, handlePostRequest, broadcastToClients } = ctx;

  // ===========================
  // Provider Management Routes
  // ===========================

  // GET /api/litellm-api/providers - List all providers
  if (pathname === '/api/litellm-api/providers' && req.method === 'GET') {
    try {
      const providers = getAllProviders(initialPath);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ providers, count: providers.length }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // POST /api/litellm-api/providers - Create provider
  if (pathname === '/api/litellm-api/providers' && req.method === 'POST') {
    handlePostRequest(req, res, async (body: unknown) => {
      const providerData = body as Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>;

      if (!providerData.name || !providerData.type || !providerData.apiKey) {
        return { error: 'Provider name, type, and apiKey are required', status: 400 };
      }

      try {
        const provider = addProvider(initialPath, providerData);

        broadcastToClients({
          type: 'LITELLM_PROVIDER_CREATED',
          payload: { provider, timestamp: new Date().toISOString() }
        });

        return { success: true, provider };
      } catch (err) {
        return { error: (err as Error).message, status: 500 };
      }
    });
    return true;
  }

  // GET /api/litellm-api/providers/:id - Get provider by ID
  const providerGetMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
  if (providerGetMatch && req.method === 'GET') {
    const providerId = providerGetMatch[1];

    try {
      const provider = getProvider(initialPath, providerId);
      if (!provider) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'Provider not found' }));
        return true;
      }

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(provider));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // PUT /api/litellm-api/providers/:id - Update provider
  const providerUpdateMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
  if (providerUpdateMatch && req.method === 'PUT') {
    const providerId = providerUpdateMatch[1];

    handlePostRequest(req, res, async (body: unknown) => {
      const updates = body as Partial<Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>>;

      try {
        const provider = updateProvider(initialPath, providerId, updates);

        broadcastToClients({
          type: 'LITELLM_PROVIDER_UPDATED',
          payload: { provider, timestamp: new Date().toISOString() }
        });

        return { success: true, provider };
      } catch (err) {
        return { error: (err as Error).message, status: 404 };
      }
    });
    return true;
  }

  // DELETE /api/litellm-api/providers/:id - Delete provider
  const providerDeleteMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
  if (providerDeleteMatch && req.method === 'DELETE') {
    const providerId = providerDeleteMatch[1];

    try {
      const success = deleteProvider(initialPath, providerId);

      if (!success) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'Provider not found' }));
        return true;
      }

      broadcastToClients({
        type: 'LITELLM_PROVIDER_DELETED',
        payload: { providerId, timestamp: new Date().toISOString() }
      });

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: true, message: 'Provider deleted' }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // POST /api/litellm-api/providers/:id/test - Test provider connection
  const providerTestMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)\/test$/);
  if (providerTestMatch && req.method === 'POST') {
    const providerId = providerTestMatch[1];

    try {
      const provider = getProvider(initialPath, providerId);

      if (!provider) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ success: false, error: 'Provider not found' }));
        return true;
      }

      if (!provider.enabled) {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ success: false, error: 'Provider is disabled' }));
        return true;
      }

      // Test connection using litellm client
      const client = getLiteLLMClient();
      const available = await client.isAvailable();

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: available, provider: provider.type }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: false, error: (err as Error).message }));
    }
    return true;
  }

  // ===========================
  // Endpoint Management Routes
  // ===========================

  // GET /api/litellm-api/endpoints - List all endpoints
  if (pathname === '/api/litellm-api/endpoints' && req.method === 'GET') {
    try {
      const endpoints = getAllEndpoints(initialPath);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ endpoints, count: endpoints.length }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // POST /api/litellm-api/endpoints - Create endpoint
  if (pathname === '/api/litellm-api/endpoints' && req.method === 'POST') {
    handlePostRequest(req, res, async (body: unknown) => {
      const endpointData = body as Omit<CustomEndpoint, 'createdAt' | 'updatedAt'>;

      if (!endpointData.id || !endpointData.name || !endpointData.providerId || !endpointData.model) {
        return { error: 'Endpoint id, name, providerId, and model are required', status: 400 };
      }

      try {
        const endpoint = addEndpoint(initialPath, endpointData);

        broadcastToClients({
          type: 'LITELLM_ENDPOINT_CREATED',
          payload: { endpoint, timestamp: new Date().toISOString() }
        });

        return { success: true, endpoint };
      } catch (err) {
        return { error: (err as Error).message, status: 500 };
      }
    });
    return true;
  }

  // GET /api/litellm-api/endpoints/:id - Get endpoint by ID
  const endpointGetMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
  if (endpointGetMatch && req.method === 'GET') {
    const endpointId = endpointGetMatch[1];

    try {
      const endpoint = getEndpoint(initialPath, endpointId);
      if (!endpoint) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'Endpoint not found' }));
        return true;
      }

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(endpoint));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // PUT /api/litellm-api/endpoints/:id - Update endpoint
  const endpointUpdateMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
  if (endpointUpdateMatch && req.method === 'PUT') {
    const endpointId = endpointUpdateMatch[1];

    handlePostRequest(req, res, async (body: unknown) => {
      const updates = body as Partial<Omit<CustomEndpoint, 'id' | 'createdAt' | 'updatedAt'>>;

      try {
        const endpoint = updateEndpoint(initialPath, endpointId, updates);

        broadcastToClients({
          type: 'LITELLM_ENDPOINT_UPDATED',
          payload: { endpoint, timestamp: new Date().toISOString() }
        });

        return { success: true, endpoint };
      } catch (err) {
        return { error: (err as Error).message, status: 404 };
      }
    });
    return true;
  }

  // DELETE /api/litellm-api/endpoints/:id - Delete endpoint
  const endpointDeleteMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
  if (endpointDeleteMatch && req.method === 'DELETE') {
    const endpointId = endpointDeleteMatch[1];

    try {
      const success = deleteEndpoint(initialPath, endpointId);

      if (!success) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'Endpoint not found' }));
        return true;
      }

      broadcastToClients({
        type: 'LITELLM_ENDPOINT_DELETED',
        payload: { endpointId, timestamp: new Date().toISOString() }
      });

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: true, message: 'Endpoint deleted' }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // ===========================
  // Model Discovery Routes
  // ===========================

  // GET /api/litellm-api/models/:providerType - Get available models for provider type
  const modelsMatch = pathname.match(/^\/api\/litellm-api\/models\/([^/]+)$/);
  if (modelsMatch && req.method === 'GET') {
    const providerType = modelsMatch[1] as ProviderType;

    try {
      const models = PROVIDER_MODELS[providerType];

      if (!models) {
        res.writeHead(404, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'Provider type not found' }));
        return true;
      }

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ providerType, models, count: models.length }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // ===========================
  // Cache Management Routes
  // ===========================

  // GET /api/litellm-api/cache/stats - Get cache statistics
  if (pathname === '/api/litellm-api/cache/stats' && req.method === 'GET') {
    try {
      const cacheStore = getContextCacheStore();
      const stats = cacheStore.getStatus();

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(stats));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // POST /api/litellm-api/cache/clear - Clear cache
  if (pathname === '/api/litellm-api/cache/clear' && req.method === 'POST') {
    try {
      const cacheStore = getContextCacheStore();
      const result = cacheStore.clear();

      broadcastToClients({
        type: 'LITELLM_CACHE_CLEARED',
        payload: { removed: result.removed, timestamp: new Date().toISOString() }
      });

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: true, removed: result.removed }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // ===========================
  // Config Management Routes
  // ===========================

  // GET /api/litellm-api/config - Get full config
  if (pathname === '/api/litellm-api/config' && req.method === 'GET') {
    try {
      const config = loadLiteLLMApiConfig(initialPath);

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(config));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: (err as Error).message }));
    }
    return true;
  }

  // PUT /api/litellm-api/config/cache - Update global cache settings
  if (pathname === '/api/litellm-api/config/cache' && req.method === 'PUT') {
    handlePostRequest(req, res, async (body: unknown) => {
      const settings = body as Partial<{ enabled: boolean; cacheDir: string; maxTotalSizeMB: number }>;

      try {
        updateGlobalCacheSettings(initialPath, settings);

        const updatedSettings = getGlobalCacheSettings(initialPath);

        broadcastToClients({
          type: 'LITELLM_CACHE_SETTINGS_UPDATED',
          payload: { settings: updatedSettings, timestamp: new Date().toISOString() }
        });

        return { success: true, settings: updatedSettings };
      } catch (err) {
        return { error: (err as Error).message, status: 500 };
      }
    });
    return true;
  }

  // PUT /api/litellm-api/config/default-endpoint - Set default endpoint
  if (pathname === '/api/litellm-api/config/default-endpoint' && req.method === 'PUT') {
    handlePostRequest(req, res, async (body: unknown) => {
      const { endpointId } = body as { endpointId?: string };

      try {
        setDefaultEndpoint(initialPath, endpointId);

        const defaultEndpoint = getDefaultEndpoint(initialPath);

        broadcastToClients({
          type: 'LITELLM_DEFAULT_ENDPOINT_UPDATED',
          payload: { endpointId, defaultEndpoint, timestamp: new Date().toISOString() }
        });

        return { success: true, defaultEndpoint };
      } catch (err) {
        return { error: (err as Error).message, status: 500 };
      }
    });
    return true;
  }

  return false;
}
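// Illustrative client-side sketch (not part of the committed file): exercising
// the endpoint CRUD routes above with fetch. The host/port and payload values
// are assumptions.
//
//   const res = await fetch('http://localhost:3000/api/litellm-api/endpoints', {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify({
//       id: 'fast-gpt',
//       name: 'Fast GPT',
//       providerId: 'openai-main',
//       model: 'gpt-4o-mini',
//     }),
//   });
//   const { endpoint } = await res.json();
//   await fetch('http://localhost:3000/api/litellm-api/endpoints/fast-gpt', { method: 'DELETE' });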
107 ccw/src/core/routes/litellm-routes.ts Normal file
@@ -0,0 +1,107 @@
// @ts-nocheck
/**
 * LiteLLM Routes Module
 * Handles all LiteLLM-related API endpoints
 */
import type { IncomingMessage, ServerResponse } from 'http';
import { getLiteLLMClient, getLiteLLMStatus, checkLiteLLMAvailable } from '../../tools/litellm-client.js';

export interface RouteContext {
  pathname: string;
  url: URL;
  req: IncomingMessage;
  res: ServerResponse;
  initialPath: string;
  handlePostRequest: (req: IncomingMessage, res: ServerResponse, handler: (body: unknown) => Promise<any>) => void;
  broadcastToClients: (data: unknown) => void;
}

/**
 * Handle LiteLLM routes
 * @returns true if route was handled, false otherwise
 */
export async function handleLiteLLMRoutes(ctx: RouteContext): Promise<boolean> {
  const { pathname, url, req, res, initialPath, handlePostRequest } = ctx;

  // API: LiteLLM Status - Check availability and version
  if (pathname === '/api/litellm/status') {
    try {
      const status = await getLiteLLMStatus();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(status));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ available: false, error: err.message }));
    }
    return true;
  }

  // API: LiteLLM Config - Get configuration
  if (pathname === '/api/litellm/config' && req.method === 'GET') {
    try {
      const client = getLiteLLMClient();
      const config = await client.getConfig();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(config));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: err.message }));
    }
    return true;
  }

  // API: LiteLLM Embed - Generate embeddings
  if (pathname === '/api/litellm/embed' && req.method === 'POST') {
    handlePostRequest(req, res, async (body) => {
      const { texts, model = 'default' } = body;

      if (!texts || !Array.isArray(texts)) {
        return { error: 'texts array is required', status: 400 };
      }

      if (texts.length === 0) {
        return { error: 'texts array cannot be empty', status: 400 };
      }

      try {
        const client = getLiteLLMClient();
        const result = await client.embed(texts, model);
        return { success: true, ...result };
      } catch (err) {
        return { error: err.message, status: 500 };
      }
    });
    return true;
  }

  // API: LiteLLM Chat - Chat with LLM
  if (pathname === '/api/litellm/chat' && req.method === 'POST') {
    handlePostRequest(req, res, async (body) => {
      const { message, messages, model = 'default' } = body;

      // Support both single message and messages array
      if (!message && (!messages || !Array.isArray(messages))) {
        return { error: 'message or messages array is required', status: 400 };
      }

      try {
        const client = getLiteLLMClient();

        if (messages && Array.isArray(messages)) {
          // Multi-turn chat
          const result = await client.chatMessages(messages, model);
          return { success: true, ...result };
        } else {
          // Single message chat
          const content = await client.chat(message, model);
          return { success: true, content, model };
        }
      } catch (err) {
        return { error: err.message, status: 500 };
      }
    });
    return true;
  }

  return false;
}
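// Illustrative request payloads (assumed host/port) for the routes above:
//
//   // Single-turn chat
//   await fetch('http://localhost:3000/api/litellm/chat', {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify({ message: 'Hello', model: 'default' }),
//   });
//
//   // Embeddings
//   await fetch('http://localhost:3000/api/litellm/embed', {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify({ texts: ['Text 1', 'Text 2'], model: 'default' }),
//   });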
@@ -22,6 +22,8 @@ import { handleSessionRoutes } from './routes/session-routes.js';
 import { handleCcwRoutes } from './routes/ccw-routes.js';
 import { handleClaudeRoutes } from './routes/claude-routes.js';
 import { handleHelpRoutes } from './routes/help-routes.js';
+import { handleLiteLLMRoutes } from './routes/litellm-routes.js';
+import { handleLiteLLMApiRoutes } from './routes/litellm-api-routes.js';

 // Import WebSocket handling
 import { handleWebSocketUpgrade, broadcastToClients } from './websocket.js';
@@ -311,6 +313,16 @@ export async function startServer(options: ServerOptions = {}): Promise<http.Server>
     if (await handleCodexLensRoutes(routeContext)) return;
   }

+  // LiteLLM routes (/api/litellm/*)
+  if (pathname.startsWith('/api/litellm/')) {
+    if (await handleLiteLLMRoutes(routeContext)) return;
+  }
+
+  // LiteLLM API routes (/api/litellm-api/*)
+  if (pathname.startsWith('/api/litellm-api/')) {
+    if (await handleLiteLLMApiRoutes(routeContext)) return;
+  }
+
   // Graph routes (/api/graph/*)
   if (pathname.startsWith('/api/graph/')) {
     if (await handleGraphRoutes(routeContext)) return;
397 ccw/src/templates/dashboard-css/31-api-settings.css Normal file
@@ -0,0 +1,397 @@
/* ========================================
 * API Settings Styles
 * ======================================== */

/* Main Container */
.api-settings-container {
  display: flex;
  flex-direction: column;
  gap: 1.5rem;
  padding: 1rem;
}

/* Section Styles */
.api-settings-section {
  background: hsl(var(--card));
  border: 1px solid hsl(var(--border));
  border-radius: 0.75rem;
  padding: 1.25rem;
}

.section-header {
  display: flex;
  align-items: center;
  justify-content: space-between;
  margin-bottom: 1rem;
  padding-bottom: 0.75rem;
  border-bottom: 1px solid hsl(var(--border));
}

.section-header h3 {
  font-size: 1rem;
  font-weight: 600;
  color: hsl(var(--foreground));
  margin: 0;
}

/* Settings List */
.api-settings-list {
  display: flex;
  flex-direction: column;
  gap: 0.75rem;
}

/* Settings Card */
.api-settings-card {
  background: hsl(var(--background));
  border: 1px solid hsl(var(--border));
  border-radius: 0.5rem;
  padding: 1rem;
  transition: all 0.2s ease;
}

.api-settings-card:hover {
  border-color: hsl(var(--primary) / 0.3);
  box-shadow: 0 2px 8px hsl(var(--primary) / 0.1);
}

.api-settings-card.disabled {
  opacity: 0.6;
  background: hsl(var(--muted) / 0.3);
}

/* Card Header */
.card-header {
  display: flex;
  align-items: center;
  justify-content: space-between;
  margin-bottom: 0.75rem;
}

.card-info {
  display: flex;
  align-items: center;
  gap: 0.75rem;
  flex: 1;
}

.card-info h4 {
  font-size: 0.9375rem;
  font-weight: 600;
  color: hsl(var(--foreground));
  margin: 0;
}

.card-actions {
  display: flex;
  align-items: center;
  gap: 0.5rem;
}

/* Card Body */
.card-body {
  display: flex;
  flex-direction: column;
  gap: 0.75rem;
}

.card-meta {
  display: flex;
  flex-wrap: wrap;
  gap: 1rem;
  font-size: 0.8125rem;
  color: hsl(var(--muted-foreground));
}

.card-meta span {
  display: flex;
  align-items: center;
  gap: 0.375rem;
}

.card-meta i {
  font-size: 0.875rem;
}

/* Provider Type Badge */
.provider-type-badge {
  display: inline-flex;
  align-items: center;
  padding: 0.25rem 0.625rem;
  font-size: 0.6875rem;
  font-weight: 600;
  text-transform: uppercase;
  background: hsl(var(--primary) / 0.1);
  color: hsl(var(--primary));
  border-radius: 9999px;
  letter-spacing: 0.03em;
}

/* Endpoint ID */
.endpoint-id {
  font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
  font-size: 0.75rem;
  padding: 0.25rem 0.5rem;
  background: hsl(var(--muted) / 0.5);
  border-radius: 0.25rem;
  color: hsl(var(--primary));
}

/* Usage Hint */
.usage-hint {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  padding: 0.625rem 0.75rem;
  background: hsl(var(--muted) / 0.3);
  border-radius: 0.375rem;
  font-size: 0.75rem;
  color: hsl(var(--muted-foreground));
  margin-top: 0.375rem;
}

.usage-hint code {
  font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
  font-size: 0.6875rem;
  color: hsl(var(--foreground));
}

/* Status Badge */
.status-badge {
  display: inline-flex;
  align-items: center;
  padding: 0.25rem 0.625rem;
  font-size: 0.6875rem;
  font-weight: 600;
  border-radius: 9999px;
}

.status-badge.status-enabled {
  background: hsl(142 76% 36% / 0.1);
  color: hsl(142 76% 36%);
}

.status-badge.status-disabled {
  background: hsl(var(--muted) / 0.5);
  color: hsl(var(--muted-foreground));
}

/* Empty State */
.empty-state {
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
  padding: 2.5rem 1rem;
  text-align: center;
  color: hsl(var(--muted-foreground));
}

.empty-icon {
  font-size: 3rem;
  opacity: 0.3;
  margin-bottom: 0.75rem;
}

.empty-state p {
  font-size: 0.875rem;
  margin: 0;
}

/* Cache Settings Panel */
.cache-settings-panel {
  padding: 1rem;
}

.cache-settings-content {
  display: flex;
  flex-direction: column;
  gap: 1rem;
}

.cache-stats {
  display: flex;
  flex-direction: column;
  gap: 0.75rem;
  padding: 1rem;
  background: hsl(var(--muted) / 0.3);
  border-radius: 0.5rem;
}

.stat-item {
  display: flex;
  justify-content: space-between;
  align-items: center;
  font-size: 0.8125rem;
}

.stat-label {
  color: hsl(var(--muted-foreground));
  font-weight: 500;
}

.stat-value {
  color: hsl(var(--foreground));
  font-weight: 600;
  font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
}

/* Progress Bar */
.progress-bar {
  width: 100%;
  height: 8px;
  background: hsl(var(--muted) / 0.5);
  border-radius: 9999px;
  overflow: hidden;
}

.progress-fill {
  height: 100%;
  background: hsl(var(--primary));
  border-radius: 9999px;
  transition: width 0.3s ease;
}

/* ========================================
 * Form Styles
 * ======================================== */

.api-settings-form {
  display: flex;
  flex-direction: column;
  gap: 1rem;
}

.form-group {
  display: flex;
  flex-direction: column;
  gap: 0.5rem;
}

.form-group label {
  font-size: 0.8125rem;
  font-weight: 500;
  color: hsl(var(--foreground));
}

.form-hint {
  font-size: 0.75rem;
  color: hsl(var(--muted-foreground));
  font-style: italic;
}

.text-muted {
  color: hsl(var(--muted-foreground));
  font-weight: 400;
}

/* API Key Input Group */
.api-key-input-group {
  display: flex;
  gap: 0.5rem;
}

.api-key-input-group input {
  flex: 1;
}

.api-key-input-group .btn-icon {
  flex-shrink: 0;
}

/* Checkbox Label */
.checkbox-label {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  font-size: 0.8125rem;
  color: hsl(var(--foreground));
  cursor: pointer;
}

.checkbox-label input[type="checkbox"] {
  width: 1rem;
  height: 1rem;
  cursor: pointer;
}

/* Fieldset */
.form-fieldset {
  border: 1px solid hsl(var(--border));
  border-radius: 0.5rem;
  padding: 1rem;
  margin: 0;
}

.form-fieldset legend {
  font-size: 0.875rem;
  font-weight: 600;
  color: hsl(var(--foreground));
  padding: 0 0.5rem;
}

/* Modal Actions */
.modal-actions {
  display: flex;
  gap: 0.75rem;
  justify-content: flex-end;
  margin-top: 1rem;
  padding-top: 1rem;
  border-top: 1px solid hsl(var(--border));
}

/* ========================================
 * Responsive Design
 * ======================================== */

@media (min-width: 768px) {
  .api-settings-container {
    padding: 1.5rem;
  }

  .card-meta {
    gap: 1.5rem;
  }
}

@media (max-width: 640px) {
  .section-header {
    flex-direction: column;
    align-items: flex-start;
    gap: 0.75rem;
  }

  .card-header {
    flex-direction: column;
    align-items: flex-start;
    gap: 0.75rem;
  }

  .card-actions {
    align-self: flex-end;
  }

  .card-meta {
    flex-direction: column;
    gap: 0.5rem;
  }

  .modal-actions {
    flex-direction: column;
  }

  .modal-actions .btn {
    width: 100%;
  }
}

/* Error Message */
.error-message {
  display: flex;
  align-items: center;
  justify-content: center;
  padding: 2rem;
  font-size: 0.875rem;
  color: hsl(var(--destructive));
  text-align: center;
}
@@ -149,6 +149,12 @@ function initNavigation() {
       } else {
         console.error('renderCodexLensManager not defined - please refresh the page');
       }
+    } else if (currentView === 'api-settings') {
+      if (typeof renderApiSettings === 'function') {
+        renderApiSettings();
+      } else {
+        console.error('renderApiSettings not defined - please refresh the page');
+      }
     }
   });
 });
@@ -191,6 +197,8 @@ function updateContentTitle() {
     titleEl.textContent = t('title.coreMemory');
   } else if (currentView === 'codexlens-manager') {
     titleEl.textContent = t('title.codexLensManager');
+  } else if (currentView === 'api-settings') {
+    titleEl.textContent = t('title.apiSettings');
   } else if (currentView === 'liteTasks') {
     const names = { 'lite-plan': t('title.litePlanSessions'), 'lite-fix': t('title.liteFixSessions') };
     titleEl.textContent = names[currentLiteType] || t('title.liteTasks');
@@ -1331,6 +1331,62 @@ const i18n = {
     'claude.unsupportedFileType': 'Unsupported file type',
     'claude.loadFileError': 'Failed to load file',

+    // API Settings
+    'nav.apiSettings': 'API Settings',
+    'title.apiSettings': 'API Settings',
+    'apiSettings.providers': 'Providers',
+    'apiSettings.customEndpoints': 'Custom Endpoints',
+    'apiSettings.cacheSettings': 'Cache Settings',
+    'apiSettings.addProvider': 'Add Provider',
+    'apiSettings.editProvider': 'Edit Provider',
+    'apiSettings.deleteProvider': 'Delete Provider',
+    'apiSettings.addEndpoint': 'Add Endpoint',
+    'apiSettings.editEndpoint': 'Edit Endpoint',
+    'apiSettings.deleteEndpoint': 'Delete Endpoint',
+    'apiSettings.providerType': 'Provider Type',
+    'apiSettings.displayName': 'Display Name',
+    'apiSettings.apiKey': 'API Key',
+    'apiSettings.apiBaseUrl': 'API Base URL',
+    'apiSettings.useEnvVar': 'Use environment variable',
+    'apiSettings.enableProvider': 'Enable provider',
+    'apiSettings.testConnection': 'Test Connection',
+    'apiSettings.endpointId': 'Endpoint ID',
+    'apiSettings.endpointIdHint': 'Usage: ccw cli -p "..." --model <endpoint-id>',
+    'apiSettings.provider': 'Provider',
+    'apiSettings.model': 'Model',
+    'apiSettings.selectModel': 'Select model',
+    'apiSettings.cacheStrategy': 'Cache Strategy',
+    'apiSettings.enableContextCaching': 'Enable Context Caching',
+    'apiSettings.cacheTTL': 'TTL (minutes)',
+    'apiSettings.cacheMaxSize': 'Max Size (KB)',
+    'apiSettings.autoCachePatterns': 'Auto-cache file patterns',
+    'apiSettings.enableGlobalCaching': 'Enable Global Caching',
+    'apiSettings.cacheUsed': 'Used',
+    'apiSettings.cacheEntries': 'Entries',
+    'apiSettings.clearCache': 'Clear Cache',
+    'apiSettings.noProviders': 'No providers configured',
+    'apiSettings.noEndpoints': 'No endpoints configured',
+    'apiSettings.enabled': 'Enabled',
+    'apiSettings.disabled': 'Disabled',
+    'apiSettings.cacheEnabled': 'Cache Enabled',
+    'apiSettings.cacheDisabled': 'Cache Disabled',
+    'apiSettings.providerSaved': 'Provider saved successfully',
+    'apiSettings.providerDeleted': 'Provider deleted successfully',
+    'apiSettings.endpointSaved': 'Endpoint saved successfully',
+    'apiSettings.endpointDeleted': 'Endpoint deleted successfully',
+    'apiSettings.cacheCleared': 'Cache cleared successfully',
+    'apiSettings.cacheSettingsUpdated': 'Cache settings updated',
+    'apiSettings.confirmDeleteProvider': 'Are you sure you want to delete this provider?',
+    'apiSettings.confirmDeleteEndpoint': 'Are you sure you want to delete this endpoint?',
+    'apiSettings.confirmClearCache': 'Are you sure you want to clear the cache?',
+    'apiSettings.connectionSuccess': 'Connection successful',
+    'apiSettings.connectionFailed': 'Connection failed',
+    'apiSettings.saveProviderFirst': 'Please save the provider first',
+    'apiSettings.addProviderFirst': 'Please add a provider first',
+    'apiSettings.failedToLoad': 'Failed to load API settings',
+    'apiSettings.toggleVisibility': 'Toggle visibility',
+
     // Common
     'common.cancel': 'Cancel',
     'common.optional': '(Optional)',
@@ -2799,6 +2855,62 @@ const i18n = {
     'claudeManager.saved': 'File saved successfully',
     'claudeManager.saveError': 'Failed to save file',

+    // API Settings
+    'nav.apiSettings': 'API 设置',
+    'title.apiSettings': 'API 设置',
+    'apiSettings.providers': '提供商',
+    'apiSettings.customEndpoints': '自定义端点',
+    'apiSettings.cacheSettings': '缓存设置',
+    'apiSettings.addProvider': '添加提供商',
+    'apiSettings.editProvider': '编辑提供商',
+    'apiSettings.deleteProvider': '删除提供商',
+    'apiSettings.addEndpoint': '添加端点',
+    'apiSettings.editEndpoint': '编辑端点',
+    'apiSettings.deleteEndpoint': '删除端点',
+    'apiSettings.providerType': '提供商类型',
+    'apiSettings.displayName': '显示名称',
+    'apiSettings.apiKey': 'API 密钥',
+    'apiSettings.apiBaseUrl': 'API 基础 URL',
+    'apiSettings.useEnvVar': '使用环境变量',
+    'apiSettings.enableProvider': '启用提供商',
+    'apiSettings.testConnection': '测试连接',
+    'apiSettings.endpointId': '端点 ID',
+    'apiSettings.endpointIdHint': '用法: ccw cli -p "..." --model <端点ID>',
+    'apiSettings.provider': '提供商',
+    'apiSettings.model': '模型',
+    'apiSettings.selectModel': '选择模型',
+    'apiSettings.cacheStrategy': '缓存策略',
+    'apiSettings.enableContextCaching': '启用上下文缓存',
+    'apiSettings.cacheTTL': 'TTL (分钟)',
+    'apiSettings.cacheMaxSize': '最大大小 (KB)',
+    'apiSettings.autoCachePatterns': '自动缓存文件模式',
+    'apiSettings.enableGlobalCaching': '启用全局缓存',
+    'apiSettings.cacheUsed': '已使用',
+    'apiSettings.cacheEntries': '条目数',
+    'apiSettings.clearCache': '清除缓存',
+    'apiSettings.noProviders': '未配置提供商',
+    'apiSettings.noEndpoints': '未配置端点',
+    'apiSettings.enabled': '已启用',
+    'apiSettings.disabled': '已禁用',
+    'apiSettings.cacheEnabled': '缓存已启用',
+    'apiSettings.cacheDisabled': '缓存已禁用',
+    'apiSettings.providerSaved': '提供商保存成功',
+    'apiSettings.providerDeleted': '提供商删除成功',
+    'apiSettings.endpointSaved': '端点保存成功',
+    'apiSettings.endpointDeleted': '端点删除成功',
+    'apiSettings.cacheCleared': '缓存清除成功',
+    'apiSettings.cacheSettingsUpdated': '缓存设置已更新',
+    'apiSettings.confirmDeleteProvider': '确定要删除此提供商吗?',
+    'apiSettings.confirmDeleteEndpoint': '确定要删除此端点吗?',
+    'apiSettings.confirmClearCache': '确定要清除缓存吗?',
+    'apiSettings.connectionSuccess': '连接成功',
+    'apiSettings.connectionFailed': '连接失败',
+    'apiSettings.saveProviderFirst': '请先保存提供商',
+    'apiSettings.addProviderFirst': '请先添加提供商',
+    'apiSettings.failedToLoad': '加载 API 设置失败',
+    'apiSettings.toggleVisibility': '切换可见性',
+
     // Common
     'common.cancel': '取消',
     'common.optional': '(可选)',
815 ccw/src/templates/dashboard-js/views/api-settings.js Normal file
@@ -0,0 +1,815 @@
// API Settings View
// Manages LiteLLM API providers, custom endpoints, and cache settings

// ========== State Management ==========
var apiSettingsData = null;
var providerModels = {};
var currentModal = null;

// ========== Data Loading ==========

/**
 * Load API configuration
 */
async function loadApiSettings() {
  try {
    var response = await fetch('/api/litellm-api/config');
    if (!response.ok) throw new Error('Failed to load API settings');
    apiSettingsData = await response.json();
    return apiSettingsData;
  } catch (err) {
    console.error('Failed to load API settings:', err);
    showRefreshToast(t('common.error') + ': ' + err.message, 'error');
    return null;
  }
}

/**
 * Load available models for a provider type
 */
async function loadProviderModels(providerType) {
  try {
    var response = await fetch('/api/litellm-api/models/' + providerType);
    if (!response.ok) throw new Error('Failed to load models');
    var data = await response.json();
    providerModels[providerType] = data.models || [];
    return data.models;
  } catch (err) {
    console.error('Failed to load provider models:', err);
    return [];
  }
}

/**
 * Load cache statistics
 */
async function loadCacheStats() {
  try {
    var response = await fetch('/api/litellm-api/cache/stats');
    if (!response.ok) throw new Error('Failed to load cache stats');
    return await response.json();
  } catch (err) {
    console.error('Failed to load cache stats:', err);
    return { enabled: false, totalSize: 0, maxSize: 104857600, entries: 0 };
  }
}

// ========== Provider Management ==========

/**
 * Show add provider modal
 */
async function showAddProviderModal() {
  var modalHtml = '<div class="generic-modal-overlay active" id="providerModal">' +
    '<div class="generic-modal">' +
    '<div class="generic-modal-header">' +
    '<h3 class="generic-modal-title">' + t('apiSettings.addProvider') + '</h3>' +
    '<button class="generic-modal-close" onclick="closeProviderModal()">×</button>' +
    '</div>' +
    '<div class="generic-modal-body">' +
    '<form id="providerForm" class="api-settings-form">' +
    '<div class="form-group">' +
    '<label for="provider-type">' + t('apiSettings.providerType') + '</label>' +
    '<select id="provider-type" class="cli-input" required>' +
    '<option value="openai">OpenAI</option>' +
    '<option value="anthropic">Anthropic</option>' +
    '<option value="google">Google</option>' +
    '<option value="ollama">Ollama</option>' +
    '<option value="azure">Azure</option>' +
    '<option value="mistral">Mistral AI</option>' +
    '<option value="deepseek">DeepSeek</option>' +
    '<option value="custom">Custom</option>' +
    '</select>' +
    '</div>' +
    '<div class="form-group">' +
    '<label for="provider-name">' + t('apiSettings.displayName') + '</label>' +
    '<input type="text" id="provider-name" class="cli-input" placeholder="My OpenAI" required />' +
    '</div>' +
    '<div class="form-group">' +
    '<label for="provider-apikey">' + t('apiSettings.apiKey') + '</label>' +
    '<div class="api-key-input-group">' +
    '<input type="password" id="provider-apikey" class="cli-input" placeholder="sk-..." required />' +
|
||||||
|
'<button type="button" class="btn-icon" onclick="toggleApiKeyVisibility(\'provider-apikey\')" title="' + t('apiSettings.toggleVisibility') + '">' +
|
||||||
|
'<i data-lucide="eye"></i>' +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'<label class="checkbox-label">' +
|
||||||
|
'<input type="checkbox" id="use-env-var" onchange="toggleEnvVarInput()" /> ' +
|
||||||
|
t('apiSettings.useEnvVar') +
|
||||||
|
'</label>' +
|
||||||
|
'<input type="text" id="env-var-name" class="cli-input" placeholder="OPENAI_API_KEY" style="display:none; margin-top: 0.5rem;" />' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="provider-apibase">' + t('apiSettings.apiBaseUrl') + ' <span class="text-muted">(' + t('common.optional') + ')</span></label>' +
|
||||||
|
'<input type="text" id="provider-apibase" class="cli-input" placeholder="https://api.openai.com/v1" />' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label class="checkbox-label">' +
|
||||||
|
'<input type="checkbox" id="provider-enabled" checked /> ' +
|
||||||
|
t('apiSettings.enableProvider') +
|
||||||
|
'</label>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="modal-actions">' +
|
||||||
|
'<button type="button" class="btn btn-secondary" onclick="testProviderConnection()">' +
|
||||||
|
'<i data-lucide="wifi"></i> ' + t('apiSettings.testConnection') +
|
||||||
|
'</button>' +
|
||||||
|
'<button type="button" class="btn btn-secondary" onclick="closeProviderModal()">' + t('common.cancel') + '</button>' +
|
||||||
|
'<button type="submit" class="btn btn-primary">' +
|
||||||
|
'<i data-lucide="save"></i> ' + t('common.save') +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'</form>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>';
|
||||||
|
|
||||||
|
document.body.insertAdjacentHTML('beforeend', modalHtml);
|
||||||
|
|
||||||
|
document.getElementById('providerForm').addEventListener('submit', async function(e) {
|
||||||
|
e.preventDefault();
|
||||||
|
await saveProvider();
|
||||||
|
});
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Show edit provider modal
|
||||||
|
*/
|
||||||
|
async function showEditProviderModal(providerId) {
|
||||||
|
if (!apiSettingsData) return;
|
||||||
|
|
||||||
|
var provider = apiSettingsData.providers?.find(function(p) { return p.id === providerId; });
|
||||||
|
if (!provider) return;
|
||||||
|
|
||||||
|
await showAddProviderModal();
|
||||||
|
|
||||||
|
// Update modal title
|
||||||
|
document.querySelector('#providerModal .generic-modal-title').textContent = t('apiSettings.editProvider');
|
||||||
|
|
||||||
|
// Populate form
|
||||||
|
document.getElementById('provider-type').value = provider.type;
|
||||||
|
document.getElementById('provider-name').value = provider.name;
|
||||||
|
document.getElementById('provider-apikey').value = provider.apiKey;
|
||||||
|
if (provider.apiBase) {
|
||||||
|
document.getElementById('provider-apibase').value = provider.apiBase;
|
||||||
|
}
|
||||||
|
document.getElementById('provider-enabled').checked = provider.enabled !== false;
|
||||||
|
|
||||||
|
// Store provider ID for update
|
||||||
|
document.getElementById('providerForm').dataset.providerId = providerId;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Save provider (create or update)
|
||||||
|
*/
|
||||||
|
async function saveProvider() {
|
||||||
|
var form = document.getElementById('providerForm');
|
||||||
|
var providerId = form.dataset.providerId;
|
||||||
|
|
||||||
|
var useEnvVar = document.getElementById('use-env-var').checked;
|
||||||
|
var apiKey = useEnvVar
|
||||||
|
? '${' + document.getElementById('env-var-name').value + '}'
|
||||||
|
: document.getElementById('provider-apikey').value;
|
||||||
|
|
||||||
|
var providerData = {
|
||||||
|
type: document.getElementById('provider-type').value,
|
||||||
|
name: document.getElementById('provider-name').value,
|
||||||
|
apiKey: apiKey,
|
||||||
|
apiBase: document.getElementById('provider-apibase').value || undefined,
|
||||||
|
enabled: document.getElementById('provider-enabled').checked
|
||||||
|
};
|
||||||
|
|
||||||
|
try {
|
||||||
|
var url = providerId
|
||||||
|
? '/api/litellm-api/providers/' + providerId
|
||||||
|
: '/api/litellm-api/providers';
|
||||||
|
var method = providerId ? 'PUT' : 'POST';
|
||||||
|
|
||||||
|
var response = await fetch(url, {
|
||||||
|
method: method,
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(providerData)
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to save provider');
|
||||||
|
|
||||||
|
var result = await response.json();
|
||||||
|
showRefreshToast(t('apiSettings.providerSaved'), 'success');
|
||||||
|
|
||||||
|
closeProviderModal();
|
||||||
|
await renderApiSettings();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to save provider:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Delete provider
|
||||||
|
*/
|
||||||
|
async function deleteProvider(providerId) {
|
||||||
|
if (!confirm(t('apiSettings.confirmDeleteProvider'))) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
var response = await fetch('/api/litellm-api/providers/' + providerId, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to delete provider');
|
||||||
|
|
||||||
|
showRefreshToast(t('apiSettings.providerDeleted'), 'success');
|
||||||
|
await renderApiSettings();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to delete provider:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Test provider connection
|
||||||
|
*/
|
||||||
|
async function testProviderConnection() {
|
||||||
|
var form = document.getElementById('providerForm');
|
||||||
|
var providerId = form.dataset.providerId;
|
||||||
|
|
||||||
|
if (!providerId) {
|
||||||
|
showRefreshToast(t('apiSettings.saveProviderFirst'), 'warning');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
var response = await fetch('/api/litellm-api/providers/' + providerId + '/test', {
|
||||||
|
method: 'POST'
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to test provider');
|
||||||
|
|
||||||
|
var result = await response.json();
|
||||||
|
|
||||||
|
if (result.success) {
|
||||||
|
showRefreshToast(t('apiSettings.connectionSuccess'), 'success');
|
||||||
|
} else {
|
||||||
|
showRefreshToast(t('apiSettings.connectionFailed') + ': ' + (result.error || 'Unknown error'), 'error');
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to test provider:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Close provider modal
|
||||||
|
*/
|
||||||
|
function closeProviderModal() {
|
||||||
|
var modal = document.getElementById('providerModal');
|
||||||
|
if (modal) modal.remove();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Toggle API key visibility
|
||||||
|
*/
|
||||||
|
function toggleApiKeyVisibility(inputId) {
|
||||||
|
var input = document.getElementById(inputId);
|
||||||
|
var icon = event.target.closest('button').querySelector('i');
|
||||||
|
|
||||||
|
if (input.type === 'password') {
|
||||||
|
input.type = 'text';
|
||||||
|
icon.setAttribute('data-lucide', 'eye-off');
|
||||||
|
} else {
|
||||||
|
input.type = 'password';
|
||||||
|
icon.setAttribute('data-lucide', 'eye');
|
||||||
|
}
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Toggle environment variable input
|
||||||
|
*/
|
||||||
|
function toggleEnvVarInput() {
|
||||||
|
var useEnvVar = document.getElementById('use-env-var').checked;
|
||||||
|
var apiKeyInput = document.getElementById('provider-apikey');
|
||||||
|
var envVarInput = document.getElementById('env-var-name');
|
||||||
|
|
||||||
|
if (useEnvVar) {
|
||||||
|
apiKeyInput.style.display = 'none';
|
||||||
|
apiKeyInput.required = false;
|
||||||
|
envVarInput.style.display = 'block';
|
||||||
|
envVarInput.required = true;
|
||||||
|
} else {
|
||||||
|
apiKeyInput.style.display = 'block';
|
||||||
|
apiKeyInput.required = true;
|
||||||
|
envVarInput.style.display = 'none';
|
||||||
|
envVarInput.required = false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== Endpoint Management ==========
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Show add endpoint modal
|
||||||
|
*/
|
||||||
|
async function showAddEndpointModal() {
|
||||||
|
if (!apiSettingsData || !apiSettingsData.providers || apiSettingsData.providers.length === 0) {
|
||||||
|
showRefreshToast(t('apiSettings.addProviderFirst'), 'warning');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
var providerOptions = apiSettingsData.providers
|
||||||
|
.filter(function(p) { return p.enabled !== false; })
|
||||||
|
.map(function(p) {
|
||||||
|
return '<option value="' + p.id + '">' + p.name + ' (' + p.type + ')</option>';
|
||||||
|
})
|
||||||
|
.join('');
|
||||||
|
|
||||||
|
var modalHtml = '<div class="generic-modal-overlay active" id="endpointModal">' +
|
||||||
|
'<div class="generic-modal">' +
|
||||||
|
'<div class="generic-modal-header">' +
|
||||||
|
'<h3 class="generic-modal-title">' + t('apiSettings.addEndpoint') + '</h3>' +
|
||||||
|
'<button class="generic-modal-close" onclick="closeEndpointModal()">×</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="generic-modal-body">' +
|
||||||
|
'<form id="endpointForm" class="api-settings-form">' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="endpoint-id">' + t('apiSettings.endpointId') + '</label>' +
|
||||||
|
'<input type="text" id="endpoint-id" class="cli-input" placeholder="my-gpt4o" required />' +
|
||||||
|
'<small class="form-hint">' + t('apiSettings.endpointIdHint') + '</small>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="endpoint-name">' + t('apiSettings.displayName') + '</label>' +
|
||||||
|
'<input type="text" id="endpoint-name" class="cli-input" placeholder="GPT-4o for Code Review" required />' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="endpoint-provider">' + t('apiSettings.provider') + '</label>' +
|
||||||
|
'<select id="endpoint-provider" class="cli-input" onchange="loadModelsForProvider()" required>' +
|
||||||
|
providerOptions +
|
||||||
|
'</select>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="endpoint-model">' + t('apiSettings.model') + '</label>' +
|
||||||
|
'<select id="endpoint-model" class="cli-input" required>' +
|
||||||
|
'<option value="">' + t('apiSettings.selectModel') + '</option>' +
|
||||||
|
'</select>' +
|
||||||
|
'</div>' +
|
||||||
|
'<fieldset class="form-fieldset">' +
|
||||||
|
'<legend>' + t('apiSettings.cacheStrategy') + '</legend>' +
|
||||||
|
'<label class="checkbox-label">' +
|
||||||
|
'<input type="checkbox" id="cache-enabled" onchange="toggleCacheSettings()" /> ' +
|
||||||
|
t('apiSettings.enableContextCaching') +
|
||||||
|
'</label>' +
|
||||||
|
'<div id="cache-settings" style="display:none;">' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="cache-ttl">' + t('apiSettings.cacheTTL') + '</label>' +
|
||||||
|
'<input type="number" id="cache-ttl" class="cli-input" value="60" min="1" />' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="cache-maxsize">' + t('apiSettings.cacheMaxSize') + '</label>' +
|
||||||
|
'<input type="number" id="cache-maxsize" class="cli-input" value="512" min="1" />' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="form-group">' +
|
||||||
|
'<label for="cache-patterns">' + t('apiSettings.autoCachePatterns') + '</label>' +
|
||||||
|
'<input type="text" id="cache-patterns" class="cli-input" placeholder="*.ts, *.md, CLAUDE.md" />' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</fieldset>' +
|
||||||
|
'<div class="modal-actions">' +
|
||||||
|
'<button type="button" class="btn btn-secondary" onclick="closeEndpointModal()">' + t('common.cancel') + '</button>' +
|
||||||
|
'<button type="submit" class="btn btn-primary">' +
|
||||||
|
'<i data-lucide="save"></i> ' + t('common.save') +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'</form>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>';
|
||||||
|
|
||||||
|
document.body.insertAdjacentHTML('beforeend', modalHtml);
|
||||||
|
|
||||||
|
document.getElementById('endpointForm').addEventListener('submit', async function(e) {
|
||||||
|
e.preventDefault();
|
||||||
|
await saveEndpoint();
|
||||||
|
});
|
||||||
|
|
||||||
|
// Load models for first provider
|
||||||
|
await loadModelsForProvider();
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Show edit endpoint modal
|
||||||
|
*/
|
||||||
|
async function showEditEndpointModal(endpointId) {
|
||||||
|
if (!apiSettingsData) return;
|
||||||
|
|
||||||
|
var endpoint = apiSettingsData.endpoints?.find(function(e) { return e.id === endpointId; });
|
||||||
|
if (!endpoint) return;
|
||||||
|
|
||||||
|
await showAddEndpointModal();
|
||||||
|
|
||||||
|
// Update modal title
|
||||||
|
document.querySelector('#endpointModal .generic-modal-title').textContent = t('apiSettings.editEndpoint');
|
||||||
|
|
||||||
|
// Populate form
|
||||||
|
document.getElementById('endpoint-id').value = endpoint.id;
|
||||||
|
document.getElementById('endpoint-id').disabled = true;
|
||||||
|
document.getElementById('endpoint-name').value = endpoint.name;
|
||||||
|
document.getElementById('endpoint-provider').value = endpoint.providerId;
|
||||||
|
|
||||||
|
await loadModelsForProvider();
|
||||||
|
document.getElementById('endpoint-model').value = endpoint.model;
|
||||||
|
|
||||||
|
if (endpoint.cacheStrategy) {
|
||||||
|
document.getElementById('cache-enabled').checked = endpoint.cacheStrategy.enabled;
|
||||||
|
if (endpoint.cacheStrategy.enabled) {
|
||||||
|
toggleCacheSettings();
|
||||||
|
document.getElementById('cache-ttl').value = endpoint.cacheStrategy.ttlMinutes || 60;
|
||||||
|
document.getElementById('cache-maxsize').value = endpoint.cacheStrategy.maxSizeKB || 512;
|
||||||
|
document.getElementById('cache-patterns').value = endpoint.cacheStrategy.autoCachePatterns?.join(', ') || '';
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Store endpoint ID for update
|
||||||
|
document.getElementById('endpointForm').dataset.endpointId = endpointId;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Save endpoint (create or update)
|
||||||
|
*/
|
||||||
|
async function saveEndpoint() {
|
||||||
|
var form = document.getElementById('endpointForm');
|
||||||
|
var endpointId = form.dataset.endpointId || document.getElementById('endpoint-id').value;
|
||||||
|
|
||||||
|
var cacheEnabled = document.getElementById('cache-enabled').checked;
|
||||||
|
var cacheStrategy = cacheEnabled ? {
|
||||||
|
enabled: true,
|
||||||
|
ttlMinutes: parseInt(document.getElementById('cache-ttl').value) || 60,
|
||||||
|
maxSizeKB: parseInt(document.getElementById('cache-maxsize').value) || 512,
|
||||||
|
autoCachePatterns: document.getElementById('cache-patterns').value
|
||||||
|
.split(',')
|
||||||
|
.map(function(p) { return p.trim(); })
|
||||||
|
.filter(function(p) { return p; })
|
||||||
|
} : { enabled: false };
|
||||||
|
|
||||||
|
var endpointData = {
|
||||||
|
id: endpointId,
|
||||||
|
name: document.getElementById('endpoint-name').value,
|
||||||
|
providerId: document.getElementById('endpoint-provider').value,
|
||||||
|
model: document.getElementById('endpoint-model').value,
|
||||||
|
cacheStrategy: cacheStrategy
|
||||||
|
};
|
||||||
|
|
||||||
|
try {
|
||||||
|
var url = form.dataset.endpointId
|
||||||
|
? '/api/litellm-api/endpoints/' + form.dataset.endpointId
|
||||||
|
: '/api/litellm-api/endpoints';
|
||||||
|
var method = form.dataset.endpointId ? 'PUT' : 'POST';
|
||||||
|
|
||||||
|
var response = await fetch(url, {
|
||||||
|
method: method,
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(endpointData)
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to save endpoint');
|
||||||
|
|
||||||
|
var result = await response.json();
|
||||||
|
showRefreshToast(t('apiSettings.endpointSaved'), 'success');
|
||||||
|
|
||||||
|
closeEndpointModal();
|
||||||
|
await renderApiSettings();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to save endpoint:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Delete endpoint
|
||||||
|
*/
|
||||||
|
async function deleteEndpoint(endpointId) {
|
||||||
|
if (!confirm(t('apiSettings.confirmDeleteEndpoint'))) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
var response = await fetch('/api/litellm-api/endpoints/' + endpointId, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to delete endpoint');
|
||||||
|
|
||||||
|
showRefreshToast(t('apiSettings.endpointDeleted'), 'success');
|
||||||
|
await renderApiSettings();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to delete endpoint:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Close endpoint modal
|
||||||
|
*/
|
||||||
|
function closeEndpointModal() {
|
||||||
|
var modal = document.getElementById('endpointModal');
|
||||||
|
if (modal) modal.remove();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Load models for selected provider
|
||||||
|
*/
|
||||||
|
async function loadModelsForProvider() {
|
||||||
|
var providerSelect = document.getElementById('endpoint-provider');
|
||||||
|
var modelSelect = document.getElementById('endpoint-model');
|
||||||
|
|
||||||
|
if (!providerSelect || !modelSelect) return;
|
||||||
|
|
||||||
|
var providerId = providerSelect.value;
|
||||||
|
var provider = apiSettingsData.providers.find(function(p) { return p.id === providerId; });
|
||||||
|
|
||||||
|
if (!provider) return;
|
||||||
|
|
||||||
|
// Load models for provider type
|
||||||
|
var models = await loadProviderModels(provider.type);
|
||||||
|
|
||||||
|
modelSelect.innerHTML = '<option value="">' + t('apiSettings.selectModel') + '</option>' +
|
||||||
|
models.map(function(m) {
|
||||||
|
var desc = m.description ? ' - ' + m.description : '';
|
||||||
|
return '<option value="' + m.id + '">' + m.name + desc + '</option>';
|
||||||
|
}).join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Toggle cache settings visibility
|
||||||
|
*/
|
||||||
|
function toggleCacheSettings() {
|
||||||
|
var enabled = document.getElementById('cache-enabled').checked;
|
||||||
|
var settings = document.getElementById('cache-settings');
|
||||||
|
settings.style.display = enabled ? 'block' : 'none';
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== Cache Management ==========
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Clear cache
|
||||||
|
*/
|
||||||
|
async function clearCache() {
|
||||||
|
if (!confirm(t('apiSettings.confirmClearCache'))) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
var response = await fetch('/api/litellm-api/cache/clear', {
|
||||||
|
method: 'POST'
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to clear cache');
|
||||||
|
|
||||||
|
var result = await response.json();
|
||||||
|
showRefreshToast(t('apiSettings.cacheCleared') + ' (' + result.removed + ' entries)', 'success');
|
||||||
|
|
||||||
|
await renderApiSettings();
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to clear cache:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Toggle global cache
|
||||||
|
*/
|
||||||
|
async function toggleGlobalCache() {
|
||||||
|
var enabled = document.getElementById('global-cache-enabled').checked;
|
||||||
|
|
||||||
|
try {
|
||||||
|
var response = await fetch('/api/litellm-api/config/cache', {
|
||||||
|
method: 'PUT',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify({ enabled: enabled })
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!response.ok) throw new Error('Failed to update cache settings');
|
||||||
|
|
||||||
|
showRefreshToast(t('apiSettings.cacheSettingsUpdated'), 'success');
|
||||||
|
} catch (err) {
|
||||||
|
console.error('Failed to update cache settings:', err);
|
||||||
|
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||||
|
// Revert checkbox
|
||||||
|
document.getElementById('global-cache-enabled').checked = !enabled;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== Rendering ==========
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Render API Settings page
|
||||||
|
*/
|
||||||
|
async function renderApiSettings() {
|
||||||
|
var container = document.getElementById('mainContent');
|
||||||
|
if (!container) return;
|
||||||
|
|
||||||
|
// Hide stats grid and search
|
||||||
|
var statsGrid = document.getElementById('statsGrid');
|
||||||
|
var searchInput = document.getElementById('searchInput');
|
||||||
|
if (statsGrid) statsGrid.style.display = 'none';
|
||||||
|
if (searchInput) searchInput.parentElement.style.display = 'none';
|
||||||
|
|
||||||
|
// Load data
|
||||||
|
await loadApiSettings();
|
||||||
|
var cacheStats = await loadCacheStats();
|
||||||
|
|
||||||
|
if (!apiSettingsData) {
|
||||||
|
container.innerHTML = '<div class="api-settings-container">' +
|
||||||
|
'<div class="error-message">' + t('apiSettings.failedToLoad') + '</div>' +
|
||||||
|
'</div>';
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
container.innerHTML = '<div class="api-settings-container">' +
|
||||||
|
'<div class="api-settings-section">' +
|
||||||
|
'<div class="section-header">' +
|
||||||
|
'<h3>' + t('apiSettings.providers') + '</h3>' +
|
||||||
|
'<button class="btn btn-primary" onclick="showAddProviderModal()">' +
|
||||||
|
'<i data-lucide="plus"></i> ' + t('apiSettings.addProvider') +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div id="providers-list" class="api-settings-list"></div>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="api-settings-section">' +
|
||||||
|
'<div class="section-header">' +
|
||||||
|
'<h3>' + t('apiSettings.customEndpoints') + '</h3>' +
|
||||||
|
'<button class="btn btn-primary" onclick="showAddEndpointModal()">' +
|
||||||
|
'<i data-lucide="plus"></i> ' + t('apiSettings.addEndpoint') +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div id="endpoints-list" class="api-settings-list"></div>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="api-settings-section">' +
|
||||||
|
'<div class="section-header">' +
|
||||||
|
'<h3>' + t('apiSettings.cacheSettings') + '</h3>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div id="cache-settings-panel" class="cache-settings-panel"></div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>';
|
||||||
|
|
||||||
|
renderProvidersList();
|
||||||
|
renderEndpointsList();
|
||||||
|
renderCacheSettings(cacheStats);
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Render providers list
|
||||||
|
*/
|
||||||
|
function renderProvidersList() {
|
||||||
|
var container = document.getElementById('providers-list');
|
||||||
|
if (!container) return;
|
||||||
|
|
||||||
|
var providers = apiSettingsData.providers || [];
|
||||||
|
|
||||||
|
if (providers.length === 0) {
|
||||||
|
container.innerHTML = '<div class="empty-state">' +
|
||||||
|
'<i data-lucide="cloud-off" class="empty-icon"></i>' +
|
||||||
|
'<p>' + t('apiSettings.noProviders') + '</p>' +
|
||||||
|
'</div>';
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
container.innerHTML = providers.map(function(provider) {
|
||||||
|
var statusClass = provider.enabled === false ? 'disabled' : 'enabled';
|
||||||
|
var statusText = provider.enabled === false ? t('apiSettings.disabled') : t('apiSettings.enabled');
|
||||||
|
|
||||||
|
return '<div class="api-settings-card provider-card ' + statusClass + '">' +
|
||||||
|
'<div class="card-header">' +
|
||||||
|
'<div class="card-info">' +
|
||||||
|
'<h4>' + provider.name + '</h4>' +
|
||||||
|
'<span class="provider-type-badge">' + provider.type + '</span>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="card-actions">' +
|
||||||
|
'<button class="btn-icon" onclick="showEditProviderModal(\'' + provider.id + '\')" title="' + t('common.edit') + '">' +
|
||||||
|
'<i data-lucide="edit"></i>' +
|
||||||
|
'</button>' +
|
||||||
|
'<button class="btn-icon btn-danger" onclick="deleteProvider(\'' + provider.id + '\')" title="' + t('common.delete') + '">' +
|
||||||
|
'<i data-lucide="trash-2"></i>' +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="card-body">' +
|
||||||
|
'<div class="card-meta">' +
|
||||||
|
'<span><i data-lucide="key"></i> ' + maskApiKey(provider.apiKey) + '</span>' +
|
||||||
|
(provider.apiBase ? '<span><i data-lucide="globe"></i> ' + provider.apiBase + '</span>' : '') +
|
||||||
|
'<span class="status-badge status-' + statusClass + '">' + statusText + '</span>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>';
|
||||||
|
}).join('');
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Render endpoints list
|
||||||
|
*/
|
||||||
|
function renderEndpointsList() {
|
||||||
|
var container = document.getElementById('endpoints-list');
|
||||||
|
if (!container) return;
|
||||||
|
|
||||||
|
var endpoints = apiSettingsData.endpoints || [];
|
||||||
|
|
||||||
|
if (endpoints.length === 0) {
|
||||||
|
container.innerHTML = '<div class="empty-state">' +
|
||||||
|
'<i data-lucide="layers-off" class="empty-icon"></i>' +
|
||||||
|
'<p>' + t('apiSettings.noEndpoints') + '</p>' +
|
||||||
|
'</div>';
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
container.innerHTML = endpoints.map(function(endpoint) {
|
||||||
|
var provider = apiSettingsData.providers.find(function(p) { return p.id === endpoint.providerId; });
|
||||||
|
var providerName = provider ? provider.name : endpoint.providerId;
|
||||||
|
|
||||||
|
var cacheStatus = endpoint.cacheStrategy?.enabled
|
||||||
|
? t('apiSettings.cacheEnabled') + ' (' + endpoint.cacheStrategy.ttlMinutes + ' min)'
|
||||||
|
: t('apiSettings.cacheDisabled');
|
||||||
|
|
||||||
|
return '<div class="api-settings-card endpoint-card">' +
|
||||||
|
'<div class="card-header">' +
|
||||||
|
'<div class="card-info">' +
|
||||||
|
'<h4>' + endpoint.name + '</h4>' +
|
||||||
|
'<code class="endpoint-id">' + endpoint.id + '</code>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="card-actions">' +
|
||||||
|
'<button class="btn-icon" onclick="showEditEndpointModal(\'' + endpoint.id + '\')" title="' + t('common.edit') + '">' +
|
||||||
|
'<i data-lucide="edit"></i>' +
|
||||||
|
'</button>' +
|
||||||
|
'<button class="btn-icon btn-danger" onclick="deleteEndpoint(\'' + endpoint.id + '\')" title="' + t('common.delete') + '">' +
|
||||||
|
'<i data-lucide="trash-2"></i>' +
|
||||||
|
'</button>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="card-body">' +
|
||||||
|
'<div class="card-meta">' +
|
||||||
|
'<span><i data-lucide="server"></i> ' + providerName + '</span>' +
|
||||||
|
'<span><i data-lucide="cpu"></i> ' + endpoint.model + '</span>' +
|
||||||
|
'<span><i data-lucide="database"></i> ' + cacheStatus + '</span>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="usage-hint">' +
|
||||||
|
'<i data-lucide="terminal"></i> ' +
|
||||||
|
'<code>ccw cli -p "..." --model ' + endpoint.id + '</code>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>';
|
||||||
|
}).join('');
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Render cache settings panel
|
||||||
|
*/
|
||||||
|
function renderCacheSettings(stats) {
|
||||||
|
var container = document.getElementById('cache-settings-panel');
|
||||||
|
if (!container) return;
|
||||||
|
|
||||||
|
var globalSettings = apiSettingsData.globalCache || { enabled: false };
|
||||||
|
var usedMB = (stats.totalSize / 1024 / 1024).toFixed(2);
|
||||||
|
var maxMB = (stats.maxSize / 1024 / 1024).toFixed(0);
|
||||||
|
var usagePercent = stats.maxSize > 0 ? ((stats.totalSize / stats.maxSize) * 100).toFixed(1) : 0;
|
||||||
|
|
||||||
|
container.innerHTML = '<div class="cache-settings-content">' +
|
||||||
|
'<label class="checkbox-label">' +
|
||||||
|
'<input type="checkbox" id="global-cache-enabled" ' + (globalSettings.enabled ? 'checked' : '') + ' onchange="toggleGlobalCache()" /> ' +
|
||||||
|
t('apiSettings.enableGlobalCaching') +
|
||||||
|
'</label>' +
|
||||||
|
'<div class="cache-stats">' +
|
||||||
|
'<div class="stat-item">' +
|
||||||
|
'<span class="stat-label">' + t('apiSettings.cacheUsed') + '</span>' +
|
||||||
|
'<span class="stat-value">' + usedMB + ' MB / ' + maxMB + ' MB (' + usagePercent + '%)</span>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="stat-item">' +
|
||||||
|
'<span class="stat-label">' + t('apiSettings.cacheEntries') + '</span>' +
|
||||||
|
'<span class="stat-value">' + stats.entries + '</span>' +
|
||||||
|
'</div>' +
|
||||||
|
'<div class="progress-bar">' +
|
||||||
|
'<div class="progress-fill" style="width: ' + usagePercent + '%"></div>' +
|
||||||
|
'</div>' +
|
||||||
|
'</div>' +
|
||||||
|
'<button class="btn btn-secondary" onclick="clearCache()">' +
|
||||||
|
'<i data-lucide="trash-2"></i> ' + t('apiSettings.clearCache') +
|
||||||
|
'</button>' +
|
||||||
|
'</div>';
|
||||||
|
|
||||||
|
if (window.lucide) lucide.createIcons();
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== Utility Functions ==========
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Mask API key for display
|
||||||
|
*/
|
||||||
|
function maskApiKey(apiKey) {
|
||||||
|
if (!apiKey) return '';
|
||||||
|
if (apiKey.startsWith('${')) return apiKey; // Environment variable
|
||||||
|
if (apiKey.length <= 8) return '***';
|
||||||
|
return apiKey.substring(0, 4) + '...' + apiKey.substring(apiKey.length - 4);
|
||||||
|
}
|
||||||
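The view above drives the REST routes added in this commit (`/api/litellm-api/config`, `/providers`, `/endpoints`, `/cache/*`). For reference, a minimal sketch of creating a provider outside the UI with the same payload `saveProvider()` builds; the field names are taken from that function, while the server-side handling of `id`/timestamps is an assumption not shown in this diff:

```ts
// Sketch only: same POST body the dashboard form submits.
// Whether the server generates id/createdAt/updatedAt itself is assumed.
async function createProviderExample(): Promise<unknown> {
  const response = await fetch('/api/litellm-api/providers', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      type: 'openai',               // one of the <select> options in the modal
      name: 'My OpenAI',
      apiKey: '${OPENAI_API_KEY}',  // env-var reference form, as in the use-env-var path
      apiBase: 'https://api.openai.com/v1',
      enabled: true,
    }),
  });
  if (!response.ok) throw new Error('Failed to save provider');
  return response.json();
}
```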
@@ -336,6 +336,10 @@
           <span class="nav-text flex-1" data-i18n="nav.codexLensManager">CodexLens</span>
           <span class="badge px-2 py-0.5 text-xs font-semibold rounded-full bg-hover text-muted-foreground" id="badgeCodexLens">-</span>
         </li>
+        <li class="nav-item flex items-center gap-2 px-3 py-2.5 text-sm text-muted-foreground hover:bg-hover hover:text-foreground rounded cursor-pointer transition-colors" data-view="api-settings" data-tooltip="API Settings">
+          <i data-lucide="settings" class="nav-icon"></i>
+          <span class="nav-text flex-1" data-i18n="nav.apiSettings">API Settings</span>
+        </li>
         <!-- Hidden: Code Graph Explorer (feature disabled)
         <li class="nav-item flex items-center gap-2 px-3 py-2.5 text-sm text-muted-foreground hover:bg-hover hover:text-foreground rounded cursor-pointer transition-colors" data-view="graph-explorer" data-tooltip="Code Graph Explorer">
           <i data-lucide="git-branch" class="nav-icon"></i>

@@ -10,6 +10,10 @@ import { spawn, ChildProcess } from 'child_process';
 import { existsSync, mkdirSync, readFileSync, writeFileSync, unlinkSync, readdirSync, statSync } from 'fs';
 import { join, relative } from 'path';
+
+// LiteLLM integration
+import { executeLiteLLMEndpoint } from './litellm-executor.js';
+import { findEndpointById } from '../config/litellm-api-config-manager.js';
 // Native resume support
 import {
   trackNewSession,
@@ -592,6 +596,66 @@ async function executeCliTool(
   const workingDir = cd || process.cwd();
   ensureHistoryDir(workingDir); // Ensure history directory exists
+
+  // NEW: Check if model is a custom LiteLLM endpoint ID
+  if (model && !['gemini', 'qwen', 'codex'].includes(tool)) {
+    const endpoint = findEndpointById(workingDir, model);
+    if (endpoint) {
+      // Route to LiteLLM executor
+      if (onOutput) {
+        onOutput({ type: 'stderr', data: `[Routing to LiteLLM endpoint: ${model}]\n` });
+      }
+
+      // Capture the start time before the call so duration_ms reflects the actual run
+      const startTime = Date.now();
+
+      const result = await executeLiteLLMEndpoint({
+        prompt,
+        endpointId: model,
+        baseDir: workingDir,
+        cwd: cd,
+        includeDirs: includeDirs ? includeDirs.split(',').map(d => d.trim()) : undefined,
+        enableCache: true,
+        onOutput: onOutput || undefined,
+      });
+
+      // Convert LiteLLM result to ExecutionOutput format
+      const duration = Date.now() - startTime;
+
+      const execution: ExecutionRecord = {
+        id: customId || `${Date.now()}-litellm`,
+        timestamp: new Date(startTime).toISOString(),
+        tool: 'litellm',
+        model: result.model,
+        mode,
+        prompt,
+        status: result.success ? 'success' : 'error',
+        exit_code: result.success ? 0 : 1,
+        duration_ms: duration,
+        output: {
+          stdout: result.output,
+          stderr: result.error || '',
+          truncated: false,
+        },
+      };
+
+      const conversation = convertToConversation(execution);
+
+      // Try to save to history
+      try {
+        saveConversation(workingDir, conversation);
+      } catch (err) {
+        console.error('[CLI Executor] Failed to save LiteLLM history:', (err as Error).message);
+      }
+
+      return {
+        success: result.success,
+        execution,
+        conversation,
+        stdout: result.output,
+        stderr: result.error || '',
+      };
+    }
+  }
+
   // Get SQLite store for native session lookup
   const store = await getSqliteStore(workingDir);

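From the user's side, this routing means `--model` now accepts either a native tool model or a custom endpoint ID, i.e. `ccw cli -p "..." --model <endpoint-id>`, matching the `apiSettings.endpointIdHint` string above. The decision itself reduces to a small predicate; `isLiteLLMModel` below is a hypothetical helper that merely restates the condition from the hunk (it is not part of the commit, and the exact return type of `findEndpointById` is not shown here, so a truthy check is assumed):

```ts
import { findEndpointById } from '../config/litellm-api-config-manager.js';

// Hypothetical restatement of the routing guard in executeCliTool.
function isLiteLLMModel(workingDir: string, model: string | undefined, tool: string): boolean {
  if (!model) return false;
  if (['gemini', 'qwen', 'codex'].includes(tool)) return false; // native tools keep their own models
  return Boolean(findEndpointById(workingDir, model));
}
```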
246 ccw/src/tools/litellm-client.ts Normal file
@@ -0,0 +1,246 @@
/**
 * LiteLLM Client - Bridge between CCW and ccw-litellm Python package
 * Provides LLM chat and embedding capabilities via spawned Python process
 *
 * Features:
 * - Chat completions with multiple models
 * - Text embeddings generation
 * - Configuration management
 * - JSON protocol communication
 */

import { spawn } from 'child_process';

export interface LiteLLMConfig {
  pythonPath?: string;  // Default 'python'
  configPath?: string;  // Configuration file path
  timeout?: number;     // Default 60000ms
}

export interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface ChatResponse {
  content: string;
  model: string;
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

export interface EmbedResponse {
  vectors: number[][];
  dimensions: number;
  model: string;
}

export interface LiteLLMStatus {
  available: boolean;
  version?: string;
  error?: string;
}

export class LiteLLMClient {
  private pythonPath: string;
  private configPath?: string;
  private timeout: number;

  constructor(config: LiteLLMConfig = {}) {
    this.pythonPath = config.pythonPath || 'python';
    this.configPath = config.configPath;
    this.timeout = config.timeout || 60000;
  }

  /**
   * Execute Python ccw-litellm command
   */
  private async executePython(args: string[], options: { timeout?: number } = {}): Promise<string> {
    const timeout = options.timeout || this.timeout;

    return new Promise((resolve, reject) => {
      const proc = spawn(this.pythonPath, ['-m', 'ccw_litellm.cli', ...args], {
        stdio: ['pipe', 'pipe', 'pipe'],
        env: { ...process.env }
      });

      let stdout = '';
      let stderr = '';
      let timedOut = false;

      // Set up timeout
      const timeoutId = setTimeout(() => {
        timedOut = true;
        proc.kill('SIGTERM');
        reject(new Error(`Command timed out after ${timeout}ms`));
      }, timeout);

      proc.stdout.on('data', (data) => {
        stdout += data.toString();
      });

      proc.stderr.on('data', (data) => {
        stderr += data.toString();
      });

      proc.on('error', (error) => {
        clearTimeout(timeoutId);
        reject(new Error(`Failed to spawn Python process: ${error.message}`));
      });

      proc.on('close', (code) => {
        clearTimeout(timeoutId);

        if (timedOut) {
          return; // Already rejected
        }

        if (code === 0) {
          resolve(stdout.trim());
        } else {
          const errorMsg = stderr.trim() || `Process exited with code ${code}`;
          reject(new Error(errorMsg));
        }
      });
    });
  }

  /**
   * Check if ccw-litellm is available
   */
  async isAvailable(): Promise<boolean> {
    try {
      await this.executePython(['version'], { timeout: 5000 });
      return true;
    } catch {
      return false;
    }
  }

  /**
   * Get status information
   */
  async getStatus(): Promise<LiteLLMStatus> {
    try {
      const output = await this.executePython(['version'], { timeout: 5000 });
      return {
        available: true,
        version: output.trim()
      };
    } catch (error: any) {
      return {
        available: false,
        error: error.message
      };
    }
  }

  /**
   * Get current configuration
   */
  async getConfig(): Promise<any> {
    const output = await this.executePython(['config', '--json']);
    return JSON.parse(output);
  }

  /**
   * Generate embeddings for texts
   */
  async embed(texts: string[], model: string = 'default'): Promise<EmbedResponse> {
    if (!texts || texts.length === 0) {
      throw new Error('texts array cannot be empty');
    }

    const args = ['embed', '--model', model, '--output', 'json'];

    // Add texts as arguments
    for (const text of texts) {
      args.push(text);
    }

    const output = await this.executePython(args, { timeout: this.timeout * 2 });
    const vectors = JSON.parse(output);

    return {
      vectors,
      dimensions: vectors[0]?.length || 0,
      model
    };
  }

  /**
   * Chat with LLM
   */
  async chat(message: string, model: string = 'default'): Promise<string> {
    if (!message) {
      throw new Error('message cannot be empty');
    }

    const args = ['chat', '--model', model, message];
    return this.executePython(args, { timeout: this.timeout * 2 });
  }

  /**
   * Multi-turn chat with messages array
   */
  async chatMessages(messages: ChatMessage[], model: string = 'default'): Promise<ChatResponse> {
    if (!messages || messages.length === 0) {
      throw new Error('messages array cannot be empty');
    }

    // For now, just use the last user message
    // TODO: Implement full message history support in ccw-litellm
    const lastMessage = messages[messages.length - 1];
    const content = await this.chat(lastMessage.content, model);

    return {
      content,
      model,
      usage: undefined // TODO: Add usage tracking
    };
  }
}

// Singleton instance
let _client: LiteLLMClient | null = null;

/**
 * Get or create singleton LiteLLM client
 */
export function getLiteLLMClient(config?: LiteLLMConfig): LiteLLMClient {
  if (!_client) {
    _client = new LiteLLMClient(config);
  }
  return _client;
}

/**
 * Check if LiteLLM is available
 */
export async function checkLiteLLMAvailable(): Promise<boolean> {
  try {
    const client = getLiteLLMClient();
    return await client.isAvailable();
  } catch {
    return false;
  }
}

/**
 * Get LiteLLM status
 */
export async function getLiteLLMStatus(): Promise<LiteLLMStatus> {
  try {
    const client = getLiteLLMClient();
    return await client.getStatus();
  } catch (error: any) {
    return {
      available: false,
      error: error.message
    };
  }
}
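Putting the client together, a minimal usage sketch (wrapped in an async function since top-level await may not be available in the consuming module; the config values shown are the constructor defaults):

```ts
import { getLiteLLMClient } from './litellm-client.js';

async function demo(): Promise<void> {
  // Singleton; spawns `python -m ccw_litellm.cli ...` under the hood.
  const client = getLiteLLMClient({ pythonPath: 'python', timeout: 60000 });

  if (!(await client.isAvailable())) {
    console.error('ccw-litellm is not installed in the active Python environment');
    return;
  }

  const reply = await client.chat('Hello from CCW', 'default');
  console.log(reply);

  const { vectors, dimensions } = await client.embed(['Text 1', 'Text 2'], 'default');
  console.log(vectors.length, dimensions); // 2 vectors of, e.g., 1536 dimensions
}

demo();
```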
241 ccw/src/tools/litellm-executor.ts Normal file
@@ -0,0 +1,241 @@
/**
 * LiteLLM Executor - Execute LiteLLM endpoints with context caching
 * Integrates with context-cache for file packing and LiteLLM client for API calls
 */

import { getLiteLLMClient } from './litellm-client.js';
import { handler as contextCacheHandler } from './context-cache.js';
import {
  findEndpointById,
  getProviderWithResolvedEnvVars,
} from '../config/litellm-api-config-manager.js';
import type { CustomEndpoint, ProviderCredential } from '../types/litellm-api-config.js';

export interface LiteLLMExecutionOptions {
  prompt: string;
  endpointId: string;       // Custom endpoint ID (e.g., "my-gpt4o")
  baseDir: string;          // Project base directory
  cwd?: string;             // Working directory for file resolution
  includeDirs?: string[];   // Additional directories for @patterns
  enableCache?: boolean;    // Override endpoint cache setting
  onOutput?: (data: { type: string; data: string }) => void;
}

export interface LiteLLMExecutionResult {
  success: boolean;
  output: string;
  model: string;
  provider: string;
  cacheUsed: boolean;
  cachedFiles?: string[];
  error?: string;
}

/**
 * Extract @patterns from prompt text
 */
export function extractPatterns(prompt: string): string[] {
  // Match @path patterns: @src/**/*.ts, @CLAUDE.md, @../shared/**/*
  const regex = /@([^\s]+)/g;
  const patterns: string[] = [];
  let match;
  while ((match = regex.exec(prompt)) !== null) {
    patterns.push('@' + match[1]);
  }
  return patterns;
}

/**
 * Execute LiteLLM endpoint with optional context caching
 */
export async function executeLiteLLMEndpoint(
  options: LiteLLMExecutionOptions
): Promise<LiteLLMExecutionResult> {
  const { prompt, endpointId, baseDir, cwd, includeDirs, enableCache, onOutput } = options;

  // 1. Find endpoint configuration
  const endpoint = findEndpointById(baseDir, endpointId);
  if (!endpoint) {
    return {
      success: false,
      output: '',
      model: '',
      provider: '',
      cacheUsed: false,
      error: `Endpoint not found: ${endpointId}`,
    };
  }

  // 2. Get provider with resolved env vars
  const provider = getProviderWithResolvedEnvVars(baseDir, endpoint.providerId);
  if (!provider) {
    return {
      success: false,
      output: '',
      model: '',
      provider: '',
      cacheUsed: false,
      error: `Provider not found: ${endpoint.providerId}`,
    };
  }

  // Verify API key is available
  if (!provider.resolvedApiKey) {
    return {
      success: false,
      output: '',
      model: endpoint.model,
      provider: provider.type,
      cacheUsed: false,
      error: `API key not configured for provider: ${provider.name}`,
    };
  }

  // 3. Process context cache if enabled
  let finalPrompt = prompt;
  let cacheUsed = false;
  let cachedFiles: string[] = [];

  const shouldCache = enableCache ?? endpoint.cacheStrategy.enabled;
  if (shouldCache) {
    const patterns = extractPatterns(prompt);
    if (patterns.length > 0) {
      if (onOutput) {
        onOutput({ type: 'stderr', data: `[Context cache: Found ${patterns.length} @patterns]\n` });
      }

      // Pack files into cache
      const packResult = await contextCacheHandler({
        operation: 'pack',
        patterns,
        cwd: cwd || process.cwd(),
        include_dirs: includeDirs,
        ttl: endpoint.cacheStrategy.ttlMinutes * 60 * 1000,
        max_file_size: endpoint.cacheStrategy.maxSizeKB * 1024,
      });

      if (packResult.success && packResult.result) {
        const pack = packResult.result as any;

        if (onOutput) {
          onOutput({
            type: 'stderr',
            data: `[Context cache: Packed ${pack.files_packed} files, ${pack.total_bytes} bytes]\n`,
          });
        }

        // Read cached content
        const readResult = await contextCacheHandler({
          operation: 'read',
          session_id: pack.session_id,
          limit: endpoint.cacheStrategy.maxSizeKB * 1024,
        });

        if (readResult.success && readResult.result) {
          const read = readResult.result as any;
          // Prepend cached content to prompt
          finalPrompt = `${read.content}\n\n---\n\n${prompt}`;
          cacheUsed = true;
          cachedFiles = pack.files_packed ? Array(pack.files_packed).fill('...') : [];

          if (onOutput) {
            onOutput({ type: 'stderr', data: `[Context cache: Applied to prompt]\n` });
          }
        }
      } else if (packResult.error) {
        if (onOutput) {
          onOutput({ type: 'stderr', data: `[Context cache warning: ${packResult.error}]\n` });
        }
      }
    }
  }

  // 4. Call LiteLLM
  try {
    if (onOutput) {
      onOutput({
        type: 'stderr',
        data: `[LiteLLM: Calling ${provider.type}/${endpoint.model}]\n`,
      });
    }

    const client = getLiteLLMClient({
      pythonPath: 'python',
      timeout: 120000, // 2 minutes
    });

    // Configure provider credentials via environment
    // LiteLLM uses standard env vars like OPENAI_API_KEY, ANTHROPIC_API_KEY
    const envVarName = getProviderEnvVarName(provider.type);
    if (envVarName) {
      process.env[envVarName] = provider.resolvedApiKey;
    }

    // Set base URL if custom
    if (provider.apiBase) {
      const baseUrlEnvVar = getProviderBaseUrlEnvVarName(provider.type);
      if (baseUrlEnvVar) {
        process.env[baseUrlEnvVar] = provider.apiBase;
      }
    }

    // Use litellm-client to call chat
    const response = await client.chat(finalPrompt, endpoint.model);

    if (onOutput) {
      onOutput({ type: 'stdout', data: response });
    }

    return {
      success: true,
      output: response,
      model: endpoint.model,
      provider: provider.type,
      cacheUsed,
      cachedFiles,
    };
  } catch (error) {
    const errorMsg = (error as Error).message;
    if (onOutput) {
      onOutput({ type: 'stderr', data: `[LiteLLM error: ${errorMsg}]\n` });
    }

    return {
      success: false,
      output: '',
      model: endpoint.model,
      provider: provider.type,
      cacheUsed,
      error: errorMsg,
    };
  }
}

/**
 * Get environment variable name for provider API key
 */
function getProviderEnvVarName(providerType: string): string | null {
  const envVarMap: Record<string, string> = {
    openai: 'OPENAI_API_KEY',
    anthropic: 'ANTHROPIC_API_KEY',
    google: 'GOOGLE_API_KEY',
    azure: 'AZURE_API_KEY',
    mistral: 'MISTRAL_API_KEY',
    deepseek: 'DEEPSEEK_API_KEY',
  };

  return envVarMap[providerType] || null;
}

/**
 * Get environment variable name for provider base URL
 */
function getProviderBaseUrlEnvVarName(providerType: string): string | null {
  const envVarMap: Record<string, string> = {
    openai: 'OPENAI_API_BASE',
    anthropic: 'ANTHROPIC_API_BASE',
    azure: 'AZURE_API_BASE',
  };

  return envVarMap[providerType] || null;
}
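A corresponding usage sketch for the executor; the endpoint ID `my-gpt4o` is the hypothetical example used by the dashboard placeholders above and must already exist in the LiteLLM API config, with its provider's API key resolvable:

```ts
import { executeLiteLLMEndpoint } from './litellm-executor.js';

async function runExample(): Promise<void> {
  const result = await executeLiteLLMEndpoint({
    prompt: 'Review @src/**/*.ts and summarize the public API', // @pattern triggers context caching
    endpointId: 'my-gpt4o',  // hypothetical; must match a CustomEndpoint.id
    baseDir: process.cwd(),
    enableCache: true,       // overrides the endpoint's cacheStrategy.enabled
    onOutput: (msg) => process.stderr.write(msg.data), // simplification: stream everything to stderr
  });

  if (result.success) {
    console.log(result.output);
    console.log('cache used:', result.cacheUsed);
  } else {
    console.error(result.error);
  }
}

runExample();
```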
136
ccw/src/types/litellm-api-config.ts
Normal file
136
ccw/src/types/litellm-api-config.ts
Normal file
@@ -0,0 +1,136 @@
```typescript
/**
 * LiteLLM API Configuration Type Definitions
 *
 * Defines types for provider credentials, cache strategies, custom endpoints,
 * and the overall configuration structure for LiteLLM API integration.
 */

/**
 * Supported LLM provider types
 */
export type ProviderType =
  | 'openai'
  | 'anthropic'
  | 'ollama'
  | 'azure'
  | 'google'
  | 'mistral'
  | 'deepseek'
  | 'custom';

/**
 * Provider credential configuration
 * Stores API keys, base URLs, and provider metadata
 */
export interface ProviderCredential {
  /** Unique identifier for this provider configuration */
  id: string;

  /** Display name for UI */
  name: string;

  /** Provider type */
  type: ProviderType;

  /** API key or environment variable reference (e.g., ${OPENAI_API_KEY}) */
  apiKey: string;

  /** Custom API base URL (optional, overrides provider default) */
  apiBase?: string;

  /** Whether this provider is enabled */
  enabled: boolean;

  /** Creation timestamp (ISO 8601) */
  createdAt: string;

  /** Last update timestamp (ISO 8601) */
  updatedAt: string;
}

/**
 * Cache strategy for prompt context optimization
 * Enables file-based caching to reduce token usage
 */
export interface CacheStrategy {
  /** Whether caching is enabled for this endpoint */
  enabled: boolean;

  /** Time-to-live in minutes (default: 60) */
  ttlMinutes: number;

  /** Maximum cache size in KB (default: 512) */
  maxSizeKB: number;

  /** File patterns to cache (glob patterns like "*.md", "*.ts") */
  filePatterns: string[];
}

/**
 * Custom endpoint configuration
 * Maps CLI identifiers to specific models and caching strategies
 */
export interface CustomEndpoint {
  /** Unique CLI identifier (used in --model flag, e.g., "my-gpt4o") */
  id: string;

  /** Display name for UI */
  name: string;

  /** Reference to provider credential ID */
  providerId: string;

  /** Model identifier (e.g., "gpt-4o", "claude-3-5-sonnet-20241022") */
  model: string;

  /** Optional description */
  description?: string;

  /** Cache strategy for this endpoint */
  cacheStrategy: CacheStrategy;

  /** Whether this endpoint is enabled */
  enabled: boolean;

  /** Creation timestamp (ISO 8601) */
  createdAt: string;

  /** Last update timestamp (ISO 8601) */
  updatedAt: string;
}

/**
 * Global cache settings
 * Applies to all endpoints unless overridden
 */
export interface GlobalCacheSettings {
  /** Whether caching is globally enabled */
  enabled: boolean;

  /** Cache directory path (default: ~/.ccw/cache/context) */
  cacheDir: string;

  /** Maximum total cache size in MB (default: 100) */
  maxTotalSizeMB: number;
}

/**
 * Complete LiteLLM API configuration
 * Root configuration object stored in JSON file
 */
export interface LiteLLMApiConfig {
  /** Configuration schema version */
  version: number;

  /** List of configured providers */
  providers: ProviderCredential[];

  /** List of custom endpoints */
  endpoints: CustomEndpoint[];

  /** Default endpoint ID (optional) */
  defaultEndpoint?: string;

  /** Global cache settings */
  globalCacheSettings: GlobalCacheSettings;
}
```
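For reference, here is a hypothetical on-disk instance that satisfies `LiteLLMApiConfig` (the ids, names, and timestamps are made up; only the shape follows the interfaces above):

```python
import json

config = {
    "version": 1,
    "providers": [
        {
            "id": "prov-openai",           # hypothetical id
            "name": "OpenAI",
            "type": "openai",
            "apiKey": "${OPENAI_API_KEY}",  # env-var reference form
            "enabled": True,
            "createdAt": "2025-01-01T00:00:00Z",
            "updatedAt": "2025-01-01T00:00:00Z",
        }
    ],
    "endpoints": [
        {
            "id": "my-gpt4o",               # CLI identifier for --model
            "name": "My GPT-4o",
            "providerId": "prov-openai",
            "model": "gpt-4o",
            "cacheStrategy": {
                "enabled": True,
                "ttlMinutes": 60,
                "maxSizeKB": 512,
                "filePatterns": ["*.md", "*.ts"],
            },
            "enabled": True,
            "createdAt": "2025-01-01T00:00:00Z",
            "updatedAt": "2025-01-01T00:00:00Z",
        }
    ],
    "defaultEndpoint": "my-gpt4o",
    "globalCacheSettings": {
        "enabled": True,
        "cacheDir": "~/.ccw/cache/context",
        "maxTotalSizeMB": 100,
    },
}

# Serialize to the JSON form the config file stores.
print(json.dumps(config, indent=2))
```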
ccw/tests/litellm-client.test.ts · 96 lines · Normal file
```typescript
/**
 * LiteLLM Client Tests
 * Tests for the LiteLLM TypeScript bridge
 */

import { describe, it, expect, beforeEach } from '@jest/globals';
import { LiteLLMClient, getLiteLLMClient, checkLiteLLMAvailable, getLiteLLMStatus } from '../src/tools/litellm-client';

describe('LiteLLMClient', () => {
  let client: LiteLLMClient;

  beforeEach(() => {
    client = new LiteLLMClient({ timeout: 5000 });
  });

  describe('Constructor', () => {
    it('should create client with default config', () => {
      const defaultClient = new LiteLLMClient();
      expect(defaultClient).toBeDefined();
    });

    it('should create client with custom config', () => {
      const customClient = new LiteLLMClient({
        pythonPath: 'python3',
        timeout: 10000
      });
      expect(customClient).toBeDefined();
    });
  });

  describe('isAvailable', () => {
    it('should check if ccw-litellm is available', async () => {
      const available = await client.isAvailable();
      expect(typeof available).toBe('boolean');
    });
  });

  describe('getStatus', () => {
    it('should return status object', async () => {
      const status = await client.getStatus();
      expect(status).toHaveProperty('available');
      expect(typeof status.available).toBe('boolean');
    });
  });

  describe('embed', () => {
    it('should throw error for empty texts array', async () => {
      await expect(client.embed([])).rejects.toThrow('texts array cannot be empty');
    });

    it('should throw error for null texts', async () => {
      await expect(client.embed(null as any)).rejects.toThrow();
    });
  });

  describe('chat', () => {
    it('should throw error for empty message', async () => {
      await expect(client.chat('')).rejects.toThrow('message cannot be empty');
    });
  });

  describe('chatMessages', () => {
    it('should throw error for empty messages array', async () => {
      await expect(client.chatMessages([])).rejects.toThrow('messages array cannot be empty');
    });

    it('should throw error for null messages', async () => {
      await expect(client.chatMessages(null as any)).rejects.toThrow();
    });
  });
});

describe('Singleton Functions', () => {
  describe('getLiteLLMClient', () => {
    it('should return singleton instance', () => {
      const client1 = getLiteLLMClient();
      const client2 = getLiteLLMClient();
      expect(client1).toBe(client2);
    });
  });

  describe('checkLiteLLMAvailable', () => {
    it('should return boolean', async () => {
      const available = await checkLiteLLMAvailable();
      expect(typeof available).toBe('boolean');
    });
  });

  describe('getLiteLLMStatus', () => {
    it('should return status object', async () => {
      const status = await getLiteLLMStatus();
      expect(status).toHaveProperty('available');
      expect(typeof status.available).toBe('boolean');
    });
  });
});
```
```diff
@@ -106,7 +106,8 @@ def init(
     workers: Optional[int] = typer.Option(None, "--workers", "-w", min=1, max=16, help="Parallel worker processes (default: auto-detect based on CPU count, max 16)."),
     force: bool = typer.Option(False, "--force", "-f", help="Force full reindex (skip incremental mode)."),
     no_embeddings: bool = typer.Option(False, "--no-embeddings", help="Skip automatic embedding generation (if semantic deps installed)."),
-    embedding_model: str = typer.Option("code", "--embedding-model", help="Embedding model profile: fast, code, multilingual, balanced."),
+    embedding_backend: str = typer.Option("fastembed", "--embedding-backend", help="Embedding backend: fastembed (local) or litellm (remote API)."),
+    embedding_model: str = typer.Option("code", "--embedding-model", help="Embedding model: profile name for fastembed (fast/code/multilingual/balanced) or model name for litellm (e.g. text-embedding-3-small)."),
     json_mode: bool = typer.Option(False, "--json", help="Output JSON response."),
     verbose: bool = typer.Option(False, "--verbose", "-v", help="Enable debug logging."),
 ) -> None:
@@ -120,6 +121,14 @@ def init(
 
     If semantic search dependencies are installed, automatically generates embeddings
     after indexing completes. Use --no-embeddings to skip this step.
+
+    Embedding Backend Options:
+    - fastembed: Local ONNX-based embeddings (default, no API calls)
+    - litellm: Remote API embeddings via ccw-litellm (requires API keys)
+
+    Embedding Model Options:
+    - For fastembed backend: Use profile names (fast, code, multilingual, balanced)
+    - For litellm backend: Use model names (e.g., text-embedding-3-small, text-embedding-ada-002)
     """
     _configure_logging(verbose, json_mode)
     config = Config()
@@ -171,11 +180,22 @@ def init(
     from codexlens.cli.embedding_manager import generate_embeddings_recursive, get_embeddings_status
 
     if SEMANTIC_AVAILABLE:
+        # Validate embedding backend
+        valid_backends = ["fastembed", "litellm"]
+        if embedding_backend not in valid_backends:
+            error_msg = f"Invalid embedding backend: {embedding_backend}. Must be one of: {', '.join(valid_backends)}"
+            if json_mode:
+                print_json(success=False, error=error_msg)
+            else:
+                console.print(f"[red]Error:[/red] {error_msg}")
+            raise typer.Exit(code=1)
+
         # Use the index root directory (not the _index.db file)
         index_root = Path(build_result.index_root)
 
         if not json_mode:
             console.print("\n[bold]Generating embeddings...[/bold]")
+            console.print(f"Backend: [cyan]{embedding_backend}[/cyan]")
             console.print(f"Model: [cyan]{embedding_model}[/cyan]")
         else:
             # Output progress message for JSON mode (parsed by Node.js)
@@ -196,6 +216,7 @@ def init(
 
         embed_result = generate_embeddings_recursive(
             index_root,
+            embedding_backend=embedding_backend,
             model_profile=embedding_model,
             force=False,  # Don't force regenerate during init
             chunk_size=2000,
@@ -1781,11 +1802,17 @@ def embeddings_generate(
         exists=True,
         help="Path to _index.db file or project directory.",
     ),
+    backend: str = typer.Option(
+        "fastembed",
+        "--backend",
+        "-b",
+        help="Embedding backend: fastembed (local) or litellm (remote API).",
+    ),
     model: str = typer.Option(
         "code",
         "--model",
         "-m",
-        help="Model profile: fast, code, multilingual, balanced.",
+        help="Model: profile name for fastembed (fast/code/multilingual/balanced) or model name for litellm (e.g. text-embedding-3-small).",
    ),
     force: bool = typer.Option(
         False,
@@ -1813,21 +1840,43 @@ def embeddings_generate(
     semantic search capabilities. Embeddings are stored in the same
     database as the FTS index.
 
-    Model Profiles:
-    - fast: BAAI/bge-small-en-v1.5 (384 dims, ~80MB)
-    - code: jinaai/jina-embeddings-v2-base-code (768 dims, ~150MB) [recommended]
-    - multilingual: intfloat/multilingual-e5-large (1024 dims, ~1GB)
-    - balanced: mixedbread-ai/mxbai-embed-large-v1 (1024 dims, ~600MB)
+    Embedding Backend Options:
+    - fastembed: Local ONNX-based embeddings (default, no API calls)
+    - litellm: Remote API embeddings via ccw-litellm (requires API keys)
+
+    Model Options:
+    For fastembed backend (profiles):
+    - fast: BAAI/bge-small-en-v1.5 (384 dims, ~80MB)
+    - code: jinaai/jina-embeddings-v2-base-code (768 dims, ~150MB) [recommended]
+    - multilingual: intfloat/multilingual-e5-large (1024 dims, ~1GB)
+    - balanced: mixedbread-ai/mxbai-embed-large-v1 (1024 dims, ~600MB)
+
+    For litellm backend (model names):
+    - text-embedding-3-small, text-embedding-3-large (OpenAI)
+    - text-embedding-ada-002 (OpenAI legacy)
+    - Any model supported by ccw-litellm
 
     Examples:
-        codexlens embeddings-generate ~/projects/my-app  # Auto-find index for project
+        codexlens embeddings-generate ~/projects/my-app  # Auto-find index (fastembed, code profile)
         codexlens embeddings-generate ~/.codexlens/indexes/project/_index.db  # Specific index
-        codexlens embeddings-generate ~/projects/my-app --model fast --force  # Regenerate with fast model
+        codexlens embeddings-generate ~/projects/my-app --backend litellm --model text-embedding-3-small  # Use LiteLLM
+        codexlens embeddings-generate ~/projects/my-app --model fast --force  # Regenerate with fast profile
     """
     _configure_logging(verbose, json_mode)
 
     from codexlens.cli.embedding_manager import generate_embeddings, generate_embeddings_recursive
 
+    # Validate backend
+    valid_backends = ["fastembed", "litellm"]
+    if backend not in valid_backends:
+        error_msg = f"Invalid backend: {backend}. Must be one of: {', '.join(valid_backends)}"
+        if json_mode:
+            print_json(success=False, error=error_msg)
+        else:
+            console.print(f"[red]Error:[/red] {error_msg}")
+            console.print(f"[dim]Valid backends: {', '.join(valid_backends)}[/dim]")
+        raise typer.Exit(code=1)
+
     # Resolve path
     target_path = path.expanduser().resolve()
 
@@ -1877,11 +1926,13 @@ def embeddings_generate(
         console.print(f"Mode: [yellow]Recursive[/yellow]")
     else:
         console.print(f"Index: [dim]{index_path}[/dim]")
+    console.print(f"Backend: [cyan]{backend}[/cyan]")
     console.print(f"Model: [cyan]{model}[/cyan]\n")
 
     if use_recursive:
         result = generate_embeddings_recursive(
             index_root,
+            embedding_backend=backend,
             model_profile=model,
             force=force,
             chunk_size=chunk_size,
@@ -1890,6 +1941,7 @@ def embeddings_generate(
     else:
         result = generate_embeddings(
             index_path,
+            embedding_backend=backend,
             model_profile=model,
             force=force,
             chunk_size=chunk_size,
```
```diff
@@ -191,6 +191,7 @@ def check_index_embeddings(index_path: Path) -> Dict[str, any]:
 
 def generate_embeddings(
     index_path: Path,
+    embedding_backend: str = "fastembed",
     model_profile: str = "code",
     force: bool = False,
     chunk_size: int = 2000,
@@ -203,7 +204,9 @@ def generate_embeddings(
 
     Args:
         index_path: Path to _index.db file
-        model_profile: Model profile (fast, code, multilingual, balanced)
+        embedding_backend: Embedding backend to use (fastembed or litellm)
+        model_profile: Model profile for fastembed (fast, code, multilingual, balanced)
+            or model name for litellm (e.g., text-embedding-3-small)
         force: If True, regenerate even if embeddings exist
         chunk_size: Maximum chunk size in characters
         progress_callback: Optional callback for progress updates
@@ -253,8 +256,22 @@ def generate_embeddings(
 
     # Initialize components
     try:
-        # Initialize embedder (singleton, reused throughout the function)
-        embedder = get_embedder(profile=model_profile)
+        # Import factory function to support both backends
+        from codexlens.semantic.factory import get_embedder as get_embedder_factory
+
+        # Initialize embedder using factory (supports both fastembed and litellm)
+        # For fastembed: model_profile is a profile name (fast/code/multilingual/balanced)
+        # For litellm: model_profile is a model name (e.g., text-embedding-3-small)
+        if embedding_backend == "fastembed":
+            embedder = get_embedder_factory(backend="fastembed", profile=model_profile, use_gpu=True)
+        elif embedding_backend == "litellm":
+            embedder = get_embedder_factory(backend="litellm", model=model_profile)
+        else:
+            return {
+                "success": False,
+                "error": f"Invalid embedding backend: {embedding_backend}. Must be 'fastembed' or 'litellm'.",
+            }
+
         # skip_token_count=True: Use fast estimation (len/4) instead of expensive tiktoken
         # This significantly reduces CPU usage with minimal impact on metadata accuracy
         chunker = Chunker(config=ChunkConfig(max_chunk_size=chunk_size, skip_token_count=True))
@@ -428,6 +445,7 @@ def find_all_indexes(scan_dir: Path) -> List[Path]:
 
 def generate_embeddings_recursive(
     index_root: Path,
+    embedding_backend: str = "fastembed",
     model_profile: str = "code",
     force: bool = False,
     chunk_size: int = 2000,
@@ -437,7 +455,9 @@ def generate_embeddings_recursive(
 
     Args:
         index_root: Root index directory containing _index.db files
-        model_profile: Model profile (fast, code, multilingual, balanced)
+        embedding_backend: Embedding backend to use (fastembed or litellm)
+        model_profile: Model profile for fastembed (fast, code, multilingual, balanced)
+            or model name for litellm (e.g., text-embedding-3-small)
         force: If True, regenerate even if embeddings exist
         chunk_size: Maximum chunk size in characters
         progress_callback: Optional callback for progress updates
@@ -474,6 +494,7 @@ def generate_embeddings_recursive(
 
     result = generate_embeddings(
         index_path,
+        embedding_backend=embedding_backend,
        model_profile=model_profile,
         force=force,
         chunk_size=chunk_size,
```
|
|||||||
return False, "GPU support module not available"
|
return False, "GPU support module not available"
|
||||||
|
|
||||||
|
|
||||||
|
# Export embedder components
|
||||||
|
# BaseEmbedder is always available (abstract base class)
|
||||||
|
from .base import BaseEmbedder
|
||||||
|
|
||||||
|
# Factory function for creating embedders
|
||||||
|
from .factory import get_embedder as get_embedder_factory
|
||||||
|
|
||||||
|
# Optional: LiteLLMEmbedderWrapper (only if ccw-litellm is installed)
|
||||||
|
try:
|
||||||
|
from .litellm_embedder import LiteLLMEmbedderWrapper
|
||||||
|
_LITELLM_AVAILABLE = True
|
||||||
|
except ImportError:
|
||||||
|
LiteLLMEmbedderWrapper = None
|
||||||
|
_LITELLM_AVAILABLE = False
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
"SEMANTIC_AVAILABLE",
|
"SEMANTIC_AVAILABLE",
|
||||||
"SEMANTIC_BACKEND",
|
"SEMANTIC_BACKEND",
|
||||||
"GPU_AVAILABLE",
|
"GPU_AVAILABLE",
|
||||||
"check_semantic_available",
|
"check_semantic_available",
|
||||||
"check_gpu_available",
|
"check_gpu_available",
|
||||||
|
"BaseEmbedder",
|
||||||
|
"get_embedder_factory",
|
||||||
|
"LiteLLMEmbedderWrapper",
|
||||||
]
|
]
|
||||||
|
|||||||
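Because `LiteLLMEmbedderWrapper` degrades to `None` when ccw-litellm is missing, callers can feature-detect before picking a backend; a short sketch (the model and profile choices are illustrative):

```python
from codexlens.semantic import LiteLLMEmbedderWrapper, get_embedder_factory

if LiteLLMEmbedderWrapper is not None:
    # Remote embeddings are available; model name shown is illustrative.
    embedder = get_embedder_factory(backend="litellm", model="text-embedding-3-small")
else:
    # Fall back to the local ONNX backend.
    embedder = get_embedder_factory(backend="fastembed", profile="code")
```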
codex-lens/src/codexlens/semantic/base.py · 51 lines · Normal file
```python
"""Base class for embedders.

Defines the interface that all embedders must implement.
"""

from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Iterable

import numpy as np


class BaseEmbedder(ABC):
    """Base class for all embedders.

    All embedder implementations must inherit from this class and implement
    the abstract methods to ensure a consistent interface.
    """

    @property
    @abstractmethod
    def embedding_dim(self) -> int:
        """Return embedding dimensions.

        Returns:
            int: Dimension of the embedding vectors.
        """
        ...

    @property
    @abstractmethod
    def model_name(self) -> str:
        """Return model name.

        Returns:
            str: Name or identifier of the underlying model.
        """
        ...

    @abstractmethod
    def embed_to_numpy(self, texts: str | Iterable[str]) -> np.ndarray:
        """Embed texts to numpy array.

        Args:
            texts: Single text or iterable of texts to embed.

        Returns:
            numpy.ndarray: Array of shape (n_texts, embedding_dim) containing embeddings.
        """
        ...
```
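Any backend only has to fill in these three members. As a shape check, here is a toy (non-production) subclass that hashes tokens into buckets; it exists purely to illustrate the contract:

```python
from __future__ import annotations

from typing import Iterable

import numpy as np

from codexlens.semantic.base import BaseEmbedder


class HashEmbedder(BaseEmbedder):
    """Toy embedder: token-hashing bag-of-words, for interface illustration only."""

    @property
    def embedding_dim(self) -> int:
        return 64

    @property
    def model_name(self) -> str:
        return "toy-hash-64"

    def embed_to_numpy(self, texts: str | Iterable[str]) -> np.ndarray:
        items = [texts] if isinstance(texts, str) else list(texts)
        out = np.zeros((len(items), self.embedding_dim), dtype=np.float32)
        for i, text in enumerate(items):
            for token in text.split():
                out[i, hash(token) % self.embedding_dim] += 1.0
        return out
```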
```diff
@@ -14,6 +14,7 @@ from typing import Dict, Iterable, List, Optional
 import numpy as np
 
 from . import SEMANTIC_AVAILABLE
+from .base import BaseEmbedder
 from .gpu_support import get_optimal_providers, is_gpu_available, get_gpu_summary, get_selected_device_id
 
 logger = logging.getLogger(__name__)
@@ -84,7 +85,7 @@ def clear_embedder_cache() -> None:
     gc.collect()
 
 
-class Embedder:
+class Embedder(BaseEmbedder):
     """Generate embeddings for code chunks using fastembed (ONNX-based).
 
     Supported Model Profiles:
@@ -138,11 +139,11 @@ class Embedder:
 
         # Resolve model name from profile or use explicit name
         if model_name:
-            self.model_name = model_name
+            self._model_name = model_name
         elif profile and profile in self.MODELS:
-            self.model_name = self.MODELS[profile]
+            self._model_name = self.MODELS[profile]
         else:
-            self.model_name = self.DEFAULT_MODEL
+            self._model_name = self.DEFAULT_MODEL
 
         # Configure ONNX execution providers with device_id options for GPU selection
         # Using with_device_options=True ensures DirectML/CUDA device_id is passed correctly
@@ -154,10 +155,15 @@ class Embedder:
         self._use_gpu = use_gpu
         self._model = None
 
+    @property
+    def model_name(self) -> str:
+        """Get model name."""
+        return self._model_name
+
     @property
     def embedding_dim(self) -> int:
         """Get embedding dimension for current model."""
-        return self.MODEL_DIMS.get(self.model_name, 768)  # Default to 768 if unknown
+        return self.MODEL_DIMS.get(self._model_name, 768)  # Default to 768 if unknown
 
     @property
     def providers(self) -> List[str]:
```
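The `model_name` attribute-to-property rename in this hunk is forced by the base class: `BaseEmbedder.model_name` is an abstract read-only property, and once the subclass overrides it with a property, a plain `self.model_name = ...` assignment would raise `AttributeError`. A standalone illustration of the constraint (class names are invented):

```python
from abc import ABC, abstractmethod


class Spec(ABC):
    @property
    @abstractmethod
    def model_name(self) -> str: ...


class Impl(Spec):
    def __init__(self) -> None:
        self._model_name = "demo"  # backing attribute avoids clashing with the property

    @property
    def model_name(self) -> str:
        return self._model_name


impl = Impl()
print(impl.model_name)  # "demo"
# impl.model_name = "x" would raise AttributeError: the property has no setter.
```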
codex-lens/src/codexlens/semantic/factory.py · 61 lines · Normal file
```python
"""Factory for creating embedders.

Provides a unified interface for instantiating different embedder backends.
"""

from __future__ import annotations

from typing import Any

from .base import BaseEmbedder


def get_embedder(
    backend: str = "fastembed",
    profile: str = "code",
    model: str = "default",
    use_gpu: bool = True,
    **kwargs: Any,
) -> BaseEmbedder:
    """Factory function to create embedder based on backend.

    Args:
        backend: Embedder backend to use. Options:
            - "fastembed": Use fastembed (ONNX-based) embedder (default)
            - "litellm": Use ccw-litellm embedder
        profile: Model profile for fastembed backend ("fast", "code", "multilingual", "balanced").
            Used only when backend="fastembed". Default: "code"
        model: Model identifier for litellm backend.
            Used only when backend="litellm". Default: "default"
        use_gpu: Whether to use GPU acceleration when available (default: True).
            Used only when backend="fastembed".
        **kwargs: Additional backend-specific arguments

    Returns:
        BaseEmbedder: Configured embedder instance

    Raises:
        ValueError: If backend is not recognized
        ImportError: If required backend dependencies are not installed

    Examples:
        Create fastembed embedder with code profile:
        >>> embedder = get_embedder(backend="fastembed", profile="code")

        Create fastembed embedder with fast profile and CPU only:
        >>> embedder = get_embedder(backend="fastembed", profile="fast", use_gpu=False)

        Create litellm embedder:
        >>> embedder = get_embedder(backend="litellm", model="text-embedding-3-small")
    """
    if backend == "fastembed":
        from .embedder import Embedder

        return Embedder(profile=profile, use_gpu=use_gpu, **kwargs)
    elif backend == "litellm":
        from .litellm_embedder import LiteLLMEmbedderWrapper

        return LiteLLMEmbedderWrapper(model=model, **kwargs)
    else:
        raise ValueError(
            f"Unknown backend: {backend}. "
            f"Supported backends: 'fastembed', 'litellm'"
        )
```
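Since both branches return a `BaseEmbedder`, calling code stays backend-agnostic. A minimal sketch, assuming the fastembed model weights can be fetched on first use:

```python
from codexlens.semantic.factory import get_embedder

embedder = get_embedder(backend="fastembed", profile="fast", use_gpu=False)
vectors = embedder.embed_to_numpy(["def add(a, b):", "    return a + b"])
print(embedder.model_name, vectors.shape)  # e.g. BAAI/bge-small-en-v1.5 (2, 384)
```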
codex-lens/src/codexlens/semantic/litellm_embedder.py · 79 lines · Normal file
```python
"""LiteLLM embedder wrapper for CodexLens.

Provides integration with ccw-litellm's LiteLLMEmbedder for embedding generation.
"""

from __future__ import annotations

from typing import Iterable

import numpy as np

from .base import BaseEmbedder


class LiteLLMEmbedderWrapper(BaseEmbedder):
    """Wrapper for ccw-litellm LiteLLMEmbedder.

    This wrapper adapts the ccw-litellm LiteLLMEmbedder to the CodexLens
    BaseEmbedder interface, enabling seamless integration with CodexLens
    semantic search functionality.

    Args:
        model: Model identifier for LiteLLM (default: "default")
        **kwargs: Additional arguments passed to LiteLLMEmbedder

    Raises:
        ImportError: If ccw-litellm package is not installed
    """

    def __init__(self, model: str = "default", **kwargs) -> None:
        """Initialize LiteLLM embedder wrapper.

        Args:
            model: Model identifier for LiteLLM (default: "default")
            **kwargs: Additional arguments passed to LiteLLMEmbedder

        Raises:
            ImportError: If ccw-litellm package is not installed
        """
        try:
            from ccw_litellm import LiteLLMEmbedder

            self._embedder = LiteLLMEmbedder(model=model, **kwargs)
        except ImportError as e:
            raise ImportError(
                "ccw-litellm not installed. Install with: pip install ccw-litellm"
            ) from e

    @property
    def embedding_dim(self) -> int:
        """Return embedding dimensions from LiteLLMEmbedder.

        Returns:
            int: Dimension of the embedding vectors.
        """
        return self._embedder.dimensions

    @property
    def model_name(self) -> str:
        """Return model name from LiteLLMEmbedder.

        Returns:
            str: Name or identifier of the underlying model.
        """
        return self._embedder.model_name

    def embed_to_numpy(self, texts: str | Iterable[str]) -> np.ndarray:
        """Embed texts to numpy array using LiteLLMEmbedder.

        Args:
            texts: Single text or iterable of texts to embed.

        Returns:
            numpy.ndarray: Array of shape (n_texts, embedding_dim) containing embeddings.
        """
        if isinstance(texts, str):
            texts = [texts]
        else:
            texts = list(texts)
        return self._embedder.embed(texts)
```
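End to end, the wrapper behaves like any other `BaseEmbedder`. A minimal sketch, assuming ccw-litellm is installed, a provider key is configured, and the model name shown is available:

```python
from codexlens.semantic.litellm_embedder import LiteLLMEmbedderWrapper

embedder = LiteLLMEmbedderWrapper(model="text-embedding-3-small")
vec = embedder.embed_to_numpy("hello world")
print(vec.shape)  # (1, embedder.embedding_dim)
```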