Mirror of https://github.com/catlog22/Claude-Code-Workflow.git (synced 2026-02-05 01:50:27 +08:00)
feat: Add unified LiteLLM API management with dashboard UI and CLI integration
- Create ccw-litellm Python package with AbstractEmbedder and AbstractLLMClient interfaces
- Add BaseEmbedder abstraction and factory pattern to codex-lens for pluggable backends
- Implement API Settings dashboard page for provider credentials and custom endpoints
- Add REST API routes for CRUD operations on providers and endpoints
- Extend CLI with --model parameter for custom endpoint routing
- Integrate existing context-cache for @pattern file resolution
- Add provider model registry with predefined models per provider type
- Include i18n translations (en/zh) for all new UI elements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
180
ccw-litellm/README.md
Normal file
@@ -0,0 +1,180 @@
# ccw-litellm

Unified LiteLLM interface layer shared by ccw and codex-lens projects.

## Features

- **Unified LLM Interface**: Abstract interface for LLM operations (chat, completion)
- **Unified Embedding Interface**: Abstract interface for text embeddings
- **Multi-Provider Support**: OpenAI, Anthropic, Azure, and more via LiteLLM
- **Configuration Management**: YAML-based configuration with environment variable substitution
- **Type Safety**: Full type annotations with Pydantic models

## Installation

```bash
pip install -e .
```

## Quick Start

### Configuration

Create a configuration file at `~/.ccw/config/litellm-config.yaml`:

```yaml
version: 1
default_provider: openai

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1

llm_models:
  default:
    provider: openai
    model: gpt-4

embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
```

### Usage

#### LLM Client

```python
from ccw_litellm import LiteLLMClient, ChatMessage

# Initialize client with default model
client = LiteLLMClient(model="default")

# Chat completion
messages = [
    ChatMessage(role="user", content="Hello, how are you?")
]
response = client.chat(messages)
print(response.content)

# Text completion
response = client.complete("Once upon a time")
print(response.content)
```

#### Embedder

```python
from ccw_litellm import LiteLLMEmbedder

# Initialize embedder with default model
embedder = LiteLLMEmbedder(model="default")

# Embed single text
vector = embedder.embed("Hello world")
print(vector.shape)  # (1, 1536)

# Embed multiple texts
vectors = embedder.embed(["Text 1", "Text 2", "Text 3"])
print(vectors.shape)  # (3, 1536)
```

#### Custom Configuration

```python
from ccw_litellm import LiteLLMClient, load_config

# Load custom configuration
config = load_config("/path/to/custom-config.yaml")

# Use custom configuration
client = LiteLLMClient(model="fast", config=config)
```

## Configuration Reference

### Provider Configuration

```yaml
providers:
  <provider_name>:
    api_key: <api_key_or_${ENV_VAR}>
    api_base: <base_url>
```

Supported providers: `openai`, `anthropic`, `azure`, `vertex_ai`, `bedrock`, etc.

### LLM Model Configuration

```yaml
llm_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
```

### Embedding Model Configuration

```yaml
embedding_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
    dimensions: <embedding_dimensions>
```

## Environment Variables

The configuration supports environment variable substitution using the `${VAR}` or `${VAR:-default}` syntax:

```yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}  # Required
    api_base: ${OPENAI_API_BASE:-https://api.openai.com/v1}  # With default
```
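For illustration, the snippet below sketches how these placeholders resolve at load time. It assumes the example file above has been saved somewhere on disk; the path used here is hypothetical.

```python
# Sketch only: how ${VAR} and ${VAR:-default} resolve when the file is loaded.
import os
from ccw_litellm import load_config

os.environ["OPENAI_API_KEY"] = "sk-example"   # ${OPENAI_API_KEY} -> "sk-example"
os.environ.pop("OPENAI_API_BASE", None)        # unset, so the :-default applies

config = load_config("/path/to/litellm-config.yaml")  # hypothetical path
openai_cfg = config.get_provider("openai")
print(openai_cfg.api_key)   # "sk-example"
print(openai_cfg.api_base)  # "https://api.openai.com/v1"
```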
## API Reference

### Interfaces

- `AbstractLLMClient`: Abstract base class for LLM clients
- `AbstractEmbedder`: Abstract base class for embedders
- `ChatMessage`: Message data class (role, content)
- `LLMResponse`: Response data class (content, raw)

### Implementations

- `LiteLLMClient`: LiteLLM implementation of AbstractLLMClient
- `LiteLLMEmbedder`: LiteLLM implementation of AbstractEmbedder

### Configuration

- `LiteLLMConfig`: Root configuration model
- `ProviderConfig`: Provider configuration model
- `LLMModelConfig`: LLM model configuration model
- `EmbeddingModelConfig`: Embedding model configuration model
- `load_config(path)`: Load configuration from YAML file
- `get_config(path, reload)`: Get global configuration singleton
- `reset_config()`: Reset global configuration (for testing)
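A minimal sketch of how these configuration helpers fit together, assuming the default config location described above (the custom path is hypothetical):

```python
from ccw_litellm import get_config, load_config, reset_config

config = get_config()             # loads ~/.ccw/config/litellm-config.yaml once and caches it
print(config.default_provider)

config = get_config(reload=True)  # force a re-read from disk
custom = load_config("/path/to/custom-config.yaml")  # bypasses the cached singleton
reset_config()                    # clear the singleton (useful in tests)
```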
## Development

### Running Tests

```bash
pytest tests/ -v
```

### Type Checking

```bash
mypy src/ccw_litellm
```

## License

MIT
||||
53
ccw-litellm/litellm-config.yaml.example
Normal file
@@ -0,0 +1,53 @@
# LiteLLM Unified Configuration
# Copy to ~/.ccw/config/litellm-config.yaml

version: 1

# Default provider for LLM calls
default_provider: openai

# Provider configurations
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1

  anthropic:
    api_key: ${ANTHROPIC_API_KEY}

  ollama:
    api_base: http://localhost:11434

  azure:
    api_key: ${AZURE_API_KEY}
    api_base: ${AZURE_API_BASE}

# LLM model configurations
llm_models:
  default:
    provider: openai
    model: gpt-4o
  fast:
    provider: openai
    model: gpt-4o-mini
  claude:
    provider: anthropic
    model: claude-sonnet-4-20250514
  local:
    provider: ollama
    model: llama3.2

# Embedding model configurations
embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
  large:
    provider: openai
    model: text-embedding-3-large
    dimensions: 3072
  ada:
    provider: openai
    model: text-embedding-ada-002
    dimensions: 1536
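The named entries in this example map directly onto the client and embedder constructors from the README. A hedged sketch, assuming the example has been copied to `~/.ccw/config/litellm-config.yaml`:

```python
# Sketch: selecting the named models defined in the example config above.
from ccw_litellm import ChatMessage, LiteLLMClient, LiteLLMEmbedder

fast = LiteLLMClient(model="fast")      # openai / gpt-4o-mini
claude = LiteLLMClient(model="claude")  # anthropic / claude-sonnet-4-20250514
local = LiteLLMClient(model="local")    # ollama / llama3.2

print(fast.chat([ChatMessage(role="user", content="ping")]).content)

large = LiteLLMEmbedder(model="large")  # text-embedding-3-large
print(large.dimensions)                 # 3072, as declared in the config
```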
35
ccw-litellm/pyproject.toml
Normal file
@@ -0,0 +1,35 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "ccw-litellm"
version = "0.1.0"
description = "Unified LiteLLM interface layer shared by ccw and codex-lens"
requires-python = ">=3.10"
authors = [{ name = "ccw-litellm contributors" }]
dependencies = [
    "litellm>=1.0.0",
    "pyyaml",
    "numpy",
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0",
]

[project.scripts]
ccw-litellm = "ccw_litellm.cli:main"

[tool.setuptools]
package-dir = { "" = "src" }

[tool.setuptools.packages.find]
where = ["src"]
include = ["ccw_litellm*"]

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q"
12
ccw-litellm/src/ccw_litellm.egg-info/PKG-INFO
Normal file
@@ -0,0 +1,12 @@
Metadata-Version: 2.4
Name: ccw-litellm
Version: 0.1.0
Summary: Unified LiteLLM interface layer shared by ccw and codex-lens
Author: ccw-litellm contributors
Requires-Python: >=3.10
Requires-Dist: litellm>=1.0.0
Requires-Dist: pyyaml
Requires-Dist: numpy
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
17
ccw-litellm/src/ccw_litellm.egg-info/SOURCES.txt
Normal file
@@ -0,0 +1,17 @@
pyproject.toml
src/ccw_litellm/__init__.py
src/ccw_litellm.egg-info/PKG-INFO
src/ccw_litellm.egg-info/SOURCES.txt
src/ccw_litellm.egg-info/dependency_links.txt
src/ccw_litellm.egg-info/requires.txt
src/ccw_litellm.egg-info/top_level.txt
src/ccw_litellm/clients/__init__.py
src/ccw_litellm/clients/litellm_embedder.py
src/ccw_litellm/clients/litellm_llm.py
src/ccw_litellm/config/__init__.py
src/ccw_litellm/config/loader.py
src/ccw_litellm/config/models.py
src/ccw_litellm/interfaces/__init__.py
src/ccw_litellm/interfaces/embedder.py
src/ccw_litellm/interfaces/llm.py
tests/test_interfaces.py
1
ccw-litellm/src/ccw_litellm.egg-info/dependency_links.txt
Normal file
@@ -0,0 +1 @@

7
ccw-litellm/src/ccw_litellm.egg-info/requires.txt
Normal file
@@ -0,0 +1,7 @@
litellm>=1.0.0
pyyaml
numpy
pydantic>=2.0

[dev]
pytest>=7.0
1
ccw-litellm/src/ccw_litellm.egg-info/top_level.txt
Normal file
@@ -0,0 +1 @@
ccw_litellm
47
ccw-litellm/src/ccw_litellm/__init__.py
Normal file
@@ -0,0 +1,47 @@
"""ccw-litellm package.

This package provides a small, stable interface layer around LiteLLM to share
between the ccw and codex-lens projects.
"""

from __future__ import annotations

from .clients import LiteLLMClient, LiteLLMEmbedder
from .config import (
    EmbeddingModelConfig,
    LiteLLMConfig,
    LLMModelConfig,
    ProviderConfig,
    get_config,
    load_config,
    reset_config,
)
from .interfaces import (
    AbstractEmbedder,
    AbstractLLMClient,
    ChatMessage,
    LLMResponse,
)

__version__ = "0.1.0"

__all__ = [
    "__version__",
    # Abstract interfaces
    "AbstractEmbedder",
    "AbstractLLMClient",
    "ChatMessage",
    "LLMResponse",
    # Client implementations
    "LiteLLMClient",
    "LiteLLMEmbedder",
    # Configuration
    "LiteLLMConfig",
    "ProviderConfig",
    "LLMModelConfig",
    "EmbeddingModelConfig",
    "load_config",
    "get_config",
    "reset_config",
]
108
ccw-litellm/src/ccw_litellm/cli.py
Normal file
@@ -0,0 +1,108 @@
|
||||
"""CLI entry point for ccw-litellm."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
def main() -> int:
|
||||
"""Main CLI entry point."""
|
||||
parser = argparse.ArgumentParser(
|
||||
prog="ccw-litellm",
|
||||
description="Unified LiteLLM interface for ccw and codex-lens",
|
||||
)
|
||||
subparsers = parser.add_subparsers(dest="command", help="Available commands")
|
||||
|
||||
# config command
|
||||
config_parser = subparsers.add_parser("config", help="Show configuration")
|
||||
config_parser.add_argument(
|
||||
"--path",
|
||||
type=Path,
|
||||
help="Configuration file path",
|
||||
)
|
||||
|
||||
# embed command
|
||||
embed_parser = subparsers.add_parser("embed", help="Generate embeddings")
|
||||
embed_parser.add_argument("texts", nargs="+", help="Texts to embed")
|
||||
embed_parser.add_argument(
|
||||
"--model",
|
||||
default="default",
|
||||
help="Embedding model name (default: default)",
|
||||
)
|
||||
embed_parser.add_argument(
|
||||
"--output",
|
||||
choices=["json", "shape"],
|
||||
default="shape",
|
||||
help="Output format (default: shape)",
|
||||
)
|
||||
|
||||
# chat command
|
||||
chat_parser = subparsers.add_parser("chat", help="Chat with LLM")
|
||||
chat_parser.add_argument("message", help="Message to send")
|
||||
chat_parser.add_argument(
|
||||
"--model",
|
||||
default="default",
|
||||
help="LLM model name (default: default)",
|
||||
)
|
||||
|
||||
# version command
|
||||
subparsers.add_parser("version", help="Show version")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.command == "version":
|
||||
from . import __version__
|
||||
|
||||
print(f"ccw-litellm {__version__}")
|
||||
return 0
|
||||
|
||||
if args.command == "config":
|
||||
from .config import get_config
|
||||
|
||||
try:
|
||||
config = get_config(config_path=args.path if hasattr(args, "path") else None)
|
||||
print(config.model_dump_json(indent=2))
|
||||
except Exception as e:
|
||||
print(f"Error loading config: {e}", file=sys.stderr)
|
||||
return 1
|
||||
return 0
|
||||
|
||||
if args.command == "embed":
|
||||
from .clients import LiteLLMEmbedder
|
||||
|
||||
try:
|
||||
embedder = LiteLLMEmbedder(model=args.model)
|
||||
vectors = embedder.embed(args.texts)
|
||||
|
||||
if args.output == "json":
|
||||
print(json.dumps(vectors.tolist()))
|
||||
else:
|
||||
print(f"Shape: {vectors.shape}")
|
||||
print(f"Dimensions: {embedder.dimensions}")
|
||||
except Exception as e:
|
||||
print(f"Error: {e}", file=sys.stderr)
|
||||
return 1
|
||||
return 0
|
||||
|
||||
if args.command == "chat":
|
||||
from .clients import LiteLLMClient
|
||||
from .interfaces import ChatMessage
|
||||
|
||||
try:
|
||||
client = LiteLLMClient(model=args.model)
|
||||
response = client.chat([ChatMessage(role="user", content=args.message)])
|
||||
print(response.content)
|
||||
except Exception as e:
|
||||
print(f"Error: {e}", file=sys.stderr)
|
||||
return 1
|
||||
return 0
|
||||
|
||||
parser.print_help()
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
12
ccw-litellm/src/ccw_litellm/clients/__init__.py
Normal file
@@ -0,0 +1,12 @@
|
||||
"""Client implementations for ccw-litellm."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from .litellm_embedder import LiteLLMEmbedder
|
||||
from .litellm_llm import LiteLLMClient
|
||||
|
||||
__all__ = [
|
||||
"LiteLLMClient",
|
||||
"LiteLLMEmbedder",
|
||||
]
|
||||
|
||||
170
ccw-litellm/src/ccw_litellm/clients/litellm_embedder.py
Normal file
@@ -0,0 +1,170 @@
|
||||
"""LiteLLM embedder implementation for text embeddings."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import Any, Sequence
|
||||
|
||||
import litellm
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from ..config import LiteLLMConfig, get_config
|
||||
from ..interfaces.embedder import AbstractEmbedder
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class LiteLLMEmbedder(AbstractEmbedder):
|
||||
"""LiteLLM embedder implementation.
|
||||
|
||||
Supports multiple embedding providers (OpenAI, etc.) through LiteLLM's unified interface.
|
||||
|
||||
Example:
|
||||
embedder = LiteLLMEmbedder(model="default")
|
||||
vectors = embedder.embed(["Hello world", "Another text"])
|
||||
print(vectors.shape) # (2, 1536)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
model: str = "default",
|
||||
config: LiteLLMConfig | None = None,
|
||||
**litellm_kwargs: Any,
|
||||
) -> None:
|
||||
"""Initialize LiteLLM embedder.
|
||||
|
||||
Args:
|
||||
model: Model name from configuration (default: "default")
|
||||
config: Configuration instance (default: use global config)
|
||||
**litellm_kwargs: Additional arguments to pass to litellm.embedding()
|
||||
"""
|
||||
self._config = config or get_config()
|
||||
self._model_name = model
|
||||
self._litellm_kwargs = litellm_kwargs
|
||||
|
||||
# Get embedding model configuration
|
||||
try:
|
||||
self._model_config = self._config.get_embedding_model(model)
|
||||
except ValueError as e:
|
||||
logger.error(f"Failed to get embedding model configuration: {e}")
|
||||
raise
|
||||
|
||||
# Get provider configuration
|
||||
try:
|
||||
self._provider_config = self._config.get_provider(self._model_config.provider)
|
||||
except ValueError as e:
|
||||
logger.error(f"Failed to get provider configuration: {e}")
|
||||
raise
|
||||
|
||||
# Set up LiteLLM environment
|
||||
self._setup_litellm()
|
||||
|
||||
def _setup_litellm(self) -> None:
|
||||
"""Configure LiteLLM with provider settings."""
|
||||
provider = self._model_config.provider
|
||||
|
||||
# Set API key
|
||||
if self._provider_config.api_key:
|
||||
litellm.api_key = self._provider_config.api_key
|
||||
# Also set environment-specific keys
|
||||
if provider == "openai":
|
||||
litellm.openai_key = self._provider_config.api_key
|
||||
elif provider == "anthropic":
|
||||
litellm.anthropic_key = self._provider_config.api_key
|
||||
|
||||
# Set API base
|
||||
if self._provider_config.api_base:
|
||||
litellm.api_base = self._provider_config.api_base
|
||||
|
||||
def _format_model_name(self) -> str:
|
||||
"""Format model name for LiteLLM.
|
||||
|
||||
Returns:
|
||||
Formatted model name (e.g., "text-embedding-3-small")
|
||||
"""
|
||||
provider = self._model_config.provider
|
||||
model = self._model_config.model
|
||||
|
||||
# For some providers, LiteLLM expects explicit prefix
|
||||
if provider in ["azure", "vertex_ai", "bedrock"]:
|
||||
return f"{provider}/{model}"
|
||||
|
||||
return model
|
||||
|
||||
@property
|
||||
def dimensions(self) -> int:
|
||||
"""Embedding vector size."""
|
||||
return self._model_config.dimensions
|
||||
|
||||
def embed(
|
||||
self,
|
||||
texts: str | Sequence[str],
|
||||
*,
|
||||
batch_size: int | None = None,
|
||||
**kwargs: Any,
|
||||
) -> NDArray[np.floating]:
|
||||
"""Embed one or more texts.
|
||||
|
||||
Args:
|
||||
texts: Single text or sequence of texts
|
||||
batch_size: Batch size for processing (currently unused, LiteLLM handles batching)
|
||||
**kwargs: Additional arguments for litellm.embedding()
|
||||
|
||||
Returns:
|
||||
A numpy array of shape (n_texts, dimensions).
|
||||
|
||||
Raises:
|
||||
Exception: If LiteLLM embedding fails
|
||||
"""
|
||||
# Normalize input to list
|
||||
if isinstance(texts, str):
|
||||
text_list = [texts]
|
||||
single_input = True
|
||||
else:
|
||||
text_list = list(texts)
|
||||
single_input = False
|
||||
|
||||
if not text_list:
|
||||
# Return empty array with correct shape
|
||||
return np.empty((0, self.dimensions), dtype=np.float32)
|
||||
|
||||
# Merge kwargs
|
||||
embedding_kwargs = {**self._litellm_kwargs, **kwargs}
|
||||
|
||||
try:
|
||||
# Call LiteLLM embedding
|
||||
response = litellm.embedding(
|
||||
model=self._format_model_name(),
|
||||
input=text_list,
|
||||
**embedding_kwargs,
|
||||
)
|
||||
|
||||
# Extract embeddings
|
||||
embeddings = [item["embedding"] for item in response.data]
|
||||
|
||||
# Convert to numpy array
|
||||
result = np.array(embeddings, dtype=np.float32)
|
||||
|
||||
# Validate dimensions
|
||||
if result.shape[1] != self.dimensions:
|
||||
logger.warning(
|
||||
f"Expected {self.dimensions} dimensions, got {result.shape[1]}. "
|
||||
f"Configuration may be incorrect."
|
||||
)
|
||||
|
||||
return result
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"LiteLLM embedding failed: {e}")
|
||||
raise
|
||||
|
||||
@property
|
||||
def model_name(self) -> str:
|
||||
"""Get configured model name."""
|
||||
return self._model_name
|
||||
|
||||
@property
|
||||
def provider(self) -> str:
|
||||
"""Get configured provider name."""
|
||||
return self._model_config.provider
|
||||
165
ccw-litellm/src/ccw_litellm/clients/litellm_llm.py
Normal file
@@ -0,0 +1,165 @@
|
||||
"""LiteLLM client implementation for LLM operations."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import Any, Sequence
|
||||
|
||||
import litellm
|
||||
|
||||
from ..config import LiteLLMConfig, get_config
|
||||
from ..interfaces.llm import AbstractLLMClient, ChatMessage, LLMResponse
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class LiteLLMClient(AbstractLLMClient):
|
||||
"""LiteLLM client implementation.
|
||||
|
||||
Supports multiple providers (OpenAI, Anthropic, etc.) through LiteLLM's unified interface.
|
||||
|
||||
Example:
|
||||
client = LiteLLMClient(model="default")
|
||||
response = client.chat([
|
||||
ChatMessage(role="user", content="Hello!")
|
||||
])
|
||||
print(response.content)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
model: str = "default",
|
||||
config: LiteLLMConfig | None = None,
|
||||
**litellm_kwargs: Any,
|
||||
) -> None:
|
||||
"""Initialize LiteLLM client.
|
||||
|
||||
Args:
|
||||
model: Model name from configuration (default: "default")
|
||||
config: Configuration instance (default: use global config)
|
||||
**litellm_kwargs: Additional arguments to pass to litellm.completion()
|
||||
"""
|
||||
self._config = config or get_config()
|
||||
self._model_name = model
|
||||
self._litellm_kwargs = litellm_kwargs
|
||||
|
||||
# Get model configuration
|
||||
try:
|
||||
self._model_config = self._config.get_llm_model(model)
|
||||
except ValueError as e:
|
||||
logger.error(f"Failed to get model configuration: {e}")
|
||||
raise
|
||||
|
||||
# Get provider configuration
|
||||
try:
|
||||
self._provider_config = self._config.get_provider(self._model_config.provider)
|
||||
except ValueError as e:
|
||||
logger.error(f"Failed to get provider configuration: {e}")
|
||||
raise
|
||||
|
||||
# Set up LiteLLM environment
|
||||
self._setup_litellm()
|
||||
|
||||
def _setup_litellm(self) -> None:
|
||||
"""Configure LiteLLM with provider settings."""
|
||||
provider = self._model_config.provider
|
||||
|
||||
# Set API key
|
||||
if self._provider_config.api_key:
|
||||
env_var = f"{provider.upper()}_API_KEY"
|
||||
litellm.api_key = self._provider_config.api_key
|
||||
# Also set environment-specific keys
|
||||
if provider == "openai":
|
||||
litellm.openai_key = self._provider_config.api_key
|
||||
elif provider == "anthropic":
|
||||
litellm.anthropic_key = self._provider_config.api_key
|
||||
|
||||
# Set API base
|
||||
if self._provider_config.api_base:
|
||||
litellm.api_base = self._provider_config.api_base
|
||||
|
||||
def _format_model_name(self) -> str:
|
||||
"""Format model name for LiteLLM.
|
||||
|
||||
Returns:
|
||||
Formatted model name (e.g., "gpt-4", "claude-3-opus-20240229")
|
||||
"""
|
||||
# LiteLLM expects model names in format: "provider/model" or just "model"
|
||||
# If provider is explicit, use provider/model format
|
||||
provider = self._model_config.provider
|
||||
model = self._model_config.model
|
||||
|
||||
# For some providers, LiteLLM expects explicit prefix
|
||||
if provider in ["anthropic", "azure", "vertex_ai", "bedrock"]:
|
||||
return f"{provider}/{model}"
|
||||
|
||||
return model
|
||||
|
||||
def chat(
|
||||
self,
|
||||
messages: Sequence[ChatMessage],
|
||||
**kwargs: Any,
|
||||
) -> LLMResponse:
|
||||
"""Chat completion for a sequence of messages.
|
||||
|
||||
Args:
|
||||
messages: Sequence of chat messages
|
||||
**kwargs: Additional arguments for litellm.completion()
|
||||
|
||||
Returns:
|
||||
LLM response with content and raw response
|
||||
|
||||
Raises:
|
||||
Exception: If LiteLLM completion fails
|
||||
"""
|
||||
# Convert messages to LiteLLM format
|
||||
litellm_messages = [
|
||||
{"role": msg.role, "content": msg.content} for msg in messages
|
||||
]
|
||||
|
||||
# Merge kwargs
|
||||
completion_kwargs = {**self._litellm_kwargs, **kwargs}
|
||||
|
||||
try:
|
||||
# Call LiteLLM
|
||||
response = litellm.completion(
|
||||
model=self._format_model_name(),
|
||||
messages=litellm_messages,
|
||||
**completion_kwargs,
|
||||
)
|
||||
|
||||
# Extract content
|
||||
content = response.choices[0].message.content or ""
|
||||
|
||||
return LLMResponse(content=content, raw=response)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"LiteLLM completion failed: {e}")
|
||||
raise
|
||||
|
||||
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
|
||||
"""Text completion for a prompt.
|
||||
|
||||
Args:
|
||||
prompt: Input prompt
|
||||
**kwargs: Additional arguments for litellm.completion()
|
||||
|
||||
Returns:
|
||||
LLM response with content and raw response
|
||||
|
||||
Raises:
|
||||
Exception: If LiteLLM completion fails
|
||||
"""
|
||||
# Convert to chat format (most modern models use chat interface)
|
||||
messages = [ChatMessage(role="user", content=prompt)]
|
||||
return self.chat(messages, **kwargs)
|
||||
|
||||
@property
|
||||
def model_name(self) -> str:
|
||||
"""Get configured model name."""
|
||||
return self._model_name
|
||||
|
||||
@property
|
||||
def provider(self) -> str:
|
||||
"""Get configured provider name."""
|
||||
return self._model_config.provider
|
||||
22
ccw-litellm/src/ccw_litellm/config/__init__.py
Normal file
@@ -0,0 +1,22 @@
|
||||
"""Configuration management for LiteLLM integration."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from .loader import get_config, load_config, reset_config
|
||||
from .models import (
|
||||
EmbeddingModelConfig,
|
||||
LiteLLMConfig,
|
||||
LLMModelConfig,
|
||||
ProviderConfig,
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
"LiteLLMConfig",
|
||||
"ProviderConfig",
|
||||
"LLMModelConfig",
|
||||
"EmbeddingModelConfig",
|
||||
"load_config",
|
||||
"get_config",
|
||||
"reset_config",
|
||||
]
|
||||
|
||||
150
ccw-litellm/src/ccw_litellm/config/loader.py
Normal file
@@ -0,0 +1,150 @@
|
||||
"""Configuration loader with environment variable substitution."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import re
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import yaml
|
||||
|
||||
from .models import LiteLLMConfig
|
||||
|
||||
# Default configuration path
|
||||
DEFAULT_CONFIG_PATH = Path.home() / ".ccw" / "config" / "litellm-config.yaml"
|
||||
|
||||
# Global configuration singleton
|
||||
_config_instance: LiteLLMConfig | None = None
|
||||
|
||||
|
||||
def _substitute_env_vars(value: Any) -> Any:
|
||||
"""Recursively substitute environment variables in configuration values.
|
||||
|
||||
Supports ${ENV_VAR} and ${ENV_VAR:-default} syntax.
|
||||
|
||||
Args:
|
||||
value: Configuration value (str, dict, list, or primitive)
|
||||
|
||||
Returns:
|
||||
Value with environment variables substituted
|
||||
"""
|
||||
if isinstance(value, str):
|
||||
# Pattern: ${VAR} or ${VAR:-default}
|
||||
pattern = r"\$\{([^:}]+)(?::-(.*?))?\}"
|
||||
|
||||
def replace_var(match: re.Match) -> str:
|
||||
var_name = match.group(1)
|
||||
default_value = match.group(2) if match.group(2) is not None else ""
|
||||
return os.environ.get(var_name, default_value)
|
||||
|
||||
return re.sub(pattern, replace_var, value)
|
||||
|
||||
if isinstance(value, dict):
|
||||
return {k: _substitute_env_vars(v) for k, v in value.items()}
|
||||
|
||||
if isinstance(value, list):
|
||||
return [_substitute_env_vars(item) for item in value]
|
||||
|
||||
return value
|
||||
|
||||
|
||||
def _get_default_config() -> dict[str, Any]:
|
||||
"""Get default configuration when no config file exists.
|
||||
|
||||
Returns:
|
||||
Default configuration dictionary
|
||||
"""
|
||||
return {
|
||||
"version": 1,
|
||||
"default_provider": "openai",
|
||||
"providers": {
|
||||
"openai": {
|
||||
"api_key": "${OPENAI_API_KEY}",
|
||||
"api_base": "https://api.openai.com/v1",
|
||||
},
|
||||
},
|
||||
"llm_models": {
|
||||
"default": {
|
||||
"provider": "openai",
|
||||
"model": "gpt-4",
|
||||
},
|
||||
"fast": {
|
||||
"provider": "openai",
|
||||
"model": "gpt-3.5-turbo",
|
||||
},
|
||||
},
|
||||
"embedding_models": {
|
||||
"default": {
|
||||
"provider": "openai",
|
||||
"model": "text-embedding-3-small",
|
||||
"dimensions": 1536,
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def load_config(config_path: Path | str | None = None) -> LiteLLMConfig:
|
||||
"""Load LiteLLM configuration from YAML file.
|
||||
|
||||
Args:
|
||||
config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)
|
||||
|
||||
Returns:
|
||||
Parsed and validated configuration
|
||||
|
||||
Raises:
|
||||
FileNotFoundError: If config file not found and no default available
|
||||
ValueError: If configuration is invalid
|
||||
"""
|
||||
if config_path is None:
|
||||
config_path = DEFAULT_CONFIG_PATH
|
||||
else:
|
||||
config_path = Path(config_path)
|
||||
|
||||
# Load configuration
|
||||
if config_path.exists():
|
||||
try:
|
||||
with open(config_path, "r", encoding="utf-8") as f:
|
||||
raw_config = yaml.safe_load(f)
|
||||
except Exception as e:
|
||||
raise ValueError(f"Failed to load configuration from {config_path}: {e}") from e
|
||||
else:
|
||||
# Use default configuration
|
||||
raw_config = _get_default_config()
|
||||
|
||||
# Substitute environment variables
|
||||
config_data = _substitute_env_vars(raw_config)
|
||||
|
||||
# Validate and parse with Pydantic
|
||||
try:
|
||||
return LiteLLMConfig.model_validate(config_data)
|
||||
except Exception as e:
|
||||
raise ValueError(f"Invalid configuration: {e}") from e
|
||||
|
||||
|
||||
def get_config(config_path: Path | str | None = None, reload: bool = False) -> LiteLLMConfig:
|
||||
"""Get global configuration singleton.
|
||||
|
||||
Args:
|
||||
config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)
|
||||
reload: Force reload configuration from disk
|
||||
|
||||
Returns:
|
||||
Global configuration instance
|
||||
"""
|
||||
global _config_instance
|
||||
|
||||
if _config_instance is None or reload:
|
||||
_config_instance = load_config(config_path)
|
||||
|
||||
return _config_instance
|
||||
|
||||
|
||||
def reset_config() -> None:
|
||||
"""Reset global configuration singleton.
|
||||
|
||||
Useful for testing.
|
||||
"""
|
||||
global _config_instance
|
||||
_config_instance = None
|
||||
130
ccw-litellm/src/ccw_litellm/config/models.py
Normal file
@@ -0,0 +1,130 @@
|
||||
"""Pydantic configuration models for LiteLLM integration."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
|
||||
class ProviderConfig(BaseModel):
|
||||
"""Provider API configuration.
|
||||
|
||||
Supports environment variable substitution in the format ${ENV_VAR}.
|
||||
"""
|
||||
|
||||
api_key: str | None = None
|
||||
api_base: str | None = None
|
||||
|
||||
model_config = {"extra": "allow"}
|
||||
|
||||
|
||||
class LLMModelConfig(BaseModel):
|
||||
"""LLM model configuration."""
|
||||
|
||||
provider: str
|
||||
model: str
|
||||
|
||||
model_config = {"extra": "allow"}
|
||||
|
||||
|
||||
class EmbeddingModelConfig(BaseModel):
|
||||
"""Embedding model configuration."""
|
||||
|
||||
provider: str # "openai", "fastembed", "ollama", etc.
|
||||
model: str
|
||||
dimensions: int
|
||||
|
||||
model_config = {"extra": "allow"}
|
||||
|
||||
|
||||
class LiteLLMConfig(BaseModel):
|
||||
"""Root configuration for LiteLLM integration.
|
||||
|
||||
Example YAML:
|
||||
version: 1
|
||||
default_provider: openai
|
||||
providers:
|
||||
openai:
|
||||
api_key: ${OPENAI_API_KEY}
|
||||
api_base: https://api.openai.com/v1
|
||||
anthropic:
|
||||
api_key: ${ANTHROPIC_API_KEY}
|
||||
llm_models:
|
||||
default:
|
||||
provider: openai
|
||||
model: gpt-4
|
||||
fast:
|
||||
provider: openai
|
||||
model: gpt-3.5-turbo
|
||||
embedding_models:
|
||||
default:
|
||||
provider: openai
|
||||
model: text-embedding-3-small
|
||||
dimensions: 1536
|
||||
"""
|
||||
|
||||
version: int = 1
|
||||
default_provider: str = "openai"
|
||||
providers: dict[str, ProviderConfig] = Field(default_factory=dict)
|
||||
llm_models: dict[str, LLMModelConfig] = Field(default_factory=dict)
|
||||
embedding_models: dict[str, EmbeddingModelConfig] = Field(default_factory=dict)
|
||||
|
||||
model_config = {"extra": "allow"}
|
||||
|
||||
def get_llm_model(self, model: str = "default") -> LLMModelConfig:
|
||||
"""Get LLM model configuration by name.
|
||||
|
||||
Args:
|
||||
model: Model name or "default"
|
||||
|
||||
Returns:
|
||||
LLM model configuration
|
||||
|
||||
Raises:
|
||||
ValueError: If model not found
|
||||
"""
|
||||
if model not in self.llm_models:
|
||||
raise ValueError(
|
||||
f"LLM model '{model}' not found in configuration. "
|
||||
f"Available models: {list(self.llm_models.keys())}"
|
||||
)
|
||||
return self.llm_models[model]
|
||||
|
||||
def get_embedding_model(self, model: str = "default") -> EmbeddingModelConfig:
|
||||
"""Get embedding model configuration by name.
|
||||
|
||||
Args:
|
||||
model: Model name or "default"
|
||||
|
||||
Returns:
|
||||
Embedding model configuration
|
||||
|
||||
Raises:
|
||||
ValueError: If model not found
|
||||
"""
|
||||
if model not in self.embedding_models:
|
||||
raise ValueError(
|
||||
f"Embedding model '{model}' not found in configuration. "
|
||||
f"Available models: {list(self.embedding_models.keys())}"
|
||||
)
|
||||
return self.embedding_models[model]
|
||||
|
||||
def get_provider(self, provider: str) -> ProviderConfig:
|
||||
"""Get provider configuration by name.
|
||||
|
||||
Args:
|
||||
provider: Provider name
|
||||
|
||||
Returns:
|
||||
Provider configuration
|
||||
|
||||
Raises:
|
||||
ValueError: If provider not found
|
||||
"""
|
||||
if provider not in self.providers:
|
||||
raise ValueError(
|
||||
f"Provider '{provider}' not found in configuration. "
|
||||
f"Available providers: {list(self.providers.keys())}"
|
||||
)
|
||||
return self.providers[provider]
|
||||
14
ccw-litellm/src/ccw_litellm/interfaces/__init__.py
Normal file
@@ -0,0 +1,14 @@
|
||||
"""Abstract interfaces for ccw-litellm."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from .embedder import AbstractEmbedder
|
||||
from .llm import AbstractLLMClient, ChatMessage, LLMResponse
|
||||
|
||||
__all__ = [
|
||||
"AbstractEmbedder",
|
||||
"AbstractLLMClient",
|
||||
"ChatMessage",
|
||||
"LLMResponse",
|
||||
]
|
||||
|
||||
52
ccw-litellm/src/ccw_litellm/interfaces/embedder.py
Normal file
@@ -0,0 +1,52 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Any, Sequence
|
||||
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
|
||||
class AbstractEmbedder(ABC):
|
||||
"""Embedding interface compatible with fastembed-style embedders.
|
||||
|
||||
Implementers only need to provide the synchronous `embed` method; an
|
||||
asynchronous `aembed` wrapper is provided for convenience.
|
||||
"""
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def dimensions(self) -> int:
|
||||
"""Embedding vector size."""
|
||||
|
||||
@abstractmethod
|
||||
def embed(
|
||||
self,
|
||||
texts: str | Sequence[str],
|
||||
*,
|
||||
batch_size: int | None = None,
|
||||
**kwargs: Any,
|
||||
) -> NDArray[np.floating]:
|
||||
"""Embed one or more texts.
|
||||
|
||||
Returns:
|
||||
A numpy array of shape (n_texts, dimensions).
|
||||
"""
|
||||
|
||||
async def aembed(
|
||||
self,
|
||||
texts: str | Sequence[str],
|
||||
*,
|
||||
batch_size: int | None = None,
|
||||
**kwargs: Any,
|
||||
) -> NDArray[np.floating]:
|
||||
"""Async wrapper around `embed` using a worker thread by default."""
|
||||
|
||||
return await asyncio.to_thread(
|
||||
self.embed,
|
||||
texts,
|
||||
batch_size=batch_size,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
45
ccw-litellm/src/ccw_litellm/interfaces/llm.py
Normal file
@@ -0,0 +1,45 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, Literal, Sequence
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class ChatMessage:
|
||||
role: Literal["system", "user", "assistant", "tool"]
|
||||
content: str
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class LLMResponse:
|
||||
content: str
|
||||
raw: Any | None = None
|
||||
|
||||
|
||||
class AbstractLLMClient(ABC):
|
||||
"""LiteLLM-like client interface.
|
||||
|
||||
Implementers only need to provide synchronous methods; async wrappers are
|
||||
provided via `asyncio.to_thread`.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
|
||||
"""Chat completion for a sequence of messages."""
|
||||
|
||||
@abstractmethod
|
||||
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
|
||||
"""Text completion for a prompt."""
|
||||
|
||||
async def achat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
|
||||
"""Async wrapper around `chat` using a worker thread by default."""
|
||||
|
||||
return await asyncio.to_thread(self.chat, messages, **kwargs)
|
||||
|
||||
async def acomplete(self, prompt: str, **kwargs: Any) -> LLMResponse:
|
||||
"""Async wrapper around `complete` using a worker thread by default."""
|
||||
|
||||
return await asyncio.to_thread(self.complete, prompt, **kwargs)
|
||||
|
||||
11
ccw-litellm/tests/conftest.py
Normal file
@@ -0,0 +1,11 @@
from __future__ import annotations

import sys
from pathlib import Path


def pytest_configure() -> None:
    project_root = Path(__file__).resolve().parents[1]
    src_dir = project_root / "src"
    sys.path.insert(0, str(src_dir))
64
ccw-litellm/tests/test_interfaces.py
Normal file
@@ -0,0 +1,64 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
from typing import Any, Sequence
|
||||
|
||||
import numpy as np
|
||||
|
||||
from ccw_litellm.interfaces import AbstractEmbedder, AbstractLLMClient, ChatMessage, LLMResponse
|
||||
|
||||
|
||||
class _DummyEmbedder(AbstractEmbedder):
|
||||
@property
|
||||
def dimensions(self) -> int:
|
||||
return 3
|
||||
|
||||
def embed(
|
||||
self,
|
||||
texts: str | Sequence[str],
|
||||
*,
|
||||
batch_size: int | None = None,
|
||||
**kwargs: Any,
|
||||
) -> np.ndarray:
|
||||
if isinstance(texts, str):
|
||||
texts = [texts]
|
||||
_ = batch_size
|
||||
_ = kwargs
|
||||
return np.zeros((len(texts), self.dimensions), dtype=np.float32)
|
||||
|
||||
|
||||
class _DummyLLM(AbstractLLMClient):
|
||||
def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
|
||||
_ = kwargs
|
||||
return LLMResponse(content="".join(m.content for m in messages))
|
||||
|
||||
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
|
||||
_ = kwargs
|
||||
return LLMResponse(content=prompt)
|
||||
|
||||
|
||||
def test_embed_sync_shape_and_dtype() -> None:
|
||||
emb = _DummyEmbedder()
|
||||
out = emb.embed(["a", "b"])
|
||||
assert out.shape == (2, 3)
|
||||
assert out.dtype == np.float32
|
||||
|
||||
|
||||
def test_embed_async_wrapper() -> None:
|
||||
emb = _DummyEmbedder()
|
||||
out = asyncio.run(emb.aembed("x"))
|
||||
assert out.shape == (1, 3)
|
||||
|
||||
|
||||
def test_llm_sync() -> None:
|
||||
llm = _DummyLLM()
|
||||
out = llm.chat([ChatMessage(role="user", content="hi")])
|
||||
assert out == LLMResponse(content="hi")
|
||||
|
||||
|
||||
def test_llm_async_wrappers() -> None:
|
||||
llm = _DummyLLM()
|
||||
out1 = asyncio.run(llm.achat([ChatMessage(role="user", content="a")]))
|
||||
out2 = asyncio.run(llm.acomplete("b"))
|
||||
assert out1.content == "a"
|
||||
assert out2.content == "b"
|
||||
360
ccw/src/config/litellm-api-config-manager.ts
Normal file
@@ -0,0 +1,360 @@
|
||||
/**
|
||||
* LiteLLM API Configuration Manager
|
||||
* Manages provider credentials, custom endpoints, and cache settings
|
||||
*/
|
||||
|
||||
import { existsSync, readFileSync, writeFileSync } from 'fs';
|
||||
import { join } from 'path';
|
||||
import { StoragePaths, ensureStorageDir } from './storage-paths.js';
|
||||
import type {
|
||||
LiteLLMApiConfig,
|
||||
ProviderCredential,
|
||||
CustomEndpoint,
|
||||
GlobalCacheSettings,
|
||||
ProviderType,
|
||||
CacheStrategy,
|
||||
} from '../types/litellm-api-config.js';
|
||||
|
||||
/**
|
||||
* Default configuration
|
||||
*/
|
||||
function getDefaultConfig(): LiteLLMApiConfig {
|
||||
return {
|
||||
version: 1,
|
||||
providers: [],
|
||||
endpoints: [],
|
||||
globalCacheSettings: {
|
||||
enabled: true,
|
||||
cacheDir: '~/.ccw/cache/context',
|
||||
maxTotalSizeMB: 100,
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Get config file path for a project
|
||||
*/
|
||||
function getConfigPath(baseDir: string): string {
|
||||
const paths = StoragePaths.project(baseDir);
|
||||
ensureStorageDir(paths.config);
|
||||
return join(paths.config, 'litellm-api-config.json');
|
||||
}
|
||||
|
||||
/**
|
||||
* Load configuration from file
|
||||
*/
|
||||
export function loadLiteLLMApiConfig(baseDir: string): LiteLLMApiConfig {
|
||||
const configPath = getConfigPath(baseDir);
|
||||
|
||||
if (!existsSync(configPath)) {
|
||||
return getDefaultConfig();
|
||||
}
|
||||
|
||||
try {
|
||||
const content = readFileSync(configPath, 'utf-8');
|
||||
return JSON.parse(content) as LiteLLMApiConfig;
|
||||
} catch (error) {
|
||||
console.error('[LiteLLM Config] Failed to load config:', error);
|
||||
return getDefaultConfig();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Save configuration to file
|
||||
*/
|
||||
function saveConfig(baseDir: string, config: LiteLLMApiConfig): void {
|
||||
const configPath = getConfigPath(baseDir);
|
||||
writeFileSync(configPath, JSON.stringify(config, null, 2), 'utf-8');
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve environment variables in API key
|
||||
* Supports ${ENV_VAR} syntax
|
||||
*/
|
||||
export function resolveEnvVar(value: string): string {
|
||||
if (!value) return value;
|
||||
|
||||
const envVarMatch = value.match(/^\$\{(.+)\}$/);
|
||||
if (envVarMatch) {
|
||||
const envVarName = envVarMatch[1];
|
||||
return process.env[envVarName] || '';
|
||||
}
|
||||
|
||||
return value;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Provider Management
|
||||
// ===========================
|
||||
|
||||
/**
|
||||
* Get all providers
|
||||
*/
|
||||
export function getAllProviders(baseDir: string): ProviderCredential[] {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.providers;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get provider by ID
|
||||
*/
|
||||
export function getProvider(baseDir: string, providerId: string): ProviderCredential | null {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.providers.find((p) => p.id === providerId) || null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get provider with resolved environment variables
|
||||
*/
|
||||
export function getProviderWithResolvedEnvVars(
|
||||
baseDir: string,
|
||||
providerId: string
|
||||
): (ProviderCredential & { resolvedApiKey: string }) | null {
|
||||
const provider = getProvider(baseDir, providerId);
|
||||
if (!provider) return null;
|
||||
|
||||
return {
|
||||
...provider,
|
||||
resolvedApiKey: resolveEnvVar(provider.apiKey),
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Add new provider
|
||||
*/
|
||||
export function addProvider(
|
||||
baseDir: string,
|
||||
providerData: Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>
|
||||
): ProviderCredential {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
|
||||
const provider: ProviderCredential = {
|
||||
...providerData,
|
||||
id: `${providerData.type}-${Date.now()}`,
|
||||
createdAt: new Date().toISOString(),
|
||||
updatedAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
config.providers.push(provider);
|
||||
saveConfig(baseDir, config);
|
||||
|
||||
return provider;
|
||||
}
|
||||
|
||||
/**
|
||||
* Update provider
|
||||
*/
|
||||
export function updateProvider(
|
||||
baseDir: string,
|
||||
providerId: string,
|
||||
updates: Partial<Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>>
|
||||
): ProviderCredential {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
const providerIndex = config.providers.findIndex((p) => p.id === providerId);
|
||||
|
||||
if (providerIndex === -1) {
|
||||
throw new Error(`Provider not found: ${providerId}`);
|
||||
}
|
||||
|
||||
config.providers[providerIndex] = {
|
||||
...config.providers[providerIndex],
|
||||
...updates,
|
||||
updatedAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
return config.providers[providerIndex];
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete provider
|
||||
*/
|
||||
export function deleteProvider(baseDir: string, providerId: string): boolean {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
const initialLength = config.providers.length;
|
||||
|
||||
config.providers = config.providers.filter((p) => p.id !== providerId);
|
||||
|
||||
if (config.providers.length === initialLength) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Also remove endpoints using this provider
|
||||
config.endpoints = config.endpoints.filter((e) => e.providerId !== providerId);
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Endpoint Management
|
||||
// ===========================
|
||||
|
||||
/**
|
||||
* Get all endpoints
|
||||
*/
|
||||
export function getAllEndpoints(baseDir: string): CustomEndpoint[] {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.endpoints;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get endpoint by ID
|
||||
*/
|
||||
export function getEndpoint(baseDir: string, endpointId: string): CustomEndpoint | null {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.endpoints.find((e) => e.id === endpointId) || null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Find endpoint by ID (alias for getEndpoint)
|
||||
*/
|
||||
export function findEndpointById(baseDir: string, endpointId: string): CustomEndpoint | null {
|
||||
return getEndpoint(baseDir, endpointId);
|
||||
}
|
||||
|
||||
/**
|
||||
* Add new endpoint
|
||||
*/
|
||||
export function addEndpoint(
|
||||
baseDir: string,
|
||||
endpointData: Omit<CustomEndpoint, 'createdAt' | 'updatedAt'>
|
||||
): CustomEndpoint {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
|
||||
// Check if ID already exists
|
||||
if (config.endpoints.some((e) => e.id === endpointData.id)) {
|
||||
throw new Error(`Endpoint ID already exists: ${endpointData.id}`);
|
||||
}
|
||||
|
||||
// Verify provider exists
|
||||
if (!config.providers.find((p) => p.id === endpointData.providerId)) {
|
||||
throw new Error(`Provider not found: ${endpointData.providerId}`);
|
||||
}
|
||||
|
||||
const endpoint: CustomEndpoint = {
|
||||
...endpointData,
|
||||
createdAt: new Date().toISOString(),
|
||||
updatedAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
config.endpoints.push(endpoint);
|
||||
saveConfig(baseDir, config);
|
||||
|
||||
return endpoint;
|
||||
}
|
||||
|
||||
/**
|
||||
* Update endpoint
|
||||
*/
|
||||
export function updateEndpoint(
|
||||
baseDir: string,
|
||||
endpointId: string,
|
||||
updates: Partial<Omit<CustomEndpoint, 'id' | 'createdAt' | 'updatedAt'>>
|
||||
): CustomEndpoint {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
const endpointIndex = config.endpoints.findIndex((e) => e.id === endpointId);
|
||||
|
||||
if (endpointIndex === -1) {
|
||||
throw new Error(`Endpoint not found: ${endpointId}`);
|
||||
}
|
||||
|
||||
// Verify provider exists if updating providerId
|
||||
if (updates.providerId && !config.providers.find((p) => p.id === updates.providerId)) {
|
||||
throw new Error(`Provider not found: ${updates.providerId}`);
|
||||
}
|
||||
|
||||
config.endpoints[endpointIndex] = {
|
||||
...config.endpoints[endpointIndex],
|
||||
...updates,
|
||||
updatedAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
return config.endpoints[endpointIndex];
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete endpoint
|
||||
*/
|
||||
export function deleteEndpoint(baseDir: string, endpointId: string): boolean {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
const initialLength = config.endpoints.length;
|
||||
|
||||
config.endpoints = config.endpoints.filter((e) => e.id !== endpointId);
|
||||
|
||||
if (config.endpoints.length === initialLength) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Clear default endpoint if deleted
|
||||
if (config.defaultEndpoint === endpointId) {
|
||||
delete config.defaultEndpoint;
|
||||
}
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Default Endpoint Management
|
||||
// ===========================
|
||||
|
||||
/**
|
||||
* Get default endpoint
|
||||
*/
|
||||
export function getDefaultEndpoint(baseDir: string): string | undefined {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.defaultEndpoint;
|
||||
}
|
||||
|
||||
/**
|
||||
* Set default endpoint
|
||||
*/
|
||||
export function setDefaultEndpoint(baseDir: string, endpointId?: string): void {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
|
||||
if (endpointId) {
|
||||
// Verify endpoint exists
|
||||
if (!config.endpoints.find((e) => e.id === endpointId)) {
|
||||
throw new Error(`Endpoint not found: ${endpointId}`);
|
||||
}
|
||||
config.defaultEndpoint = endpointId;
|
||||
} else {
|
||||
delete config.defaultEndpoint;
|
||||
}
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Cache Settings Management
|
||||
// ===========================
|
||||
|
||||
/**
|
||||
* Get global cache settings
|
||||
*/
|
||||
export function getGlobalCacheSettings(baseDir: string): GlobalCacheSettings {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
return config.globalCacheSettings;
|
||||
}
|
||||
|
||||
/**
|
||||
* Update global cache settings
|
||||
*/
|
||||
export function updateGlobalCacheSettings(
|
||||
baseDir: string,
|
||||
settings: Partial<GlobalCacheSettings>
|
||||
): void {
|
||||
const config = loadLiteLLMApiConfig(baseDir);
|
||||
|
||||
config.globalCacheSettings = {
|
||||
...config.globalCacheSettings,
|
||||
...settings,
|
||||
};
|
||||
|
||||
saveConfig(baseDir, config);
|
||||
}
|
||||
|
||||
// Re-export types
|
||||
export type { ProviderCredential, CustomEndpoint, ProviderType, CacheStrategy };
|
||||
259
ccw/src/config/provider-models.ts
Normal file
@@ -0,0 +1,259 @@
|
||||
/**
|
||||
* Provider Model Presets
|
||||
*
|
||||
* Predefined model information for each supported LLM provider.
|
||||
* Used for UI dropdowns and validation.
|
||||
*/
|
||||
|
||||
import type { ProviderType } from '../types/litellm-api-config.js';
|
||||
|
||||
/**
|
||||
* Model information metadata
|
||||
*/
|
||||
export interface ModelInfo {
|
||||
/** Model identifier (used in API calls) */
|
||||
id: string;
|
||||
|
||||
/** Human-readable display name */
|
||||
name: string;
|
||||
|
||||
/** Context window size in tokens */
|
||||
contextWindow: number;
|
||||
|
||||
/** Whether this model supports prompt caching */
|
||||
supportsCaching: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Predefined models for each provider
|
||||
* Used for UI selection and validation
|
||||
*/
|
||||
export const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
|
||||
openai: [
|
||||
{
|
||||
id: 'gpt-4o',
|
||||
      name: 'GPT-4o',
      contextWindow: 128000,
      supportsCaching: true
    },
    { id: 'gpt-4o-mini', name: 'GPT-4o Mini', contextWindow: 128000, supportsCaching: true },
    { id: 'o1', name: 'O1', contextWindow: 200000, supportsCaching: true },
    { id: 'o1-mini', name: 'O1 Mini', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4-turbo', name: 'GPT-4 Turbo', contextWindow: 128000, supportsCaching: false }
  ],

  anthropic: [
    { id: 'claude-sonnet-4-20250514', name: 'Claude Sonnet 4', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-5-sonnet-20241022', name: 'Claude 3.5 Sonnet', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-5-haiku-20241022', name: 'Claude 3.5 Haiku', contextWindow: 200000, supportsCaching: true },
    { id: 'claude-3-opus-20240229', name: 'Claude 3 Opus', contextWindow: 200000, supportsCaching: false }
  ],

  ollama: [
    { id: 'llama3.2', name: 'Llama 3.2', contextWindow: 128000, supportsCaching: false },
    { id: 'llama3.1', name: 'Llama 3.1', contextWindow: 128000, supportsCaching: false },
    { id: 'qwen2.5-coder', name: 'Qwen 2.5 Coder', contextWindow: 32000, supportsCaching: false },
    { id: 'codellama', name: 'Code Llama', contextWindow: 16000, supportsCaching: false },
    { id: 'mistral', name: 'Mistral', contextWindow: 32000, supportsCaching: false }
  ],

  azure: [
    { id: 'gpt-4o', name: 'GPT-4o (Azure)', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4o-mini', name: 'GPT-4o Mini (Azure)', contextWindow: 128000, supportsCaching: true },
    { id: 'gpt-4-turbo', name: 'GPT-4 Turbo (Azure)', contextWindow: 128000, supportsCaching: false },
    { id: 'gpt-35-turbo', name: 'GPT-3.5 Turbo (Azure)', contextWindow: 16000, supportsCaching: false }
  ],

  google: [
    { id: 'gemini-2.0-flash-exp', name: 'Gemini 2.0 Flash Experimental', contextWindow: 1048576, supportsCaching: true },
    { id: 'gemini-1.5-pro', name: 'Gemini 1.5 Pro', contextWindow: 2097152, supportsCaching: true },
    { id: 'gemini-1.5-flash', name: 'Gemini 1.5 Flash', contextWindow: 1048576, supportsCaching: true },
    { id: 'gemini-1.0-pro', name: 'Gemini 1.0 Pro', contextWindow: 32000, supportsCaching: false }
  ],

  mistral: [
    { id: 'mistral-large-latest', name: 'Mistral Large', contextWindow: 128000, supportsCaching: false },
    { id: 'mistral-medium-latest', name: 'Mistral Medium', contextWindow: 32000, supportsCaching: false },
    { id: 'mistral-small-latest', name: 'Mistral Small', contextWindow: 32000, supportsCaching: false },
    { id: 'codestral-latest', name: 'Codestral', contextWindow: 32000, supportsCaching: false }
  ],

  deepseek: [
    { id: 'deepseek-chat', name: 'DeepSeek Chat', contextWindow: 64000, supportsCaching: false },
    { id: 'deepseek-coder', name: 'DeepSeek Coder', contextWindow: 64000, supportsCaching: false }
  ],

  custom: [
    { id: 'custom-model', name: 'Custom Model', contextWindow: 128000, supportsCaching: false }
  ]
};

/**
 * Get models for a specific provider
 * @param providerType - Provider type to get models for
 * @returns Array of model information
 */
export function getModelsForProvider(providerType: ProviderType): ModelInfo[] {
  return PROVIDER_MODELS[providerType] || [];
}

/**
 * Get model information by ID within a provider
 * @param providerType - Provider type
 * @param modelId - Model identifier
 * @returns Model information or undefined if not found
 */
export function getModelInfo(providerType: ProviderType, modelId: string): ModelInfo | undefined {
  const models = PROVIDER_MODELS[providerType] || [];
  return models.find(m => m.id === modelId);
}

/**
 * Validate if a model ID is supported by a provider
 * @param providerType - Provider type
 * @param modelId - Model identifier to validate
 * @returns true if model is valid for provider
 */
export function isValidModel(providerType: ProviderType, modelId: string): boolean {
  return getModelInfo(providerType, modelId) !== undefined;
}
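The registry above is what the dashboard's model dropdown and the CLI's `--model` validation can draw from. As a rough usage sketch (not part of this commit; the import path and the surrounding wiring are assumptions), validating a user-selected model before creating a custom endpoint might look like this:

```typescript
// Hypothetical consumer of the provider model registry shown above.
// The import path is an assumption; adjust to wherever PROVIDER_MODELS lives.
import { getModelsForProvider, isValidModel } from './provider-models.js';

function describeSelection(providerType: 'openai' | 'anthropic', modelId: string): string {
  // Reject IDs that are not in the predefined registry for this provider.
  if (!isValidModel(providerType, modelId)) {
    const known = getModelsForProvider(providerType).map(m => m.id).join(', ');
    throw new Error(`Unknown model "${modelId}" for ${providerType}. Known models: ${known}`);
  }
  const info = getModelsForProvider(providerType).find(m => m.id === modelId)!;
  return `${info.name} (${info.contextWindow} tokens, caching: ${info.supportsCaching})`;
}

// Example: describeSelection('openai', 'gpt-4o') -> "GPT-4o (128000 tokens, caching: true)"
```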
@@ -46,7 +46,8 @@ const MODULE_CSS_FILES = [
   '27-graph-explorer.css',
   '28-mcp-manager.css',
   '29-help.css',
-  '30-core-memory.css'
+  '30-core-memory.css',
+  '31-api-settings.css'
 ];

 const MODULE_FILES = [

@@ -95,6 +96,7 @@ const MODULE_FILES = [
   'views/skills-manager.js',
   'views/rules-manager.js',
   'views/claude-manager.js',
+  'views/api-settings.js',
   'views/help.js',
   'main.js'
 ];

485
ccw/src/core/routes/litellm-api-routes.ts
Normal file
@@ -0,0 +1,485 @@
|
||||
// @ts-nocheck
|
||||
/**
|
||||
* LiteLLM API Routes Module
|
||||
* Handles LiteLLM provider management, endpoint configuration, and cache management
|
||||
*/
|
||||
import type { IncomingMessage, ServerResponse } from 'http';
|
||||
import {
|
||||
getAllProviders,
|
||||
getProvider,
|
||||
addProvider,
|
||||
updateProvider,
|
||||
deleteProvider,
|
||||
getAllEndpoints,
|
||||
getEndpoint,
|
||||
addEndpoint,
|
||||
updateEndpoint,
|
||||
deleteEndpoint,
|
||||
getDefaultEndpoint,
|
||||
setDefaultEndpoint,
|
||||
getGlobalCacheSettings,
|
||||
updateGlobalCacheSettings,
|
||||
loadLiteLLMApiConfig,
|
||||
type ProviderCredential,
|
||||
type CustomEndpoint,
|
||||
type ProviderType,
|
||||
} from '../../config/litellm-api-config-manager.js';
|
||||
import { getContextCacheStore } from '../../tools/context-cache-store.js';
|
||||
import { getLiteLLMClient } from '../../tools/litellm-client.js';
|
||||
|
||||
export interface RouteContext {
|
||||
pathname: string;
|
||||
url: URL;
|
||||
req: IncomingMessage;
|
||||
res: ServerResponse;
|
||||
initialPath: string;
|
||||
handlePostRequest: (req: IncomingMessage, res: ServerResponse, handler: (body: unknown) => Promise<any>) => void;
|
||||
broadcastToClients: (data: unknown) => void;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Model Information
|
||||
// ===========================
|
||||
|
||||
interface ModelInfo {
|
||||
id: string;
|
||||
name: string;
|
||||
provider: ProviderType;
|
||||
description?: string;
|
||||
}
|
||||
|
||||
const PROVIDER_MODELS: Record<ProviderType, ModelInfo[]> = {
|
||||
openai: [
|
||||
{ id: 'gpt-4-turbo', name: 'GPT-4 Turbo', provider: 'openai', description: '128K context' },
|
||||
{ id: 'gpt-4', name: 'GPT-4', provider: 'openai', description: '8K context' },
|
||||
{ id: 'gpt-3.5-turbo', name: 'GPT-3.5 Turbo', provider: 'openai', description: '16K context' },
|
||||
],
|
||||
anthropic: [
|
||||
{ id: 'claude-3-opus-20240229', name: 'Claude 3 Opus', provider: 'anthropic', description: '200K context' },
|
||||
{ id: 'claude-3-sonnet-20240229', name: 'Claude 3 Sonnet', provider: 'anthropic', description: '200K context' },
|
||||
{ id: 'claude-3-haiku-20240307', name: 'Claude 3 Haiku', provider: 'anthropic', description: '200K context' },
|
||||
],
|
||||
google: [
|
||||
{ id: 'gemini-pro', name: 'Gemini Pro', provider: 'google', description: '32K context' },
|
||||
{ id: 'gemini-pro-vision', name: 'Gemini Pro Vision', provider: 'google', description: '16K context' },
|
||||
],
|
||||
ollama: [
|
||||
{ id: 'llama2', name: 'Llama 2', provider: 'ollama', description: 'Local model' },
|
||||
{ id: 'mistral', name: 'Mistral', provider: 'ollama', description: 'Local model' },
|
||||
],
|
||||
azure: [],
|
||||
mistral: [
|
||||
{ id: 'mistral-large-latest', name: 'Mistral Large', provider: 'mistral', description: '32K context' },
|
||||
{ id: 'mistral-medium-latest', name: 'Mistral Medium', provider: 'mistral', description: '32K context' },
|
||||
],
|
||||
deepseek: [
|
||||
{ id: 'deepseek-chat', name: 'DeepSeek Chat', provider: 'deepseek', description: '64K context' },
|
||||
{ id: 'deepseek-coder', name: 'DeepSeek Coder', provider: 'deepseek', description: '64K context' },
|
||||
],
|
||||
custom: [],
|
||||
};
|
||||
|
||||
/**
|
||||
* Handle LiteLLM API routes
|
||||
* @returns true if route was handled, false otherwise
|
||||
*/
|
||||
export async function handleLiteLLMApiRoutes(ctx: RouteContext): Promise<boolean> {
|
||||
const { pathname, url, req, res, initialPath, handlePostRequest, broadcastToClients } = ctx;
|
||||
|
||||
// ===========================
|
||||
// Provider Management Routes
|
||||
// ===========================
|
||||
|
||||
// GET /api/litellm-api/providers - List all providers
|
||||
if (pathname === '/api/litellm-api/providers' && req.method === 'GET') {
|
||||
try {
|
||||
const providers = getAllProviders(initialPath);
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ providers, count: providers.length }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// POST /api/litellm-api/providers - Create provider
|
||||
if (pathname === '/api/litellm-api/providers' && req.method === 'POST') {
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const providerData = body as Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>;
|
||||
|
||||
if (!providerData.name || !providerData.type || !providerData.apiKey) {
|
||||
return { error: 'Provider name, type, and apiKey are required', status: 400 };
|
||||
}
|
||||
|
||||
try {
|
||||
const provider = addProvider(initialPath, providerData);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_PROVIDER_CREATED',
|
||||
payload: { provider, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, provider };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 500 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
// GET /api/litellm-api/providers/:id - Get provider by ID
|
||||
const providerGetMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
|
||||
if (providerGetMatch && req.method === 'GET') {
|
||||
const providerId = providerGetMatch[1];
|
||||
|
||||
try {
|
||||
const provider = getProvider(initialPath, providerId);
|
||||
if (!provider) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: 'Provider not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify(provider));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// PUT /api/litellm-api/providers/:id - Update provider
|
||||
const providerUpdateMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
|
||||
if (providerUpdateMatch && req.method === 'PUT') {
|
||||
const providerId = providerUpdateMatch[1];
|
||||
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const updates = body as Partial<Omit<ProviderCredential, 'id' | 'createdAt' | 'updatedAt'>>;
|
||||
|
||||
try {
|
||||
const provider = updateProvider(initialPath, providerId, updates);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_PROVIDER_UPDATED',
|
||||
payload: { provider, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, provider };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 404 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
// DELETE /api/litellm-api/providers/:id - Delete provider
|
||||
const providerDeleteMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)$/);
|
||||
if (providerDeleteMatch && req.method === 'DELETE') {
|
||||
const providerId = providerDeleteMatch[1];
|
||||
|
||||
try {
|
||||
const success = deleteProvider(initialPath, providerId);
|
||||
|
||||
if (!success) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: 'Provider not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_PROVIDER_DELETED',
|
||||
payload: { providerId, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: true, message: 'Provider deleted' }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// POST /api/litellm-api/providers/:id/test - Test provider connection
|
||||
const providerTestMatch = pathname.match(/^\/api\/litellm-api\/providers\/([^/]+)\/test$/);
|
||||
if (providerTestMatch && req.method === 'POST') {
|
||||
const providerId = providerTestMatch[1];
|
||||
|
||||
try {
|
||||
const provider = getProvider(initialPath, providerId);
|
||||
|
||||
if (!provider) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: false, error: 'Provider not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
if (!provider.enabled) {
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: false, error: 'Provider is disabled' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
// Test connection using litellm client
|
||||
const client = getLiteLLMClient();
|
||||
const available = await client.isAvailable();
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: available, provider: provider.type }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: false, error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Endpoint Management Routes
|
||||
// ===========================
|
||||
|
||||
// GET /api/litellm-api/endpoints - List all endpoints
|
||||
if (pathname === '/api/litellm-api/endpoints' && req.method === 'GET') {
|
||||
try {
|
||||
const endpoints = getAllEndpoints(initialPath);
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ endpoints, count: endpoints.length }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// POST /api/litellm-api/endpoints - Create endpoint
|
||||
if (pathname === '/api/litellm-api/endpoints' && req.method === 'POST') {
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const endpointData = body as Omit<CustomEndpoint, 'createdAt' | 'updatedAt'>;
|
||||
|
||||
if (!endpointData.id || !endpointData.name || !endpointData.providerId || !endpointData.model) {
|
||||
return { error: 'Endpoint id, name, providerId, and model are required', status: 400 };
|
||||
}
|
||||
|
||||
try {
|
||||
const endpoint = addEndpoint(initialPath, endpointData);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_ENDPOINT_CREATED',
|
||||
payload: { endpoint, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, endpoint };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 500 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
// GET /api/litellm-api/endpoints/:id - Get endpoint by ID
|
||||
const endpointGetMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
|
||||
if (endpointGetMatch && req.method === 'GET') {
|
||||
const endpointId = endpointGetMatch[1];
|
||||
|
||||
try {
|
||||
const endpoint = getEndpoint(initialPath, endpointId);
|
||||
if (!endpoint) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: 'Endpoint not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify(endpoint));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// PUT /api/litellm-api/endpoints/:id - Update endpoint
|
||||
const endpointUpdateMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
|
||||
if (endpointUpdateMatch && req.method === 'PUT') {
|
||||
const endpointId = endpointUpdateMatch[1];
|
||||
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const updates = body as Partial<Omit<CustomEndpoint, 'id' | 'createdAt' | 'updatedAt'>>;
|
||||
|
||||
try {
|
||||
const endpoint = updateEndpoint(initialPath, endpointId, updates);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_ENDPOINT_UPDATED',
|
||||
payload: { endpoint, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, endpoint };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 404 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
// DELETE /api/litellm-api/endpoints/:id - Delete endpoint
|
||||
const endpointDeleteMatch = pathname.match(/^\/api\/litellm-api\/endpoints\/([^/]+)$/);
|
||||
if (endpointDeleteMatch && req.method === 'DELETE') {
|
||||
const endpointId = endpointDeleteMatch[1];
|
||||
|
||||
try {
|
||||
const success = deleteEndpoint(initialPath, endpointId);
|
||||
|
||||
if (!success) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: 'Endpoint not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_ENDPOINT_DELETED',
|
||||
payload: { endpointId, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: true, message: 'Endpoint deleted' }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Model Discovery Routes
|
||||
// ===========================
|
||||
|
||||
// GET /api/litellm-api/models/:providerType - Get available models for provider type
|
||||
const modelsMatch = pathname.match(/^\/api\/litellm-api\/models\/([^/]+)$/);
|
||||
if (modelsMatch && req.method === 'GET') {
|
||||
const providerType = modelsMatch[1] as ProviderType;
|
||||
|
||||
try {
|
||||
const models = PROVIDER_MODELS[providerType];
|
||||
|
||||
if (!models) {
|
||||
res.writeHead(404, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: 'Provider type not found' }));
|
||||
return true;
|
||||
}
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ providerType, models, count: models.length }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Cache Management Routes
|
||||
// ===========================
|
||||
|
||||
// GET /api/litellm-api/cache/stats - Get cache statistics
|
||||
if (pathname === '/api/litellm-api/cache/stats' && req.method === 'GET') {
|
||||
try {
|
||||
const cacheStore = getContextCacheStore();
|
||||
const stats = cacheStore.getStatus();
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify(stats));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// POST /api/litellm-api/cache/clear - Clear cache
|
||||
if (pathname === '/api/litellm-api/cache/clear' && req.method === 'POST') {
|
||||
try {
|
||||
const cacheStore = getContextCacheStore();
|
||||
const result = cacheStore.clear();
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_CACHE_CLEARED',
|
||||
payload: { removed: result.removed, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ success: true, removed: result.removed }));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// ===========================
|
||||
// Config Management Routes
|
||||
// ===========================
|
||||
|
||||
// GET /api/litellm-api/config - Get full config
|
||||
if (pathname === '/api/litellm-api/config' && req.method === 'GET') {
|
||||
try {
|
||||
const config = loadLiteLLMApiConfig(initialPath);
|
||||
|
||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify(config));
|
||||
} catch (err) {
|
||||
res.writeHead(500, { 'Content-Type': 'application/json' });
|
||||
res.end(JSON.stringify({ error: (err as Error).message }));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// PUT /api/litellm-api/config/cache - Update global cache settings
|
||||
if (pathname === '/api/litellm-api/config/cache' && req.method === 'PUT') {
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const settings = body as Partial<{ enabled: boolean; cacheDir: string; maxTotalSizeMB: number }>;
|
||||
|
||||
try {
|
||||
updateGlobalCacheSettings(initialPath, settings);
|
||||
|
||||
const updatedSettings = getGlobalCacheSettings(initialPath);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_CACHE_SETTINGS_UPDATED',
|
||||
payload: { settings: updatedSettings, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, settings: updatedSettings };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 500 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
// PUT /api/litellm-api/config/default-endpoint - Set default endpoint
|
||||
if (pathname === '/api/litellm-api/config/default-endpoint' && req.method === 'PUT') {
|
||||
handlePostRequest(req, res, async (body: unknown) => {
|
||||
const { endpointId } = body as { endpointId?: string };
|
||||
|
||||
try {
|
||||
setDefaultEndpoint(initialPath, endpointId);
|
||||
|
||||
const defaultEndpoint = getDefaultEndpoint(initialPath);
|
||||
|
||||
broadcastToClients({
|
||||
type: 'LITELLM_DEFAULT_ENDPOINT_UPDATED',
|
||||
payload: { endpointId, defaultEndpoint, timestamp: new Date().toISOString() }
|
||||
});
|
||||
|
||||
return { success: true, defaultEndpoint };
|
||||
} catch (err) {
|
||||
return { error: (err as Error).message, status: 500 };
|
||||
}
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
107
ccw/src/core/routes/litellm-routes.ts
Normal file
@@ -0,0 +1,107 @@
// @ts-nocheck
/**
 * LiteLLM Routes Module
 * Handles all LiteLLM-related API endpoints
 */
import type { IncomingMessage, ServerResponse } from 'http';
import { getLiteLLMClient, getLiteLLMStatus, checkLiteLLMAvailable } from '../../tools/litellm-client.js';

export interface RouteContext {
  pathname: string;
  url: URL;
  req: IncomingMessage;
  res: ServerResponse;
  initialPath: string;
  handlePostRequest: (req: IncomingMessage, res: ServerResponse, handler: (body: unknown) => Promise<any>) => void;
  broadcastToClients: (data: unknown) => void;
}

/**
 * Handle LiteLLM routes
 * @returns true if route was handled, false otherwise
 */
export async function handleLiteLLMRoutes(ctx: RouteContext): Promise<boolean> {
  const { pathname, url, req, res, initialPath, handlePostRequest } = ctx;

  // API: LiteLLM Status - Check availability and version
  if (pathname === '/api/litellm/status') {
    try {
      const status = await getLiteLLMStatus();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(status));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ available: false, error: err.message }));
    }
    return true;
  }

  // API: LiteLLM Config - Get configuration
  if (pathname === '/api/litellm/config' && req.method === 'GET') {
    try {
      const client = getLiteLLMClient();
      const config = await client.getConfig();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(config));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: err.message }));
    }
    return true;
  }

  // API: LiteLLM Embed - Generate embeddings
  if (pathname === '/api/litellm/embed' && req.method === 'POST') {
    handlePostRequest(req, res, async (body) => {
      const { texts, model = 'default' } = body;

      if (!texts || !Array.isArray(texts)) {
        return { error: 'texts array is required', status: 400 };
      }

      if (texts.length === 0) {
        return { error: 'texts array cannot be empty', status: 400 };
      }

      try {
        const client = getLiteLLMClient();
        const result = await client.embed(texts, model);
        return { success: true, ...result };
      } catch (err) {
        return { error: err.message, status: 500 };
      }
    });
    return true;
  }

  // API: LiteLLM Chat - Chat with LLM
  if (pathname === '/api/litellm/chat' && req.method === 'POST') {
    handlePostRequest(req, res, async (body) => {
      const { message, messages, model = 'default' } = body;

      // Support both single message and messages array
      if (!message && (!messages || !Array.isArray(messages))) {
        return { error: 'message or messages array is required', status: 400 };
      }

      try {
        const client = getLiteLLMClient();

        if (messages && Array.isArray(messages)) {
          // Multi-turn chat
          const result = await client.chatMessages(messages, model);
          return { success: true, ...result };
        } else {
          // Single message chat
          const content = await client.chat(message, model);
          return { success: true, content, model };
        }
      } catch (err) {
        return { error: err.message, status: 500 };
      }
    });
    return true;
  }

  return false;
}
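For reference, a dashboard script or test could exercise these endpoints over plain HTTP. This is a usage sketch, not part of the commit: the port is an assumption, and it assumes `handlePostRequest` serializes the handler's return value as the JSON response body; the request shapes come from the route handlers above.

```typescript
// Sketch of a client for the /api/litellm/* routes above (base URL and
// response handling are assumptions).
async function litellmSmokeTest(baseUrl = 'http://localhost:3456'): Promise<void> {
  // Single-message chat: the handler expects { message, model? }.
  const chatRes = await fetch(`${baseUrl}/api/litellm/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message: 'Summarize this repo in one sentence.', model: 'default' }),
  });
  const chat = await chatRes.json();
  console.log('chat content:', chat.content);

  // Embeddings: the handler expects a non-empty { texts: string[] }.
  const embedRes = await fetch(`${baseUrl}/api/litellm/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ texts: ['hello world', 'ccw dashboard'], model: 'default' }),
  });
  const embed = await embedRes.json();
  console.log('embed success:', embed.success);
}

litellmSmokeTest().catch(console.error);
```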
@@ -22,6 +22,8 @@ import { handleSessionRoutes } from './routes/session-routes.js';
 import { handleCcwRoutes } from './routes/ccw-routes.js';
 import { handleClaudeRoutes } from './routes/claude-routes.js';
 import { handleHelpRoutes } from './routes/help-routes.js';
+import { handleLiteLLMRoutes } from './routes/litellm-routes.js';
+import { handleLiteLLMApiRoutes } from './routes/litellm-api-routes.js';

 // Import WebSocket handling
 import { handleWebSocketUpgrade, broadcastToClients } from './websocket.js';

@@ -311,6 +313,16 @@ export async function startServer(options: ServerOptions = {}): Promise<http.Ser
       if (await handleCodexLensRoutes(routeContext)) return;
     }

+    // LiteLLM routes (/api/litellm/*)
+    if (pathname.startsWith('/api/litellm/')) {
+      if (await handleLiteLLMRoutes(routeContext)) return;
+    }
+
+    // LiteLLM API routes (/api/litellm-api/*)
+    if (pathname.startsWith('/api/litellm-api/')) {
+      if (await handleLiteLLMApiRoutes(routeContext)) return;
+    }
+
     // Graph routes (/api/graph/*)
     if (pathname.startsWith('/api/graph/')) {
       if (await handleGraphRoutes(routeContext)) return;

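With both route modules mounted, the provider-to-endpoint flow handled by litellm-api-routes.ts can be driven end to end over HTTP. A hedged sketch (not part of this commit; the base URL is an assumption, while the request and response shapes follow the handlers in that file):

```typescript
// Sketch: register a provider, attach a custom endpoint, then make it the default.
async function setupCustomEndpoint(baseUrl = 'http://localhost:3456'): Promise<void> {
  // 1. Create a provider credential (name, type and apiKey are required by the route).
  const providerRes = await fetch(`${baseUrl}/api/litellm-api/providers`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: 'My OpenAI', type: 'openai', apiKey: '${OPENAI_API_KEY}', enabled: true }),
  });
  const { provider } = await providerRes.json();

  // 2. Create a custom endpoint (id, name, providerId and model are required).
  await fetch(`${baseUrl}/api/litellm-api/endpoints`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ id: 'my-gpt4o', name: 'GPT-4o for Code Review', providerId: provider.id, model: 'gpt-4o' }),
  });

  // 3. Register it as the default endpoint via the config route.
  await fetch(`${baseUrl}/api/litellm-api/config/default-endpoint`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ endpointId: 'my-gpt4o' }),
  });
}

setupCustomEndpoint().catch(console.error);
```

After that, `ccw cli -p "..." --model my-gpt4o` (the usage suggested by the `apiSettings.endpointIdHint` string) should resolve through the new endpoint.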
397
ccw/src/templates/dashboard-css/31-api-settings.css
Normal file
@@ -0,0 +1,397 @@
|
||||
/* ========================================
|
||||
* API Settings Styles
|
||||
* ======================================== */
|
||||
|
||||
/* Main Container */
|
||||
.api-settings-container {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1.5rem;
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
/* Section Styles */
|
||||
.api-settings-section {
|
||||
background: hsl(var(--card));
|
||||
border: 1px solid hsl(var(--border));
|
||||
border-radius: 0.75rem;
|
||||
padding: 1.25rem;
|
||||
}
|
||||
|
||||
.section-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
margin-bottom: 1rem;
|
||||
padding-bottom: 0.75rem;
|
||||
border-bottom: 1px solid hsl(var(--border));
|
||||
}
|
||||
|
||||
.section-header h3 {
|
||||
font-size: 1rem;
|
||||
font-weight: 600;
|
||||
color: hsl(var(--foreground));
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
/* Settings List */
|
||||
.api-settings-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.75rem;
|
||||
}
|
||||
|
||||
/* Settings Card */
|
||||
.api-settings-card {
|
||||
background: hsl(var(--background));
|
||||
border: 1px solid hsl(var(--border));
|
||||
border-radius: 0.5rem;
|
||||
padding: 1rem;
|
||||
transition: all 0.2s ease;
|
||||
}
|
||||
|
||||
.api-settings-card:hover {
|
||||
border-color: hsl(var(--primary) / 0.3);
|
||||
box-shadow: 0 2px 8px hsl(var(--primary) / 0.1);
|
||||
}
|
||||
|
||||
.api-settings-card.disabled {
|
||||
opacity: 0.6;
|
||||
background: hsl(var(--muted) / 0.3);
|
||||
}
|
||||
|
||||
/* Card Header */
|
||||
.card-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
|
||||
.card-info {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.75rem;
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
.card-info h4 {
|
||||
font-size: 0.9375rem;
|
||||
font-weight: 600;
|
||||
color: hsl(var(--foreground));
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
.card-actions {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
/* Card Body */
|
||||
.card-body {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.75rem;
|
||||
}
|
||||
|
||||
.card-meta {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 1rem;
|
||||
font-size: 0.8125rem;
|
||||
color: hsl(var(--muted-foreground));
|
||||
}
|
||||
|
||||
.card-meta span {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.375rem;
|
||||
}
|
||||
|
||||
.card-meta i {
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
/* Provider Type Badge */
|
||||
.provider-type-badge {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
padding: 0.25rem 0.625rem;
|
||||
font-size: 0.6875rem;
|
||||
font-weight: 600;
|
||||
text-transform: uppercase;
|
||||
background: hsl(var(--primary) / 0.1);
|
||||
color: hsl(var(--primary));
|
||||
border-radius: 9999px;
|
||||
letter-spacing: 0.03em;
|
||||
}
|
||||
|
||||
/* Endpoint ID */
|
||||
.endpoint-id {
|
||||
font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
|
||||
font-size: 0.75rem;
|
||||
padding: 0.25rem 0.5rem;
|
||||
background: hsl(var(--muted) / 0.5);
|
||||
border-radius: 0.25rem;
|
||||
color: hsl(var(--primary));
|
||||
}
|
||||
|
||||
/* Usage Hint */
|
||||
.usage-hint {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
padding: 0.625rem 0.75rem;
|
||||
background: hsl(var(--muted) / 0.3);
|
||||
border-radius: 0.375rem;
|
||||
font-size: 0.75rem;
|
||||
color: hsl(var(--muted-foreground));
|
||||
margin-top: 0.375rem;
|
||||
}
|
||||
|
||||
.usage-hint code {
|
||||
font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
|
||||
font-size: 0.6875rem;
|
||||
color: hsl(var(--foreground));
|
||||
}
|
||||
|
||||
/* Status Badge */
|
||||
.status-badge {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
padding: 0.25rem 0.625rem;
|
||||
font-size: 0.6875rem;
|
||||
font-weight: 600;
|
||||
border-radius: 9999px;
|
||||
}
|
||||
|
||||
.status-badge.status-enabled {
|
||||
background: hsl(142 76% 36% / 0.1);
|
||||
color: hsl(142 76% 36%);
|
||||
}
|
||||
|
||||
.status-badge.status-disabled {
|
||||
background: hsl(var(--muted) / 0.5);
|
||||
color: hsl(var(--muted-foreground));
|
||||
}
|
||||
|
||||
/* Empty State */
|
||||
.empty-state {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
padding: 2.5rem 1rem;
|
||||
text-align: center;
|
||||
color: hsl(var(--muted-foreground));
|
||||
}
|
||||
|
||||
.empty-icon {
|
||||
font-size: 3rem;
|
||||
opacity: 0.3;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
|
||||
.empty-state p {
|
||||
font-size: 0.875rem;
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
/* Cache Settings Panel */
|
||||
.cache-settings-panel {
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
.cache-settings-content {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.cache-stats {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.75rem;
|
||||
padding: 1rem;
|
||||
background: hsl(var(--muted) / 0.3);
|
||||
border-radius: 0.5rem;
|
||||
}
|
||||
|
||||
.stat-item {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
font-size: 0.8125rem;
|
||||
}
|
||||
|
||||
.stat-label {
|
||||
color: hsl(var(--muted-foreground));
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.stat-value {
|
||||
color: hsl(var(--foreground));
|
||||
font-weight: 600;
|
||||
font-family: 'SF Mono', 'Consolas', 'Liberation Mono', monospace;
|
||||
}
|
||||
|
||||
/* Progress Bar */
|
||||
.progress-bar {
|
||||
width: 100%;
|
||||
height: 8px;
|
||||
background: hsl(var(--muted) / 0.5);
|
||||
border-radius: 9999px;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.progress-fill {
|
||||
height: 100%;
|
||||
background: hsl(var(--primary));
|
||||
border-radius: 9999px;
|
||||
transition: width 0.3s ease;
|
||||
}
|
||||
|
||||
/* ========================================
|
||||
* Form Styles
|
||||
* ======================================== */
|
||||
|
||||
.api-settings-form {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.form-group {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.form-group label {
|
||||
font-size: 0.8125rem;
|
||||
font-weight: 500;
|
||||
color: hsl(var(--foreground));
|
||||
}
|
||||
|
||||
.form-hint {
|
||||
font-size: 0.75rem;
|
||||
color: hsl(var(--muted-foreground));
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.text-muted {
|
||||
color: hsl(var(--muted-foreground));
|
||||
font-weight: 400;
|
||||
}
|
||||
|
||||
/* API Key Input Group */
|
||||
.api-key-input-group {
|
||||
display: flex;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.api-key-input-group input {
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
.api-key-input-group .btn-icon {
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
/* Checkbox Label */
|
||||
.checkbox-label {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
font-size: 0.8125rem;
|
||||
color: hsl(var(--foreground));
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
.checkbox-label input[type="checkbox"] {
|
||||
width: 1rem;
|
||||
height: 1rem;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
/* Fieldset */
|
||||
.form-fieldset {
|
||||
border: 1px solid hsl(var(--border));
|
||||
border-radius: 0.5rem;
|
||||
padding: 1rem;
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
.form-fieldset legend {
|
||||
font-size: 0.875rem;
|
||||
font-weight: 600;
|
||||
color: hsl(var(--foreground));
|
||||
padding: 0 0.5rem;
|
||||
}
|
||||
|
||||
/* Modal Actions */
|
||||
.modal-actions {
|
||||
display: flex;
|
||||
gap: 0.75rem;
|
||||
justify-content: flex-end;
|
||||
margin-top: 1rem;
|
||||
padding-top: 1rem;
|
||||
border-top: 1px solid hsl(var(--border));
|
||||
}
|
||||
|
||||
/* ========================================
|
||||
* Responsive Design
|
||||
* ======================================== */
|
||||
|
||||
@media (min-width: 768px) {
|
||||
.api-settings-container {
|
||||
padding: 1.5rem;
|
||||
}
|
||||
|
||||
.card-meta {
|
||||
gap: 1.5rem;
|
||||
}
|
||||
}
|
||||
|
||||
@media (max-width: 640px) {
|
||||
.section-header {
|
||||
flex-direction: column;
|
||||
align-items: flex-start;
|
||||
gap: 0.75rem;
|
||||
}
|
||||
|
||||
.card-header {
|
||||
flex-direction: column;
|
||||
align-items: flex-start;
|
||||
gap: 0.75rem;
|
||||
}
|
||||
|
||||
.card-actions {
|
||||
align-self: flex-end;
|
||||
}
|
||||
|
||||
.card-meta {
|
||||
flex-direction: column;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.modal-actions {
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
.modal-actions .btn {
|
||||
width: 100%;
|
||||
}
|
||||
}
|
||||
|
||||
/* Error Message */
|
||||
.error-message {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
padding: 2rem;
|
||||
font-size: 0.875rem;
|
||||
color: hsl(var(--destructive));
|
||||
text-align: center;
|
||||
}
|
||||
@@ -149,6 +149,12 @@ function initNavigation() {
       } else {
         console.error('renderCodexLensManager not defined - please refresh the page');
       }
+    } else if (currentView === 'api-settings') {
+      if (typeof renderApiSettings === 'function') {
+        renderApiSettings();
+      } else {
+        console.error('renderApiSettings not defined - please refresh the page');
+      }
     }
   });
 });

@@ -191,6 +197,8 @@ function updateContentTitle() {
     titleEl.textContent = t('title.coreMemory');
   } else if (currentView === 'codexlens-manager') {
     titleEl.textContent = t('title.codexLensManager');
+  } else if (currentView === 'api-settings') {
+    titleEl.textContent = t('title.apiSettings');
   } else if (currentView === 'liteTasks') {
     const names = { 'lite-plan': t('title.litePlanSessions'), 'lite-fix': t('title.liteFixSessions') };
     titleEl.textContent = names[currentLiteType] || t('title.liteTasks');

@@ -1331,6 +1331,62 @@ const i18n = {
|
||||
'claude.unsupportedFileType': 'Unsupported file type',
|
||||
'claude.loadFileError': 'Failed to load file',
|
||||
|
||||
|
||||
// API Settings
|
||||
'nav.apiSettings': 'API Settings',
|
||||
'title.apiSettings': 'API Settings',
|
||||
'apiSettings.providers': 'Providers',
|
||||
'apiSettings.customEndpoints': 'Custom Endpoints',
|
||||
'apiSettings.cacheSettings': 'Cache Settings',
|
||||
'apiSettings.addProvider': 'Add Provider',
|
||||
'apiSettings.editProvider': 'Edit Provider',
|
||||
'apiSettings.deleteProvider': 'Delete Provider',
|
||||
'apiSettings.addEndpoint': 'Add Endpoint',
|
||||
'apiSettings.editEndpoint': 'Edit Endpoint',
|
||||
'apiSettings.deleteEndpoint': 'Delete Endpoint',
|
||||
'apiSettings.providerType': 'Provider Type',
|
||||
'apiSettings.displayName': 'Display Name',
|
||||
'apiSettings.apiKey': 'API Key',
|
||||
'apiSettings.apiBaseUrl': 'API Base URL',
|
||||
'apiSettings.useEnvVar': 'Use environment variable',
|
||||
'apiSettings.enableProvider': 'Enable provider',
|
||||
'apiSettings.testConnection': 'Test Connection',
|
||||
'apiSettings.endpointId': 'Endpoint ID',
|
||||
'apiSettings.endpointIdHint': 'Usage: ccw cli -p "..." --model <endpoint-id>',
|
||||
'apiSettings.provider': 'Provider',
|
||||
'apiSettings.model': 'Model',
|
||||
'apiSettings.selectModel': 'Select model',
|
||||
'apiSettings.cacheStrategy': 'Cache Strategy',
|
||||
'apiSettings.enableContextCaching': 'Enable Context Caching',
|
||||
'apiSettings.cacheTTL': 'TTL (minutes)',
|
||||
'apiSettings.cacheMaxSize': 'Max Size (KB)',
|
||||
'apiSettings.autoCachePatterns': 'Auto-cache file patterns',
|
||||
'apiSettings.enableGlobalCaching': 'Enable Global Caching',
|
||||
'apiSettings.cacheUsed': 'Used',
|
||||
'apiSettings.cacheEntries': 'Entries',
|
||||
'apiSettings.clearCache': 'Clear Cache',
|
||||
'apiSettings.noProviders': 'No providers configured',
|
||||
'apiSettings.noEndpoints': 'No endpoints configured',
|
||||
'apiSettings.enabled': 'Enabled',
|
||||
'apiSettings.disabled': 'Disabled',
|
||||
'apiSettings.cacheEnabled': 'Cache Enabled',
|
||||
'apiSettings.cacheDisabled': 'Cache Disabled',
|
||||
'apiSettings.providerSaved': 'Provider saved successfully',
|
||||
'apiSettings.providerDeleted': 'Provider deleted successfully',
|
||||
'apiSettings.endpointSaved': 'Endpoint saved successfully',
|
||||
'apiSettings.endpointDeleted': 'Endpoint deleted successfully',
|
||||
'apiSettings.cacheCleared': 'Cache cleared successfully',
|
||||
'apiSettings.cacheSettingsUpdated': 'Cache settings updated',
|
||||
'apiSettings.confirmDeleteProvider': 'Are you sure you want to delete this provider?',
|
||||
'apiSettings.confirmDeleteEndpoint': 'Are you sure you want to delete this endpoint?',
|
||||
'apiSettings.confirmClearCache': 'Are you sure you want to clear the cache?',
|
||||
'apiSettings.connectionSuccess': 'Connection successful',
|
||||
'apiSettings.connectionFailed': 'Connection failed',
|
||||
'apiSettings.saveProviderFirst': 'Please save the provider first',
|
||||
'apiSettings.addProviderFirst': 'Please add a provider first',
|
||||
'apiSettings.failedToLoad': 'Failed to load API settings',
|
||||
'apiSettings.toggleVisibility': 'Toggle visibility',
|
||||
|
||||
// Common
|
||||
'common.cancel': 'Cancel',
|
||||
'common.optional': '(Optional)',
|
||||
@@ -2799,6 +2855,62 @@ const i18n = {
|
||||
'claudeManager.saved': 'File saved successfully',
|
||||
'claudeManager.saveError': 'Failed to save file',
|
||||
|
||||
|
||||
// API Settings
|
||||
'nav.apiSettings': 'API 设置',
|
||||
'title.apiSettings': 'API 设置',
|
||||
'apiSettings.providers': '提供商',
|
||||
'apiSettings.customEndpoints': '自定义端点',
|
||||
'apiSettings.cacheSettings': '缓存设置',
|
||||
'apiSettings.addProvider': '添加提供商',
|
||||
'apiSettings.editProvider': '编辑提供商',
|
||||
'apiSettings.deleteProvider': '删除提供商',
|
||||
'apiSettings.addEndpoint': '添加端点',
|
||||
'apiSettings.editEndpoint': '编辑端点',
|
||||
'apiSettings.deleteEndpoint': '删除端点',
|
||||
'apiSettings.providerType': '提供商类型',
|
||||
'apiSettings.displayName': '显示名称',
|
||||
'apiSettings.apiKey': 'API 密钥',
|
||||
'apiSettings.apiBaseUrl': 'API 基础 URL',
|
||||
'apiSettings.useEnvVar': '使用环境变量',
|
||||
'apiSettings.enableProvider': '启用提供商',
|
||||
'apiSettings.testConnection': '测试连接',
|
||||
'apiSettings.endpointId': '端点 ID',
|
||||
'apiSettings.endpointIdHint': '用法: ccw cli -p "..." --model <端点ID>',
|
||||
'apiSettings.provider': '提供商',
|
||||
'apiSettings.model': '模型',
|
||||
'apiSettings.selectModel': '选择模型',
|
||||
'apiSettings.cacheStrategy': '缓存策略',
|
||||
'apiSettings.enableContextCaching': '启用上下文缓存',
|
||||
'apiSettings.cacheTTL': 'TTL (分钟)',
|
||||
'apiSettings.cacheMaxSize': '最大大小 (KB)',
|
||||
'apiSettings.autoCachePatterns': '自动缓存文件模式',
|
||||
'apiSettings.enableGlobalCaching': '启用全局缓存',
|
||||
'apiSettings.cacheUsed': '已使用',
|
||||
'apiSettings.cacheEntries': '条目数',
|
||||
'apiSettings.clearCache': '清除缓存',
|
||||
'apiSettings.noProviders': '未配置提供商',
|
||||
'apiSettings.noEndpoints': '未配置端点',
|
||||
'apiSettings.enabled': '已启用',
|
||||
'apiSettings.disabled': '已禁用',
|
||||
'apiSettings.cacheEnabled': '缓存已启用',
|
||||
'apiSettings.cacheDisabled': '缓存已禁用',
|
||||
'apiSettings.providerSaved': '提供商保存成功',
|
||||
'apiSettings.providerDeleted': '提供商删除成功',
|
||||
'apiSettings.endpointSaved': '端点保存成功',
|
||||
'apiSettings.endpointDeleted': '端点删除成功',
|
||||
'apiSettings.cacheCleared': '缓存清除成功',
|
||||
'apiSettings.cacheSettingsUpdated': '缓存设置已更新',
|
||||
'apiSettings.confirmDeleteProvider': '确定要删除此提供商吗?',
|
||||
'apiSettings.confirmDeleteEndpoint': '确定要删除此端点吗?',
|
||||
'apiSettings.confirmClearCache': '确定要清除缓存吗?',
|
||||
'apiSettings.connectionSuccess': '连接成功',
|
||||
'apiSettings.connectionFailed': '连接失败',
|
||||
'apiSettings.saveProviderFirst': '请先保存提供商',
|
||||
'apiSettings.addProviderFirst': '请先添加提供商',
|
||||
'apiSettings.failedToLoad': '加载 API 设置失败',
|
||||
'apiSettings.toggleVisibility': '切换可见性',
|
||||
|
||||
// Common
|
||||
'common.cancel': '取消',
|
||||
'common.optional': '(可选)',
|
||||
|
||||
815
ccw/src/templates/dashboard-js/views/api-settings.js
Normal file
@@ -0,0 +1,815 @@
|
||||
// API Settings View
|
||||
// Manages LiteLLM API providers, custom endpoints, and cache settings
|
||||
|
||||
// ========== State Management ==========
|
||||
var apiSettingsData = null;
|
||||
var providerModels = {};
|
||||
var currentModal = null;
|
||||
|
||||
// ========== Data Loading ==========
|
||||
|
||||
/**
|
||||
* Load API configuration
|
||||
*/
|
||||
async function loadApiSettings() {
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/config');
|
||||
if (!response.ok) throw new Error('Failed to load API settings');
|
||||
apiSettingsData = await response.json();
|
||||
return apiSettingsData;
|
||||
} catch (err) {
|
||||
console.error('Failed to load API settings:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Load available models for a provider type
|
||||
*/
|
||||
async function loadProviderModels(providerType) {
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/models/' + providerType);
|
||||
if (!response.ok) throw new Error('Failed to load models');
|
||||
var data = await response.json();
|
||||
providerModels[providerType] = data.models || [];
|
||||
return data.models;
|
||||
} catch (err) {
|
||||
console.error('Failed to load provider models:', err);
|
||||
return [];
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Load cache statistics
|
||||
*/
|
||||
async function loadCacheStats() {
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/cache/stats');
|
||||
if (!response.ok) throw new Error('Failed to load cache stats');
|
||||
return await response.json();
|
||||
} catch (err) {
|
||||
console.error('Failed to load cache stats:', err);
|
||||
return { enabled: false, totalSize: 0, maxSize: 104857600, entries: 0 };
|
||||
}
|
||||
}
|
||||
|
||||
// ========== Provider Management ==========
|
||||
|
||||
/**
|
||||
* Show add provider modal
|
||||
*/
|
||||
async function showAddProviderModal() {
|
||||
var modalHtml = '<div class="generic-modal-overlay active" id="providerModal">' +
|
||||
'<div class="generic-modal">' +
|
||||
'<div class="generic-modal-header">' +
|
||||
'<h3 class="generic-modal-title">' + t('apiSettings.addProvider') + '</h3>' +
|
||||
'<button class="generic-modal-close" onclick="closeProviderModal()">×</button>' +
|
||||
'</div>' +
|
||||
'<div class="generic-modal-body">' +
|
||||
'<form id="providerForm" class="api-settings-form">' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="provider-type">' + t('apiSettings.providerType') + '</label>' +
|
||||
'<select id="provider-type" class="cli-input" required>' +
|
||||
'<option value="openai">OpenAI</option>' +
|
||||
'<option value="anthropic">Anthropic</option>' +
|
||||
'<option value="google">Google</option>' +
|
||||
'<option value="ollama">Ollama</option>' +
|
||||
'<option value="azure">Azure</option>' +
|
||||
'<option value="mistral">Mistral AI</option>' +
|
||||
'<option value="deepseek">DeepSeek</option>' +
|
||||
'<option value="custom">Custom</option>' +
|
||||
'</select>' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="provider-name">' + t('apiSettings.displayName') + '</label>' +
|
||||
'<input type="text" id="provider-name" class="cli-input" placeholder="My OpenAI" required />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="provider-apikey">' + t('apiSettings.apiKey') + '</label>' +
|
||||
'<div class="api-key-input-group">' +
|
||||
'<input type="password" id="provider-apikey" class="cli-input" placeholder="sk-..." required />' +
|
||||
'<button type="button" class="btn-icon" onclick="toggleApiKeyVisibility(\'provider-apikey\')" title="' + t('apiSettings.toggleVisibility') + '">' +
|
||||
'<i data-lucide="eye"></i>' +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'<label class="checkbox-label">' +
|
||||
'<input type="checkbox" id="use-env-var" onchange="toggleEnvVarInput()" /> ' +
|
||||
t('apiSettings.useEnvVar') +
|
||||
'</label>' +
|
||||
'<input type="text" id="env-var-name" class="cli-input" placeholder="OPENAI_API_KEY" style="display:none; margin-top: 0.5rem;" />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="provider-apibase">' + t('apiSettings.apiBaseUrl') + ' <span class="text-muted">(' + t('common.optional') + ')</span></label>' +
|
||||
'<input type="text" id="provider-apibase" class="cli-input" placeholder="https://api.openai.com/v1" />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label class="checkbox-label">' +
|
||||
'<input type="checkbox" id="provider-enabled" checked /> ' +
|
||||
t('apiSettings.enableProvider') +
|
||||
'</label>' +
|
||||
'</div>' +
|
||||
'<div class="modal-actions">' +
|
||||
'<button type="button" class="btn btn-secondary" onclick="testProviderConnection()">' +
|
||||
'<i data-lucide="wifi"></i> ' + t('apiSettings.testConnection') +
|
||||
'</button>' +
|
||||
'<button type="button" class="btn btn-secondary" onclick="closeProviderModal()">' + t('common.cancel') + '</button>' +
|
||||
'<button type="submit" class="btn btn-primary">' +
|
||||
'<i data-lucide="save"></i> ' + t('common.save') +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'</form>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'</div>';
|
||||
|
||||
document.body.insertAdjacentHTML('beforeend', modalHtml);
|
||||
|
||||
document.getElementById('providerForm').addEventListener('submit', async function(e) {
|
||||
e.preventDefault();
|
||||
await saveProvider();
|
||||
});
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Show edit provider modal
|
||||
*/
|
||||
async function showEditProviderModal(providerId) {
|
||||
if (!apiSettingsData) return;
|
||||
|
||||
var provider = apiSettingsData.providers?.find(function(p) { return p.id === providerId; });
|
||||
if (!provider) return;
|
||||
|
||||
await showAddProviderModal();
|
||||
|
||||
// Update modal title
|
||||
document.querySelector('#providerModal .generic-modal-title').textContent = t('apiSettings.editProvider');
|
||||
|
||||
// Populate form
|
||||
document.getElementById('provider-type').value = provider.type;
|
||||
document.getElementById('provider-name').value = provider.name;
|
||||
document.getElementById('provider-apikey').value = provider.apiKey;
|
||||
if (provider.apiBase) {
|
||||
document.getElementById('provider-apibase').value = provider.apiBase;
|
||||
}
|
||||
document.getElementById('provider-enabled').checked = provider.enabled !== false;
|
||||
|
||||
// Store provider ID for update
|
||||
document.getElementById('providerForm').dataset.providerId = providerId;
|
||||
}
|
||||
|
||||
/**
|
||||
* Save provider (create or update)
|
||||
*/
|
||||
async function saveProvider() {
|
||||
var form = document.getElementById('providerForm');
|
||||
var providerId = form.dataset.providerId;
|
||||
|
||||
var useEnvVar = document.getElementById('use-env-var').checked;
|
||||
var apiKey = useEnvVar
|
||||
? '${' + document.getElementById('env-var-name').value + '}'
|
||||
: document.getElementById('provider-apikey').value;
|
||||
|
||||
var providerData = {
|
||||
type: document.getElementById('provider-type').value,
|
||||
name: document.getElementById('provider-name').value,
|
||||
apiKey: apiKey,
|
||||
apiBase: document.getElementById('provider-apibase').value || undefined,
|
||||
enabled: document.getElementById('provider-enabled').checked
|
||||
};
|
||||
|
||||
try {
|
||||
var url = providerId
|
||||
? '/api/litellm-api/providers/' + providerId
|
||||
: '/api/litellm-api/providers';
|
||||
var method = providerId ? 'PUT' : 'POST';
|
||||
|
||||
var response = await fetch(url, {
|
||||
method: method,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify(providerData)
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to save provider');
|
||||
|
||||
var result = await response.json();
|
||||
showRefreshToast(t('apiSettings.providerSaved'), 'success');
|
||||
|
||||
closeProviderModal();
|
||||
await renderApiSettings();
|
||||
} catch (err) {
|
||||
console.error('Failed to save provider:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete provider
|
||||
*/
|
||||
async function deleteProvider(providerId) {
|
||||
if (!confirm(t('apiSettings.confirmDeleteProvider'))) return;
|
||||
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/providers/' + providerId, {
|
||||
method: 'DELETE'
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to delete provider');
|
||||
|
||||
showRefreshToast(t('apiSettings.providerDeleted'), 'success');
|
||||
await renderApiSettings();
|
||||
} catch (err) {
|
||||
console.error('Failed to delete provider:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Test provider connection
|
||||
*/
|
||||
async function testProviderConnection() {
|
||||
var form = document.getElementById('providerForm');
|
||||
var providerId = form.dataset.providerId;
|
||||
|
||||
if (!providerId) {
|
||||
showRefreshToast(t('apiSettings.saveProviderFirst'), 'warning');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/providers/' + providerId + '/test', {
|
||||
method: 'POST'
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to test provider');
|
||||
|
||||
var result = await response.json();
|
||||
|
||||
if (result.success) {
|
||||
showRefreshToast(t('apiSettings.connectionSuccess'), 'success');
|
||||
} else {
|
||||
showRefreshToast(t('apiSettings.connectionFailed') + ': ' + (result.error || 'Unknown error'), 'error');
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('Failed to test provider:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Close provider modal
|
||||
*/
|
||||
function closeProviderModal() {
|
||||
var modal = document.getElementById('providerModal');
|
||||
if (modal) modal.remove();
|
||||
}
|
||||
|
||||
/**
|
||||
* Toggle API key visibility
|
||||
*/
|
||||
function toggleApiKeyVisibility(inputId) {
|
||||
var input = document.getElementById(inputId);
|
||||
var icon = event.target.closest('button').querySelector('i');
|
||||
|
||||
if (input.type === 'password') {
|
||||
input.type = 'text';
|
||||
icon.setAttribute('data-lucide', 'eye-off');
|
||||
} else {
|
||||
input.type = 'password';
|
||||
icon.setAttribute('data-lucide', 'eye');
|
||||
}
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Toggle environment variable input
|
||||
*/
|
||||
function toggleEnvVarInput() {
|
||||
var useEnvVar = document.getElementById('use-env-var').checked;
|
||||
var apiKeyInput = document.getElementById('provider-apikey');
|
||||
var envVarInput = document.getElementById('env-var-name');
|
||||
|
||||
if (useEnvVar) {
|
||||
apiKeyInput.style.display = 'none';
|
||||
apiKeyInput.required = false;
|
||||
envVarInput.style.display = 'block';
|
||||
envVarInput.required = true;
|
||||
} else {
|
||||
apiKeyInput.style.display = 'block';
|
||||
apiKeyInput.required = true;
|
||||
envVarInput.style.display = 'none';
|
||||
envVarInput.required = false;
|
||||
}
|
||||
}
|
||||
|
||||
// ========== Endpoint Management ==========
|
||||
|
||||
/**
|
||||
* Show add endpoint modal
|
||||
*/
|
||||
async function showAddEndpointModal() {
|
||||
if (!apiSettingsData || !apiSettingsData.providers || apiSettingsData.providers.length === 0) {
|
||||
showRefreshToast(t('apiSettings.addProviderFirst'), 'warning');
|
||||
return;
|
||||
}
|
||||
|
||||
var providerOptions = apiSettingsData.providers
|
||||
.filter(function(p) { return p.enabled !== false; })
|
||||
.map(function(p) {
|
||||
return '<option value="' + p.id + '">' + p.name + ' (' + p.type + ')</option>';
|
||||
})
|
||||
.join('');
|
||||
|
||||
var modalHtml = '<div class="generic-modal-overlay active" id="endpointModal">' +
|
||||
'<div class="generic-modal">' +
|
||||
'<div class="generic-modal-header">' +
|
||||
'<h3 class="generic-modal-title">' + t('apiSettings.addEndpoint') + '</h3>' +
|
||||
'<button class="generic-modal-close" onclick="closeEndpointModal()">×</button>' +
|
||||
'</div>' +
|
||||
'<div class="generic-modal-body">' +
|
||||
'<form id="endpointForm" class="api-settings-form">' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="endpoint-id">' + t('apiSettings.endpointId') + '</label>' +
|
||||
'<input type="text" id="endpoint-id" class="cli-input" placeholder="my-gpt4o" required />' +
|
||||
'<small class="form-hint">' + t('apiSettings.endpointIdHint') + '</small>' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="endpoint-name">' + t('apiSettings.displayName') + '</label>' +
|
||||
'<input type="text" id="endpoint-name" class="cli-input" placeholder="GPT-4o for Code Review" required />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="endpoint-provider">' + t('apiSettings.provider') + '</label>' +
|
||||
'<select id="endpoint-provider" class="cli-input" onchange="loadModelsForProvider()" required>' +
|
||||
providerOptions +
|
||||
'</select>' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="endpoint-model">' + t('apiSettings.model') + '</label>' +
|
||||
'<select id="endpoint-model" class="cli-input" required>' +
|
||||
'<option value="">' + t('apiSettings.selectModel') + '</option>' +
|
||||
'</select>' +
|
||||
'</div>' +
|
||||
'<fieldset class="form-fieldset">' +
|
||||
'<legend>' + t('apiSettings.cacheStrategy') + '</legend>' +
|
||||
'<label class="checkbox-label">' +
|
||||
'<input type="checkbox" id="cache-enabled" onchange="toggleCacheSettings()" /> ' +
|
||||
t('apiSettings.enableContextCaching') +
|
||||
'</label>' +
|
||||
'<div id="cache-settings" style="display:none;">' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="cache-ttl">' + t('apiSettings.cacheTTL') + '</label>' +
|
||||
'<input type="number" id="cache-ttl" class="cli-input" value="60" min="1" />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="cache-maxsize">' + t('apiSettings.cacheMaxSize') + '</label>' +
|
||||
'<input type="number" id="cache-maxsize" class="cli-input" value="512" min="1" />' +
|
||||
'</div>' +
|
||||
'<div class="form-group">' +
|
||||
'<label for="cache-patterns">' + t('apiSettings.autoCachePatterns') + '</label>' +
|
||||
'<input type="text" id="cache-patterns" class="cli-input" placeholder="*.ts, *.md, CLAUDE.md" />' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'</fieldset>' +
|
||||
'<div class="modal-actions">' +
|
||||
'<button type="button" class="btn btn-secondary" onclick="closeEndpointModal()">' + t('common.cancel') + '</button>' +
|
||||
'<button type="submit" class="btn btn-primary">' +
|
||||
'<i data-lucide="save"></i> ' + t('common.save') +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'</form>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'</div>';
|
||||
|
||||
document.body.insertAdjacentHTML('beforeend', modalHtml);
|
||||
|
||||
document.getElementById('endpointForm').addEventListener('submit', async function(e) {
|
||||
e.preventDefault();
|
||||
await saveEndpoint();
|
||||
});
|
||||
|
||||
// Load models for first provider
|
||||
await loadModelsForProvider();
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Show edit endpoint modal
|
||||
*/
|
||||
async function showEditEndpointModal(endpointId) {
|
||||
if (!apiSettingsData) return;
|
||||
|
||||
var endpoint = apiSettingsData.endpoints?.find(function(e) { return e.id === endpointId; });
|
||||
if (!endpoint) return;
|
||||
|
||||
await showAddEndpointModal();
|
||||
|
||||
// Update modal title
|
||||
document.querySelector('#endpointModal .generic-modal-title').textContent = t('apiSettings.editEndpoint');
|
||||
|
||||
// Populate form
|
||||
document.getElementById('endpoint-id').value = endpoint.id;
|
||||
document.getElementById('endpoint-id').disabled = true;
|
||||
document.getElementById('endpoint-name').value = endpoint.name;
|
||||
document.getElementById('endpoint-provider').value = endpoint.providerId;
|
||||
|
||||
await loadModelsForProvider();
|
||||
document.getElementById('endpoint-model').value = endpoint.model;
|
||||
|
||||
if (endpoint.cacheStrategy) {
|
||||
document.getElementById('cache-enabled').checked = endpoint.cacheStrategy.enabled;
|
||||
if (endpoint.cacheStrategy.enabled) {
|
||||
toggleCacheSettings();
|
||||
document.getElementById('cache-ttl').value = endpoint.cacheStrategy.ttlMinutes || 60;
|
||||
document.getElementById('cache-maxsize').value = endpoint.cacheStrategy.maxSizeKB || 512;
|
||||
document.getElementById('cache-patterns').value = endpoint.cacheStrategy.autoCachePatterns?.join(', ') || '';
|
||||
}
|
||||
}
|
||||
|
||||
// Store endpoint ID for update
|
||||
document.getElementById('endpointForm').dataset.endpointId = endpointId;
|
||||
}
|
||||
|
||||
/**
|
||||
* Save endpoint (create or update)
|
||||
*/
|
||||
async function saveEndpoint() {
|
||||
var form = document.getElementById('endpointForm');
|
||||
var endpointId = form.dataset.endpointId || document.getElementById('endpoint-id').value;
|
||||
|
||||
var cacheEnabled = document.getElementById('cache-enabled').checked;
|
||||
var cacheStrategy = cacheEnabled ? {
|
||||
enabled: true,
|
||||
ttlMinutes: parseInt(document.getElementById('cache-ttl').value) || 60,
|
||||
maxSizeKB: parseInt(document.getElementById('cache-maxsize').value) || 512,
|
||||
autoCachePatterns: document.getElementById('cache-patterns').value
|
||||
.split(',')
|
||||
.map(function(p) { return p.trim(); })
|
||||
.filter(function(p) { return p; })
|
||||
} : { enabled: false };
|
||||
|
||||
var endpointData = {
|
||||
id: endpointId,
|
||||
name: document.getElementById('endpoint-name').value,
|
||||
providerId: document.getElementById('endpoint-provider').value,
|
||||
model: document.getElementById('endpoint-model').value,
|
||||
cacheStrategy: cacheStrategy
|
||||
};
|
||||
|
||||
try {
|
||||
var url = form.dataset.endpointId
|
||||
? '/api/litellm-api/endpoints/' + form.dataset.endpointId
|
||||
: '/api/litellm-api/endpoints';
|
||||
var method = form.dataset.endpointId ? 'PUT' : 'POST';
|
||||
|
||||
var response = await fetch(url, {
|
||||
method: method,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify(endpointData)
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to save endpoint');
|
||||
|
||||
var result = await response.json();
|
||||
showRefreshToast(t('apiSettings.endpointSaved'), 'success');
|
||||
|
||||
closeEndpointModal();
|
||||
await renderApiSettings();
|
||||
} catch (err) {
|
||||
console.error('Failed to save endpoint:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
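
For reference, a sketch of the request saveEndpoint() issues for a new endpoint (edits reuse the same body with PUT to /api/litellm-api/endpoints/<id>); the field names come from the form handling above, while the values are illustrative:

```ts
// Illustrative only: what the POST issued by saveEndpoint() looks like for a new endpoint
fetch('/api/litellm-api/endpoints', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    id: 'my-gpt4o',
    name: 'GPT-4o for Code Review',
    providerId: 'openai-main',              // must reference an existing provider id
    model: 'gpt-4o',
    cacheStrategy: {
      enabled: true,
      ttlMinutes: 60,
      maxSizeKB: 512,
      autoCachePatterns: ['*.ts', '*.md', 'CLAUDE.md']
    }
  })
});
```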
|
||||
|
||||
/**
|
||||
* Delete endpoint
|
||||
*/
|
||||
async function deleteEndpoint(endpointId) {
|
||||
if (!confirm(t('apiSettings.confirmDeleteEndpoint'))) return;
|
||||
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/endpoints/' + endpointId, {
|
||||
method: 'DELETE'
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to delete endpoint');
|
||||
|
||||
showRefreshToast(t('apiSettings.endpointDeleted'), 'success');
|
||||
await renderApiSettings();
|
||||
} catch (err) {
|
||||
console.error('Failed to delete endpoint:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Close endpoint modal
|
||||
*/
|
||||
function closeEndpointModal() {
|
||||
var modal = document.getElementById('endpointModal');
|
||||
if (modal) modal.remove();
|
||||
}
|
||||
|
||||
/**
|
||||
* Load models for selected provider
|
||||
*/
|
||||
async function loadModelsForProvider() {
|
||||
var providerSelect = document.getElementById('endpoint-provider');
|
||||
var modelSelect = document.getElementById('endpoint-model');
|
||||
|
||||
if (!providerSelect || !modelSelect) return;
|
||||
|
||||
var providerId = providerSelect.value;
|
||||
var provider = apiSettingsData.providers.find(function(p) { return p.id === providerId; });
|
||||
|
||||
if (!provider) return;
|
||||
|
||||
// Load models for provider type
|
||||
var models = await loadProviderModels(provider.type);
|
||||
|
||||
modelSelect.innerHTML = '<option value="">' + t('apiSettings.selectModel') + '</option>' +
|
||||
models.map(function(m) {
|
||||
var desc = m.description ? ' - ' + m.description : '';
|
||||
return '<option value="' + m.id + '">' + m.name + desc + '</option>';
|
||||
}).join('');
|
||||
}
|
||||
|
||||
/**
 * Toggle cache settings visibility
 */
function toggleCacheSettings() {
  var enabled = document.getElementById('cache-enabled').checked;
  var settings = document.getElementById('cache-settings');
  settings.style.display = enabled ? 'block' : 'none';
}
|
||||
|
||||
// ========== Cache Management ==========
|
||||
|
||||
/**
|
||||
* Clear cache
|
||||
*/
|
||||
async function clearCache() {
|
||||
if (!confirm(t('apiSettings.confirmClearCache'))) return;
|
||||
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/cache/clear', {
|
||||
method: 'POST'
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to clear cache');
|
||||
|
||||
var result = await response.json();
|
||||
showRefreshToast(t('apiSettings.cacheCleared') + ' (' + result.removed + ' entries)', 'success');
|
||||
|
||||
await renderApiSettings();
|
||||
} catch (err) {
|
||||
console.error('Failed to clear cache:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Toggle global cache
|
||||
*/
|
||||
async function toggleGlobalCache() {
|
||||
var enabled = document.getElementById('global-cache-enabled').checked;
|
||||
|
||||
try {
|
||||
var response = await fetch('/api/litellm-api/config/cache', {
|
||||
method: 'PUT',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify({ enabled: enabled })
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to update cache settings');
|
||||
|
||||
showRefreshToast(t('apiSettings.cacheSettingsUpdated'), 'success');
|
||||
} catch (err) {
|
||||
console.error('Failed to update cache settings:', err);
|
||||
showRefreshToast(t('common.error') + ': ' + err.message, 'error');
|
||||
// Revert checkbox
|
||||
document.getElementById('global-cache-enabled').checked = !enabled;
|
||||
}
|
||||
}
|
||||
|
||||
// ========== Rendering ==========
|
||||
|
||||
/**
|
||||
* Render API Settings page
|
||||
*/
|
||||
async function renderApiSettings() {
|
||||
var container = document.getElementById('mainContent');
|
||||
if (!container) return;
|
||||
|
||||
// Hide stats grid and search
|
||||
var statsGrid = document.getElementById('statsGrid');
|
||||
var searchInput = document.getElementById('searchInput');
|
||||
if (statsGrid) statsGrid.style.display = 'none';
|
||||
if (searchInput) searchInput.parentElement.style.display = 'none';
|
||||
|
||||
// Load data
|
||||
await loadApiSettings();
|
||||
var cacheStats = await loadCacheStats();
|
||||
|
||||
if (!apiSettingsData) {
|
||||
container.innerHTML = '<div class="api-settings-container">' +
|
||||
'<div class="error-message">' + t('apiSettings.failedToLoad') + '</div>' +
|
||||
'</div>';
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = '<div class="api-settings-container">' +
|
||||
'<div class="api-settings-section">' +
|
||||
'<div class="section-header">' +
|
||||
'<h3>' + t('apiSettings.providers') + '</h3>' +
|
||||
'<button class="btn btn-primary" onclick="showAddProviderModal()">' +
|
||||
'<i data-lucide="plus"></i> ' + t('apiSettings.addProvider') +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'<div id="providers-list" class="api-settings-list"></div>' +
|
||||
'</div>' +
|
||||
'<div class="api-settings-section">' +
|
||||
'<div class="section-header">' +
|
||||
'<h3>' + t('apiSettings.customEndpoints') + '</h3>' +
|
||||
'<button class="btn btn-primary" onclick="showAddEndpointModal()">' +
|
||||
'<i data-lucide="plus"></i> ' + t('apiSettings.addEndpoint') +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'<div id="endpoints-list" class="api-settings-list"></div>' +
|
||||
'</div>' +
|
||||
'<div class="api-settings-section">' +
|
||||
'<div class="section-header">' +
|
||||
'<h3>' + t('apiSettings.cacheSettings') + '</h3>' +
|
||||
'</div>' +
|
||||
'<div id="cache-settings-panel" class="cache-settings-panel"></div>' +
|
||||
'</div>' +
|
||||
'</div>';
|
||||
|
||||
renderProvidersList();
|
||||
renderEndpointsList();
|
||||
renderCacheSettings(cacheStats);
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Render providers list
|
||||
*/
|
||||
function renderProvidersList() {
|
||||
var container = document.getElementById('providers-list');
|
||||
if (!container) return;
|
||||
|
||||
var providers = apiSettingsData.providers || [];
|
||||
|
||||
if (providers.length === 0) {
|
||||
container.innerHTML = '<div class="empty-state">' +
|
||||
'<i data-lucide="cloud-off" class="empty-icon"></i>' +
|
||||
'<p>' + t('apiSettings.noProviders') + '</p>' +
|
||||
'</div>';
|
||||
if (window.lucide) lucide.createIcons();
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = providers.map(function(provider) {
|
||||
var statusClass = provider.enabled === false ? 'disabled' : 'enabled';
|
||||
var statusText = provider.enabled === false ? t('apiSettings.disabled') : t('apiSettings.enabled');
|
||||
|
||||
return '<div class="api-settings-card provider-card ' + statusClass + '">' +
|
||||
'<div class="card-header">' +
|
||||
'<div class="card-info">' +
|
||||
'<h4>' + provider.name + '</h4>' +
|
||||
'<span class="provider-type-badge">' + provider.type + '</span>' +
|
||||
'</div>' +
|
||||
'<div class="card-actions">' +
|
||||
'<button class="btn-icon" onclick="showEditProviderModal(\'' + provider.id + '\')" title="' + t('common.edit') + '">' +
|
||||
'<i data-lucide="edit"></i>' +
|
||||
'</button>' +
|
||||
'<button class="btn-icon btn-danger" onclick="deleteProvider(\'' + provider.id + '\')" title="' + t('common.delete') + '">' +
|
||||
'<i data-lucide="trash-2"></i>' +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'<div class="card-body">' +
|
||||
'<div class="card-meta">' +
|
||||
'<span><i data-lucide="key"></i> ' + maskApiKey(provider.apiKey) + '</span>' +
|
||||
(provider.apiBase ? '<span><i data-lucide="globe"></i> ' + provider.apiBase + '</span>' : '') +
|
||||
'<span class="status-badge status-' + statusClass + '">' + statusText + '</span>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'</div>';
|
||||
}).join('');
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Render endpoints list
|
||||
*/
|
||||
function renderEndpointsList() {
|
||||
var container = document.getElementById('endpoints-list');
|
||||
if (!container) return;
|
||||
|
||||
var endpoints = apiSettingsData.endpoints || [];
|
||||
|
||||
if (endpoints.length === 0) {
|
||||
container.innerHTML = '<div class="empty-state">' +
|
||||
'<i data-lucide="layers-off" class="empty-icon"></i>' +
|
||||
'<p>' + t('apiSettings.noEndpoints') + '</p>' +
|
||||
'</div>';
|
||||
if (window.lucide) lucide.createIcons();
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = endpoints.map(function(endpoint) {
|
||||
var provider = apiSettingsData.providers.find(function(p) { return p.id === endpoint.providerId; });
|
||||
var providerName = provider ? provider.name : endpoint.providerId;
|
||||
|
||||
var cacheStatus = endpoint.cacheStrategy?.enabled
|
||||
? t('apiSettings.cacheEnabled') + ' (' + endpoint.cacheStrategy.ttlMinutes + ' min)'
|
||||
: t('apiSettings.cacheDisabled');
|
||||
|
||||
return '<div class="api-settings-card endpoint-card">' +
|
||||
'<div class="card-header">' +
|
||||
'<div class="card-info">' +
|
||||
'<h4>' + endpoint.name + '</h4>' +
|
||||
'<code class="endpoint-id">' + endpoint.id + '</code>' +
|
||||
'</div>' +
|
||||
'<div class="card-actions">' +
|
||||
'<button class="btn-icon" onclick="showEditEndpointModal(\'' + endpoint.id + '\')" title="' + t('common.edit') + '">' +
|
||||
'<i data-lucide="edit"></i>' +
|
||||
'</button>' +
|
||||
'<button class="btn-icon btn-danger" onclick="deleteEndpoint(\'' + endpoint.id + '\')" title="' + t('common.delete') + '">' +
|
||||
'<i data-lucide="trash-2"></i>' +
|
||||
'</button>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'<div class="card-body">' +
|
||||
'<div class="card-meta">' +
|
||||
'<span><i data-lucide="server"></i> ' + providerName + '</span>' +
|
||||
'<span><i data-lucide="cpu"></i> ' + endpoint.model + '</span>' +
|
||||
'<span><i data-lucide="database"></i> ' + cacheStatus + '</span>' +
|
||||
'</div>' +
|
||||
'<div class="usage-hint">' +
|
||||
'<i data-lucide="terminal"></i> ' +
|
||||
'<code>ccw cli -p "..." --model ' + endpoint.id + '</code>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'</div>';
|
||||
}).join('');
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
/**
|
||||
* Render cache settings panel
|
||||
*/
|
||||
function renderCacheSettings(stats) {
|
||||
var container = document.getElementById('cache-settings-panel');
|
||||
if (!container) return;
|
||||
|
||||
var globalSettings = apiSettingsData.globalCache || { enabled: false };
|
||||
var usedMB = (stats.totalSize / 1024 / 1024).toFixed(2);
|
||||
var maxMB = (stats.maxSize / 1024 / 1024).toFixed(0);
|
||||
var usagePercent = stats.maxSize > 0 ? ((stats.totalSize / stats.maxSize) * 100).toFixed(1) : 0;
|
||||
|
||||
container.innerHTML = '<div class="cache-settings-content">' +
|
||||
'<label class="checkbox-label">' +
|
||||
'<input type="checkbox" id="global-cache-enabled" ' + (globalSettings.enabled ? 'checked' : '') + ' onchange="toggleGlobalCache()" /> ' +
|
||||
t('apiSettings.enableGlobalCaching') +
|
||||
'</label>' +
|
||||
'<div class="cache-stats">' +
|
||||
'<div class="stat-item">' +
|
||||
'<span class="stat-label">' + t('apiSettings.cacheUsed') + '</span>' +
|
||||
'<span class="stat-value">' + usedMB + ' MB / ' + maxMB + ' MB (' + usagePercent + '%)</span>' +
|
||||
'</div>' +
|
||||
'<div class="stat-item">' +
|
||||
'<span class="stat-label">' + t('apiSettings.cacheEntries') + '</span>' +
|
||||
'<span class="stat-value">' + stats.entries + '</span>' +
|
||||
'</div>' +
|
||||
'<div class="progress-bar">' +
|
||||
'<div class="progress-fill" style="width: ' + usagePercent + '%"></div>' +
|
||||
'</div>' +
|
||||
'</div>' +
|
||||
'<button class="btn btn-secondary" onclick="clearCache()">' +
|
||||
'<i data-lucide="trash-2"></i> ' + t('apiSettings.clearCache') +
|
||||
'</button>' +
|
||||
'</div>';
|
||||
|
||||
if (window.lucide) lucide.createIcons();
|
||||
}
|
||||
|
||||
// ========== Utility Functions ==========
|
||||
|
||||
/**
 * Mask API key for display
 */
function maskApiKey(apiKey) {
  if (!apiKey) return '';
  if (apiKey.startsWith('${')) return apiKey; // Environment variable
  if (apiKey.length <= 8) return '***';
  return apiKey.substring(0, 4) + '...' + apiKey.substring(apiKey.length - 4);
}
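
A few illustrative inputs and what the masking above returns for them:

```ts
// maskApiKey('')                    -> ''
// maskApiKey('${OPENAI_API_KEY}')   -> '${OPENAI_API_KEY}'  (env var reference shown as-is)
// maskApiKey('sk-test')             -> '***'                (8 characters or fewer)
// maskApiKey('sk-abcdef1234567890') -> 'sk-a...7890'
```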
|
||||
@@ -336,6 +336,10 @@
|
||||
<span class="nav-text flex-1" data-i18n="nav.codexLensManager">CodexLens</span>
|
||||
<span class="badge px-2 py-0.5 text-xs font-semibold rounded-full bg-hover text-muted-foreground" id="badgeCodexLens">-</span>
|
||||
</li>
|
||||
<li class="nav-item flex items-center gap-2 px-3 py-2.5 text-sm text-muted-foreground hover:bg-hover hover:text-foreground rounded cursor-pointer transition-colors" data-view="api-settings" data-tooltip="API Settings">
|
||||
<i data-lucide="settings" class="nav-icon"></i>
|
||||
<span class="nav-text flex-1" data-i18n="nav.apiSettings">API Settings</span>
|
||||
</li>
|
||||
<!-- Hidden: Code Graph Explorer (feature disabled)
|
||||
<li class="nav-item flex items-center gap-2 px-3 py-2.5 text-sm text-muted-foreground hover:bg-hover hover:text-foreground rounded cursor-pointer transition-colors" data-view="graph-explorer" data-tooltip="Code Graph Explorer">
|
||||
<i data-lucide="git-branch" class="nav-icon"></i>
|
||||
|
||||
@@ -10,6 +10,10 @@ import { spawn, ChildProcess } from 'child_process';
|
||||
import { existsSync, mkdirSync, readFileSync, writeFileSync, unlinkSync, readdirSync, statSync } from 'fs';
|
||||
import { join, relative } from 'path';
|
||||
|
||||
// LiteLLM integration
|
||||
import { executeLiteLLMEndpoint } from './litellm-executor.js';
|
||||
import { findEndpointById } from '../config/litellm-api-config-manager.js';
|
||||
|
||||
// Native resume support
|
||||
import {
|
||||
trackNewSession,
|
||||
@@ -592,6 +596,66 @@ async function executeCliTool(
|
||||
const workingDir = cd || process.cwd();
|
||||
ensureHistoryDir(workingDir); // Ensure history directory exists
|
||||
|
||||
// NEW: Check if model is a custom LiteLLM endpoint ID
|
||||
if (model && !['gemini', 'qwen', 'codex'].includes(tool)) {
|
||||
const endpoint = findEndpointById(workingDir, model);
|
||||
if (endpoint) {
|
||||
// Route to LiteLLM executor
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stderr', data: `[Routing to LiteLLM endpoint: ${model}]\n` });
|
||||
}
|
||||
|
||||
const startTime = Date.now();
const result = await executeLiteLLMEndpoint({
  prompt,
  endpointId: model,
  baseDir: workingDir,
  cwd: cd,
  includeDirs: includeDirs ? includeDirs.split(',').map(d => d.trim()) : undefined,
  enableCache: true,
  onOutput: onOutput || undefined,
});

// Convert LiteLLM result to ExecutionOutput format
// (startTime is captured before the call so duration_ms reflects the actual LiteLLM call)
const endTime = Date.now();
const duration = endTime - startTime;
|
||||
|
||||
const execution: ExecutionRecord = {
|
||||
id: customId || `${Date.now()}-litellm`,
|
||||
timestamp: new Date(startTime).toISOString(),
|
||||
tool: 'litellm',
|
||||
model: result.model,
|
||||
mode,
|
||||
prompt,
|
||||
status: result.success ? 'success' : 'error',
|
||||
exit_code: result.success ? 0 : 1,
|
||||
duration_ms: duration,
|
||||
output: {
|
||||
stdout: result.output,
|
||||
stderr: result.error || '',
|
||||
truncated: false,
|
||||
},
|
||||
};
|
||||
|
||||
const conversation = convertToConversation(execution);
|
||||
|
||||
// Try to save to history
|
||||
try {
|
||||
saveConversation(workingDir, conversation);
|
||||
} catch (err) {
|
||||
console.error('[CLI Executor] Failed to save LiteLLM history:', (err as Error).message);
|
||||
}
|
||||
|
||||
return {
|
||||
success: result.success,
|
||||
execution,
|
||||
conversation,
|
||||
stdout: result.output,
|
||||
stderr: result.error || '',
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// Get SQLite store for native session lookup
|
||||
const store = await getSqliteStore(workingDir);
|
||||
|
||||
|
||||
246
ccw/src/tools/litellm-client.ts
Normal file
@@ -0,0 +1,246 @@
|
||||
/**
|
||||
* LiteLLM Client - Bridge between CCW and ccw-litellm Python package
|
||||
* Provides LLM chat and embedding capabilities via spawned Python process
|
||||
*
|
||||
* Features:
|
||||
* - Chat completions with multiple models
|
||||
* - Text embeddings generation
|
||||
* - Configuration management
|
||||
* - JSON protocol communication
|
||||
*/
|
||||
|
||||
import { spawn } from 'child_process';
|
||||
import { promisify } from 'util';
|
||||
|
||||
export interface LiteLLMConfig {
|
||||
pythonPath?: string; // Default 'python'
|
||||
configPath?: string; // Configuration file path
|
||||
timeout?: number; // Default 60000ms
|
||||
}
|
||||
|
||||
export interface ChatMessage {
|
||||
role: 'system' | 'user' | 'assistant';
|
||||
content: string;
|
||||
}
|
||||
|
||||
export interface ChatResponse {
|
||||
content: string;
|
||||
model: string;
|
||||
usage?: {
|
||||
prompt_tokens: number;
|
||||
completion_tokens: number;
|
||||
total_tokens: number;
|
||||
};
|
||||
}
|
||||
|
||||
export interface EmbedResponse {
|
||||
vectors: number[][];
|
||||
dimensions: number;
|
||||
model: string;
|
||||
}
|
||||
|
||||
export interface LiteLLMStatus {
|
||||
available: boolean;
|
||||
version?: string;
|
||||
error?: string;
|
||||
}
|
||||
|
||||
export class LiteLLMClient {
|
||||
private pythonPath: string;
|
||||
private configPath?: string;
|
||||
private timeout: number;
|
||||
|
||||
constructor(config: LiteLLMConfig = {}) {
|
||||
this.pythonPath = config.pythonPath || 'python';
|
||||
this.configPath = config.configPath;
|
||||
this.timeout = config.timeout || 60000;
|
||||
}
|
||||
|
||||
/**
|
||||
* Execute Python ccw-litellm command
|
||||
*/
|
||||
private async executePython(args: string[], options: { timeout?: number } = {}): Promise<string> {
|
||||
const timeout = options.timeout || this.timeout;
|
||||
|
||||
return new Promise((resolve, reject) => {
|
||||
const proc = spawn(this.pythonPath, ['-m', 'ccw_litellm.cli', ...args], {
|
||||
stdio: ['pipe', 'pipe', 'pipe'],
|
||||
env: { ...process.env }
|
||||
});
|
||||
|
||||
let stdout = '';
|
||||
let stderr = '';
|
||||
let timedOut = false;
|
||||
|
||||
// Set up timeout
|
||||
const timeoutId = setTimeout(() => {
|
||||
timedOut = true;
|
||||
proc.kill('SIGTERM');
|
||||
reject(new Error(`Command timed out after ${timeout}ms`));
|
||||
}, timeout);
|
||||
|
||||
proc.stdout.on('data', (data) => {
|
||||
stdout += data.toString();
|
||||
});
|
||||
|
||||
proc.stderr.on('data', (data) => {
|
||||
stderr += data.toString();
|
||||
});
|
||||
|
||||
proc.on('error', (error) => {
|
||||
clearTimeout(timeoutId);
|
||||
reject(new Error(`Failed to spawn Python process: ${error.message}`));
|
||||
});
|
||||
|
||||
proc.on('close', (code) => {
|
||||
clearTimeout(timeoutId);
|
||||
|
||||
if (timedOut) {
|
||||
return; // Already rejected
|
||||
}
|
||||
|
||||
if (code === 0) {
|
||||
resolve(stdout.trim());
|
||||
} else {
|
||||
const errorMsg = stderr.trim() || `Process exited with code ${code}`;
|
||||
reject(new Error(errorMsg));
|
||||
}
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if ccw-litellm is available
|
||||
*/
|
||||
async isAvailable(): Promise<boolean> {
|
||||
try {
|
||||
await this.executePython(['version'], { timeout: 5000 });
|
||||
return true;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get status information
|
||||
*/
|
||||
async getStatus(): Promise<LiteLLMStatus> {
|
||||
try {
|
||||
const output = await this.executePython(['version'], { timeout: 5000 });
|
||||
return {
|
||||
available: true,
|
||||
version: output.trim()
|
||||
};
|
||||
} catch (error: any) {
|
||||
return {
|
||||
available: false,
|
||||
error: error.message
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get current configuration
|
||||
*/
|
||||
async getConfig(): Promise<any> {
|
||||
const output = await this.executePython(['config', '--json']);
|
||||
return JSON.parse(output);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate embeddings for texts
|
||||
*/
|
||||
async embed(texts: string[], model: string = 'default'): Promise<EmbedResponse> {
|
||||
if (!texts || texts.length === 0) {
|
||||
throw new Error('texts array cannot be empty');
|
||||
}
|
||||
|
||||
const args = ['embed', '--model', model, '--output', 'json'];
|
||||
|
||||
// Add texts as arguments
|
||||
for (const text of texts) {
|
||||
args.push(text);
|
||||
}
|
||||
|
||||
const output = await this.executePython(args, { timeout: this.timeout * 2 });
|
||||
const vectors = JSON.parse(output);
|
||||
|
||||
return {
|
||||
vectors,
|
||||
dimensions: vectors[0]?.length || 0,
|
||||
model
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Chat with LLM
|
||||
*/
|
||||
async chat(message: string, model: string = 'default'): Promise<string> {
|
||||
if (!message) {
|
||||
throw new Error('message cannot be empty');
|
||||
}
|
||||
|
||||
const args = ['chat', '--model', model, message];
|
||||
return this.executePython(args, { timeout: this.timeout * 2 });
|
||||
}
|
||||
|
||||
/**
|
||||
* Multi-turn chat with messages array
|
||||
*/
|
||||
async chatMessages(messages: ChatMessage[], model: string = 'default'): Promise<ChatResponse> {
|
||||
if (!messages || messages.length === 0) {
|
||||
throw new Error('messages array cannot be empty');
|
||||
}
|
||||
|
||||
// For now, just use the last user message
|
||||
// TODO: Implement full message history support in ccw-litellm
|
||||
const lastMessage = messages[messages.length - 1];
|
||||
const content = await this.chat(lastMessage.content, model);
|
||||
|
||||
return {
|
||||
content,
|
||||
model,
|
||||
usage: undefined // TODO: Add usage tracking
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// Singleton instance
|
||||
let _client: LiteLLMClient | null = null;
|
||||
|
||||
/**
|
||||
* Get or create singleton LiteLLM client
|
||||
*/
|
||||
export function getLiteLLMClient(config?: LiteLLMConfig): LiteLLMClient {
|
||||
if (!_client) {
|
||||
_client = new LiteLLMClient(config);
|
||||
}
|
||||
return _client;
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if LiteLLM is available
|
||||
*/
|
||||
export async function checkLiteLLMAvailable(): Promise<boolean> {
|
||||
try {
|
||||
const client = getLiteLLMClient();
|
||||
return await client.isAvailable();
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get LiteLLM status
|
||||
*/
|
||||
export async function getLiteLLMStatus(): Promise<LiteLLMStatus> {
|
||||
try {
|
||||
const client = getLiteLLMClient();
|
||||
return await client.getStatus();
|
||||
} catch (error: any) {
|
||||
return {
|
||||
available: false,
|
||||
error: error.message
|
||||
};
|
||||
}
|
||||
}
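
A minimal usage sketch for the bridge above. It assumes the ccw-litellm Python package is installed and importable by the configured interpreter; the prompt and model selection refer to entries in litellm-config.yaml and are illustrative:

```ts
import { getLiteLLMClient } from './litellm-client.js';

async function demo() {
  const client = getLiteLLMClient({ pythonPath: 'python3', timeout: 30000 });

  if (!(await client.isAvailable())) {
    console.error('ccw-litellm is not installed or not on PATH');
    return;
  }

  // Single-turn chat against the "default" model from the config
  const reply = await client.chat('Summarize this repository in one sentence.');
  console.log(reply);

  // Embeddings: vectors is number[][]; dimensions is taken from the first vector
  const { vectors, dimensions } = await client.embed(['hello world', 'goodbye world']);
  console.log(vectors.length, dimensions);
}
```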
|
||||
241
ccw/src/tools/litellm-executor.ts
Normal file
@@ -0,0 +1,241 @@
|
||||
/**
|
||||
* LiteLLM Executor - Execute LiteLLM endpoints with context caching
|
||||
* Integrates with context-cache for file packing and LiteLLM client for API calls
|
||||
*/
|
||||
|
||||
import { getLiteLLMClient } from './litellm-client.js';
|
||||
import { handler as contextCacheHandler } from './context-cache.js';
|
||||
import {
|
||||
findEndpointById,
|
||||
getProviderWithResolvedEnvVars,
|
||||
} from '../config/litellm-api-config-manager.js';
|
||||
import type { CustomEndpoint, ProviderCredential } from '../types/litellm-api-config.js';
|
||||
|
||||
export interface LiteLLMExecutionOptions {
|
||||
prompt: string;
|
||||
endpointId: string; // Custom endpoint ID (e.g., "my-gpt4o")
|
||||
baseDir: string; // Project base directory
|
||||
cwd?: string; // Working directory for file resolution
|
||||
includeDirs?: string[]; // Additional directories for @patterns
|
||||
enableCache?: boolean; // Override endpoint cache setting
|
||||
onOutput?: (data: { type: string; data: string }) => void;
|
||||
}
|
||||
|
||||
export interface LiteLLMExecutionResult {
|
||||
success: boolean;
|
||||
output: string;
|
||||
model: string;
|
||||
provider: string;
|
||||
cacheUsed: boolean;
|
||||
cachedFiles?: string[];
|
||||
error?: string;
|
||||
}
|
||||
|
||||
/**
 * Extract @patterns from prompt text
 */
export function extractPatterns(prompt: string): string[] {
  // Match @path patterns: @src/**/*.ts, @CLAUDE.md, @../shared/**/*
  const regex = /@([^\s]+)/g;
  const patterns: string[] = [];
  let match;
  while ((match = regex.exec(prompt)) !== null) {
    patterns.push('@' + match[1]);
  }
  return patterns;
}
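
// Example (illustrative): given the prompt
//   'Review @src/**/*.ts and @CLAUDE.md for unused exports'
// extractPatterns returns ['@src/**/*.ts', '@CLAUDE.md'].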
|
||||
|
||||
/**
|
||||
* Execute LiteLLM endpoint with optional context caching
|
||||
*/
|
||||
export async function executeLiteLLMEndpoint(
|
||||
options: LiteLLMExecutionOptions
|
||||
): Promise<LiteLLMExecutionResult> {
|
||||
const { prompt, endpointId, baseDir, cwd, includeDirs, enableCache, onOutput } = options;
|
||||
|
||||
// 1. Find endpoint configuration
|
||||
const endpoint = findEndpointById(baseDir, endpointId);
|
||||
if (!endpoint) {
|
||||
return {
|
||||
success: false,
|
||||
output: '',
|
||||
model: '',
|
||||
provider: '',
|
||||
cacheUsed: false,
|
||||
error: `Endpoint not found: ${endpointId}`,
|
||||
};
|
||||
}
|
||||
|
||||
// 2. Get provider with resolved env vars
|
||||
const provider = getProviderWithResolvedEnvVars(baseDir, endpoint.providerId);
|
||||
if (!provider) {
|
||||
return {
|
||||
success: false,
|
||||
output: '',
|
||||
model: '',
|
||||
provider: '',
|
||||
cacheUsed: false,
|
||||
error: `Provider not found: ${endpoint.providerId}`,
|
||||
};
|
||||
}
|
||||
|
||||
// Verify API key is available
|
||||
if (!provider.resolvedApiKey) {
|
||||
return {
|
||||
success: false,
|
||||
output: '',
|
||||
model: endpoint.model,
|
||||
provider: provider.type,
|
||||
cacheUsed: false,
|
||||
error: `API key not configured for provider: ${provider.name}`,
|
||||
};
|
||||
}
|
||||
|
||||
// 3. Process context cache if enabled
|
||||
let finalPrompt = prompt;
|
||||
let cacheUsed = false;
|
||||
let cachedFiles: string[] = [];
|
||||
|
||||
const shouldCache = enableCache ?? endpoint.cacheStrategy.enabled;
|
||||
if (shouldCache) {
|
||||
const patterns = extractPatterns(prompt);
|
||||
if (patterns.length > 0) {
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stderr', data: `[Context cache: Found ${patterns.length} @patterns]\n` });
|
||||
}
|
||||
|
||||
// Pack files into cache
|
||||
const packResult = await contextCacheHandler({
|
||||
operation: 'pack',
|
||||
patterns,
|
||||
cwd: cwd || process.cwd(),
|
||||
include_dirs: includeDirs,
|
||||
ttl: endpoint.cacheStrategy.ttlMinutes * 60 * 1000,
|
||||
max_file_size: endpoint.cacheStrategy.maxSizeKB * 1024,
|
||||
});
|
||||
|
||||
if (packResult.success && packResult.result) {
|
||||
const pack = packResult.result as any;
|
||||
|
||||
if (onOutput) {
|
||||
onOutput({
|
||||
type: 'stderr',
|
||||
data: `[Context cache: Packed ${pack.files_packed} files, ${pack.total_bytes} bytes]\n`,
|
||||
});
|
||||
}
|
||||
|
||||
// Read cached content
|
||||
const readResult = await contextCacheHandler({
|
||||
operation: 'read',
|
||||
session_id: pack.session_id,
|
||||
limit: endpoint.cacheStrategy.maxSizeKB * 1024,
|
||||
});
|
||||
|
||||
if (readResult.success && readResult.result) {
|
||||
const read = readResult.result as any;
|
||||
// Prepend cached content to prompt
|
||||
finalPrompt = `${read.content}\n\n---\n\n${prompt}`;
|
||||
cacheUsed = true;
|
||||
cachedFiles = pack.files_packed ? Array(pack.files_packed).fill('...') : [];
|
||||
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stderr', data: `[Context cache: Applied to prompt]\n` });
|
||||
}
|
||||
}
|
||||
} else if (packResult.error) {
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stderr', data: `[Context cache warning: ${packResult.error}]\n` });
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// 4. Call LiteLLM
|
||||
try {
|
||||
if (onOutput) {
|
||||
onOutput({
|
||||
type: 'stderr',
|
||||
data: `[LiteLLM: Calling ${provider.type}/${endpoint.model}]\n`,
|
||||
});
|
||||
}
|
||||
|
||||
const client = getLiteLLMClient({
|
||||
pythonPath: 'python',
|
||||
timeout: 120000, // 2 minutes
|
||||
});
|
||||
|
||||
// Configure provider credentials via environment
|
||||
// LiteLLM uses standard env vars like OPENAI_API_KEY, ANTHROPIC_API_KEY
|
||||
const envVarName = getProviderEnvVarName(provider.type);
|
||||
if (envVarName) {
|
||||
process.env[envVarName] = provider.resolvedApiKey;
|
||||
}
|
||||
|
||||
// Set base URL if custom
|
||||
if (provider.apiBase) {
|
||||
const baseUrlEnvVar = getProviderBaseUrlEnvVarName(provider.type);
|
||||
if (baseUrlEnvVar) {
|
||||
process.env[baseUrlEnvVar] = provider.apiBase;
|
||||
}
|
||||
}
|
||||
|
||||
// Use litellm-client to call chat
|
||||
const response = await client.chat(finalPrompt, endpoint.model);
|
||||
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stdout', data: response });
|
||||
}
|
||||
|
||||
return {
|
||||
success: true,
|
||||
output: response,
|
||||
model: endpoint.model,
|
||||
provider: provider.type,
|
||||
cacheUsed,
|
||||
cachedFiles,
|
||||
};
|
||||
} catch (error) {
|
||||
const errorMsg = (error as Error).message;
|
||||
if (onOutput) {
|
||||
onOutput({ type: 'stderr', data: `[LiteLLM error: ${errorMsg}]\n` });
|
||||
}
|
||||
|
||||
return {
|
||||
success: false,
|
||||
output: '',
|
||||
model: endpoint.model,
|
||||
provider: provider.type,
|
||||
cacheUsed,
|
||||
error: errorMsg,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
/**
 * Get environment variable name for provider API key
 */
function getProviderEnvVarName(providerType: string): string | null {
  const envVarMap: Record<string, string> = {
    openai: 'OPENAI_API_KEY',
    anthropic: 'ANTHROPIC_API_KEY',
    google: 'GOOGLE_API_KEY',
    azure: 'AZURE_API_KEY',
    mistral: 'MISTRAL_API_KEY',
    deepseek: 'DEEPSEEK_API_KEY',
  };

  return envVarMap[providerType] || null;
}

/**
 * Get environment variable name for provider base URL
 */
function getProviderBaseUrlEnvVarName(providerType: string): string | null {
  const envVarMap: Record<string, string> = {
    openai: 'OPENAI_API_BASE',
    anthropic: 'ANTHROPIC_API_BASE',
    azure: 'AZURE_API_BASE',
  };

  return envVarMap[providerType] || null;
}
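
A hedged sketch of invoking the executor directly; in the CLI flow, executeCliTool performs this routing automatically when --model matches a configured endpoint ID. The endpoint ID and prompt below are illustrative:

```ts
import { executeLiteLLMEndpoint } from './litellm-executor.js';

async function runReview() {
  const result = await executeLiteLLMEndpoint({
    prompt: 'Review @src/**/*.ts for unused exports',
    endpointId: 'my-gpt4o',                 // must match a CustomEndpoint.id in the config
    baseDir: process.cwd(),
    enableCache: true,
    onOutput: (msg) => process.stderr.write(msg.data),
  });

  if (result.success) {
    console.log(result.output);
  } else {
    console.error('LiteLLM call failed:', result.error);
  }
}
```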
|
||||
136
ccw/src/types/litellm-api-config.ts
Normal file
@@ -0,0 +1,136 @@
|
||||
/**
|
||||
* LiteLLM API Configuration Type Definitions
|
||||
*
|
||||
* Defines types for provider credentials, cache strategies, custom endpoints,
|
||||
* and the overall configuration structure for LiteLLM API integration.
|
||||
*/
|
||||
|
||||
/**
|
||||
* Supported LLM provider types
|
||||
*/
|
||||
export type ProviderType =
|
||||
| 'openai'
|
||||
| 'anthropic'
|
||||
| 'ollama'
|
||||
| 'azure'
|
||||
| 'google'
|
||||
| 'mistral'
|
||||
| 'deepseek'
|
||||
| 'custom';
|
||||
|
||||
/**
|
||||
* Provider credential configuration
|
||||
* Stores API keys, base URLs, and provider metadata
|
||||
*/
|
||||
export interface ProviderCredential {
|
||||
/** Unique identifier for this provider configuration */
|
||||
id: string;
|
||||
|
||||
/** Display name for UI */
|
||||
name: string;
|
||||
|
||||
/** Provider type */
|
||||
type: ProviderType;
|
||||
|
||||
/** API key or environment variable reference (e.g., ${OPENAI_API_KEY}) */
|
||||
apiKey: string;
|
||||
|
||||
/** Custom API base URL (optional, overrides provider default) */
|
||||
apiBase?: string;
|
||||
|
||||
/** Whether this provider is enabled */
|
||||
enabled: boolean;
|
||||
|
||||
/** Creation timestamp (ISO 8601) */
|
||||
createdAt: string;
|
||||
|
||||
/** Last update timestamp (ISO 8601) */
|
||||
updatedAt: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Cache strategy for prompt context optimization
|
||||
* Enables file-based caching to reduce token usage
|
||||
*/
|
||||
export interface CacheStrategy {
|
||||
/** Whether caching is enabled for this endpoint */
|
||||
enabled: boolean;
|
||||
|
||||
/** Time-to-live in minutes (default: 60) */
|
||||
ttlMinutes: number;
|
||||
|
||||
/** Maximum cache size in KB (default: 512) */
|
||||
maxSizeKB: number;
|
||||
|
||||
/** File patterns to cache (glob patterns like "*.md", "*.ts") */
|
||||
filePatterns: string[];
|
||||
}
|
||||
|
||||
/**
|
||||
* Custom endpoint configuration
|
||||
* Maps CLI identifiers to specific models and caching strategies
|
||||
*/
|
||||
export interface CustomEndpoint {
|
||||
/** Unique CLI identifier (used in --model flag, e.g., "my-gpt4o") */
|
||||
id: string;
|
||||
|
||||
/** Display name for UI */
|
||||
name: string;
|
||||
|
||||
/** Reference to provider credential ID */
|
||||
providerId: string;
|
||||
|
||||
/** Model identifier (e.g., "gpt-4o", "claude-3-5-sonnet-20241022") */
|
||||
model: string;
|
||||
|
||||
/** Optional description */
|
||||
description?: string;
|
||||
|
||||
/** Cache strategy for this endpoint */
|
||||
cacheStrategy: CacheStrategy;
|
||||
|
||||
/** Whether this endpoint is enabled */
|
||||
enabled: boolean;
|
||||
|
||||
/** Creation timestamp (ISO 8601) */
|
||||
createdAt: string;
|
||||
|
||||
/** Last update timestamp (ISO 8601) */
|
||||
updatedAt: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Global cache settings
|
||||
* Applies to all endpoints unless overridden
|
||||
*/
|
||||
export interface GlobalCacheSettings {
|
||||
/** Whether caching is globally enabled */
|
||||
enabled: boolean;
|
||||
|
||||
/** Cache directory path (default: ~/.ccw/cache/context) */
|
||||
cacheDir: string;
|
||||
|
||||
/** Maximum total cache size in MB (default: 100) */
|
||||
maxTotalSizeMB: number;
|
||||
}
|
||||
|
||||
/**
|
||||
* Complete LiteLLM API configuration
|
||||
* Root configuration object stored in JSON file
|
||||
*/
|
||||
export interface LiteLLMApiConfig {
|
||||
/** Configuration schema version */
|
||||
version: number;
|
||||
|
||||
/** List of configured providers */
|
||||
providers: ProviderCredential[];
|
||||
|
||||
/** List of custom endpoints */
|
||||
endpoints: CustomEndpoint[];
|
||||
|
||||
/** Default endpoint ID (optional) */
|
||||
defaultEndpoint?: string;
|
||||
|
||||
/** Global cache settings */
|
||||
globalCacheSettings: GlobalCacheSettings;
|
||||
}
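
An illustrative configuration object matching the types above; the IDs, timestamps, and cache directory are made-up example values:

```ts
const exampleConfig: LiteLLMApiConfig = {
  version: 1,
  providers: [{
    id: 'openai-main',
    name: 'OpenAI (primary)',
    type: 'openai',
    apiKey: '${OPENAI_API_KEY}',   // env var reference, resolved when the provider is used
    enabled: true,
    createdAt: '2025-01-01T00:00:00Z',
    updatedAt: '2025-01-01T00:00:00Z',
  }],
  endpoints: [{
    id: 'my-gpt4o',
    name: 'GPT-4o for Code Review',
    providerId: 'openai-main',
    model: 'gpt-4o',
    cacheStrategy: { enabled: true, ttlMinutes: 60, maxSizeKB: 512, filePatterns: ['*.ts', '*.md'] },
    enabled: true,
    createdAt: '2025-01-01T00:00:00Z',
    updatedAt: '2025-01-01T00:00:00Z',
  }],
  defaultEndpoint: 'my-gpt4o',
  globalCacheSettings: { enabled: true, cacheDir: '~/.ccw/cache/context', maxTotalSizeMB: 100 },
};
```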
|
||||
96
ccw/tests/litellm-client.test.ts
Normal file
@@ -0,0 +1,96 @@
|
||||
/**
|
||||
* LiteLLM Client Tests
|
||||
* Tests for the LiteLLM TypeScript bridge
|
||||
*/
|
||||
|
||||
import { describe, it, expect, beforeEach } from '@jest/globals';
|
||||
import { LiteLLMClient, getLiteLLMClient, checkLiteLLMAvailable, getLiteLLMStatus } from '../src/tools/litellm-client';
|
||||
|
||||
describe('LiteLLMClient', () => {
|
||||
let client: LiteLLMClient;
|
||||
|
||||
beforeEach(() => {
|
||||
client = new LiteLLMClient({ timeout: 5000 });
|
||||
});
|
||||
|
||||
describe('Constructor', () => {
|
||||
it('should create client with default config', () => {
|
||||
const defaultClient = new LiteLLMClient();
|
||||
expect(defaultClient).toBeDefined();
|
||||
});
|
||||
|
||||
it('should create client with custom config', () => {
|
||||
const customClient = new LiteLLMClient({
|
||||
pythonPath: 'python3',
|
||||
timeout: 10000
|
||||
});
|
||||
expect(customClient).toBeDefined();
|
||||
});
|
||||
});
|
||||
|
||||
describe('isAvailable', () => {
|
||||
it('should check if ccw-litellm is available', async () => {
|
||||
const available = await client.isAvailable();
|
||||
expect(typeof available).toBe('boolean');
|
||||
});
|
||||
});
|
||||
|
||||
describe('getStatus', () => {
|
||||
it('should return status object', async () => {
|
||||
const status = await client.getStatus();
|
||||
expect(status).toHaveProperty('available');
|
||||
expect(typeof status.available).toBe('boolean');
|
||||
});
|
||||
});
|
||||
|
||||
describe('embed', () => {
|
||||
it('should throw error for empty texts array', async () => {
|
||||
await expect(client.embed([])).rejects.toThrow('texts array cannot be empty');
|
||||
});
|
||||
|
||||
it('should throw error for null texts', async () => {
|
||||
await expect(client.embed(null as any)).rejects.toThrow();
|
||||
});
|
||||
});
|
||||
|
||||
describe('chat', () => {
|
||||
it('should throw error for empty message', async () => {
|
||||
await expect(client.chat('')).rejects.toThrow('message cannot be empty');
|
||||
});
|
||||
});
|
||||
|
||||
describe('chatMessages', () => {
|
||||
it('should throw error for empty messages array', async () => {
|
||||
await expect(client.chatMessages([])).rejects.toThrow('messages array cannot be empty');
|
||||
});
|
||||
|
||||
it('should throw error for null messages', async () => {
|
||||
await expect(client.chatMessages(null as any)).rejects.toThrow();
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('Singleton Functions', () => {
|
||||
describe('getLiteLLMClient', () => {
|
||||
it('should return singleton instance', () => {
|
||||
const client1 = getLiteLLMClient();
|
||||
const client2 = getLiteLLMClient();
|
||||
expect(client1).toBe(client2);
|
||||
});
|
||||
});
|
||||
|
||||
describe('checkLiteLLMAvailable', () => {
|
||||
it('should return boolean', async () => {
|
||||
const available = await checkLiteLLMAvailable();
|
||||
expect(typeof available).toBe('boolean');
|
||||
});
|
||||
});
|
||||
|
||||
describe('getLiteLLMStatus', () => {
|
||||
it('should return status object', async () => {
|
||||
const status = await getLiteLLMStatus();
|
||||
expect(status).toHaveProperty('available');
|
||||
expect(typeof status.available).toBe('boolean');
|
||||
});
|
||||
});
|
||||
});
|
||||
@@ -106,7 +106,8 @@ def init(
|
||||
workers: Optional[int] = typer.Option(None, "--workers", "-w", min=1, max=16, help="Parallel worker processes (default: auto-detect based on CPU count, max 16)."),
|
||||
force: bool = typer.Option(False, "--force", "-f", help="Force full reindex (skip incremental mode)."),
|
||||
no_embeddings: bool = typer.Option(False, "--no-embeddings", help="Skip automatic embedding generation (if semantic deps installed)."),
|
||||
embedding_model: str = typer.Option("code", "--embedding-model", help="Embedding model profile: fast, code, multilingual, balanced."),
|
||||
embedding_backend: str = typer.Option("fastembed", "--embedding-backend", help="Embedding backend: fastembed (local) or litellm (remote API)."),
|
||||
embedding_model: str = typer.Option("code", "--embedding-model", help="Embedding model: profile name for fastembed (fast/code/multilingual/balanced) or model name for litellm (e.g. text-embedding-3-small)."),
|
||||
json_mode: bool = typer.Option(False, "--json", help="Output JSON response."),
|
||||
verbose: bool = typer.Option(False, "--verbose", "-v", help="Enable debug logging."),
|
||||
) -> None:
|
||||
@@ -120,6 +121,14 @@ def init(
|
||||
|
||||
If semantic search dependencies are installed, automatically generates embeddings
|
||||
after indexing completes. Use --no-embeddings to skip this step.
|
||||
|
||||
Embedding Backend Options:
|
||||
- fastembed: Local ONNX-based embeddings (default, no API calls)
|
||||
- litellm: Remote API embeddings via ccw-litellm (requires API keys)
|
||||
|
||||
Embedding Model Options:
|
||||
- For fastembed backend: Use profile names (fast, code, multilingual, balanced)
|
||||
- For litellm backend: Use model names (e.g., text-embedding-3-small, text-embedding-ada-002)
|
||||
"""
|
||||
_configure_logging(verbose, json_mode)
|
||||
config = Config()
|
||||
@@ -171,11 +180,22 @@ def init(
|
||||
from codexlens.cli.embedding_manager import generate_embeddings_recursive, get_embeddings_status
|
||||
|
||||
if SEMANTIC_AVAILABLE:
|
||||
# Validate embedding backend
|
||||
valid_backends = ["fastembed", "litellm"]
|
||||
if embedding_backend not in valid_backends:
|
||||
error_msg = f"Invalid embedding backend: {embedding_backend}. Must be one of: {', '.join(valid_backends)}"
|
||||
if json_mode:
|
||||
print_json(success=False, error=error_msg)
|
||||
else:
|
||||
console.print(f"[red]Error:[/red] {error_msg}")
|
||||
raise typer.Exit(code=1)
|
||||
|
||||
# Use the index root directory (not the _index.db file)
|
||||
index_root = Path(build_result.index_root)
|
||||
|
||||
if not json_mode:
|
||||
console.print("\n[bold]Generating embeddings...[/bold]")
|
||||
console.print(f"Backend: [cyan]{embedding_backend}[/cyan]")
|
||||
console.print(f"Model: [cyan]{embedding_model}[/cyan]")
|
||||
else:
|
||||
# Output progress message for JSON mode (parsed by Node.js)
|
||||
@@ -196,6 +216,7 @@ def init(
|
||||
|
||||
embed_result = generate_embeddings_recursive(
|
||||
index_root,
|
||||
embedding_backend=embedding_backend,
|
||||
model_profile=embedding_model,
|
||||
force=False, # Don't force regenerate during init
|
||||
chunk_size=2000,
|
||||
@@ -1781,11 +1802,17 @@ def embeddings_generate(
|
||||
exists=True,
|
||||
help="Path to _index.db file or project directory.",
|
||||
),
|
||||
backend: str = typer.Option(
|
||||
"fastembed",
|
||||
"--backend",
|
||||
"-b",
|
||||
help="Embedding backend: fastembed (local) or litellm (remote API).",
|
||||
),
|
||||
model: str = typer.Option(
|
||||
"code",
|
||||
"--model",
|
||||
"-m",
|
||||
help="Model profile: fast, code, multilingual, balanced.",
|
||||
help="Model: profile name for fastembed (fast/code/multilingual/balanced) or model name for litellm (e.g. text-embedding-3-small).",
|
||||
),
|
||||
force: bool = typer.Option(
|
||||
False,
|
||||
@@ -1813,21 +1840,43 @@ def embeddings_generate(
|
||||
semantic search capabilities. Embeddings are stored in the same
|
||||
database as the FTS index.
|
||||
|
||||
Model Profiles:
|
||||
- fast: BAAI/bge-small-en-v1.5 (384 dims, ~80MB)
|
||||
- code: jinaai/jina-embeddings-v2-base-code (768 dims, ~150MB) [recommended]
|
||||
- multilingual: intfloat/multilingual-e5-large (1024 dims, ~1GB)
|
||||
- balanced: mixedbread-ai/mxbai-embed-large-v1 (1024 dims, ~600MB)
|
||||
Embedding Backend Options:
|
||||
- fastembed: Local ONNX-based embeddings (default, no API calls)
|
||||
- litellm: Remote API embeddings via ccw-litellm (requires API keys)
|
||||
|
||||
Model Options:
|
||||
For fastembed backend (profiles):
|
||||
- fast: BAAI/bge-small-en-v1.5 (384 dims, ~80MB)
|
||||
- code: jinaai/jina-embeddings-v2-base-code (768 dims, ~150MB) [recommended]
|
||||
- multilingual: intfloat/multilingual-e5-large (1024 dims, ~1GB)
|
||||
- balanced: mixedbread-ai/mxbai-embed-large-v1 (1024 dims, ~600MB)
|
||||
|
||||
For litellm backend (model names):
|
||||
- text-embedding-3-small, text-embedding-3-large (OpenAI)
|
||||
- text-embedding-ada-002 (OpenAI legacy)
|
||||
- Any model supported by ccw-litellm
|
||||
|
||||
Examples:
|
||||
codexlens embeddings-generate ~/projects/my-app # Auto-find index for project
|
||||
codexlens embeddings-generate ~/projects/my-app # Auto-find index (fastembed, code profile)
|
||||
codexlens embeddings-generate ~/.codexlens/indexes/project/_index.db # Specific index
|
||||
codexlens embeddings-generate ~/projects/my-app --model fast --force # Regenerate with fast model
|
||||
codexlens embeddings-generate ~/projects/my-app --backend litellm --model text-embedding-3-small # Use LiteLLM
|
||||
codexlens embeddings-generate ~/projects/my-app --model fast --force # Regenerate with fast profile
|
||||
"""
|
||||
_configure_logging(verbose, json_mode)
|
||||
|
||||
from codexlens.cli.embedding_manager import generate_embeddings, generate_embeddings_recursive
|
||||
|
||||
# Validate backend
|
||||
valid_backends = ["fastembed", "litellm"]
|
||||
if backend not in valid_backends:
|
||||
error_msg = f"Invalid backend: {backend}. Must be one of: {', '.join(valid_backends)}"
|
||||
if json_mode:
|
||||
print_json(success=False, error=error_msg)
|
||||
else:
|
||||
console.print(f"[red]Error:[/red] {error_msg}")
|
||||
console.print(f"[dim]Valid backends: {', '.join(valid_backends)}[/dim]")
|
||||
raise typer.Exit(code=1)
|
||||
|
||||
# Resolve path
|
||||
target_path = path.expanduser().resolve()
|
||||
|
||||
@@ -1877,11 +1926,13 @@ def embeddings_generate(
|
||||
console.print(f"Mode: [yellow]Recursive[/yellow]")
|
||||
else:
|
||||
console.print(f"Index: [dim]{index_path}[/dim]")
|
||||
console.print(f"Backend: [cyan]{backend}[/cyan]")
|
||||
console.print(f"Model: [cyan]{model}[/cyan]\n")
|
||||
|
||||
if use_recursive:
|
||||
result = generate_embeddings_recursive(
|
||||
index_root,
|
||||
embedding_backend=backend,
|
||||
model_profile=model,
|
||||
force=force,
|
||||
chunk_size=chunk_size,
|
||||
@@ -1890,6 +1941,7 @@ def embeddings_generate(
|
||||
else:
|
||||
result = generate_embeddings(
|
||||
index_path,
|
||||
embedding_backend=backend,
|
||||
model_profile=model,
|
||||
force=force,
|
||||
chunk_size=chunk_size,
|
||||
|
||||
@@ -191,6 +191,7 @@ def check_index_embeddings(index_path: Path) -> Dict[str, any]:
|
||||
|
||||
def generate_embeddings(
|
||||
index_path: Path,
|
||||
embedding_backend: str = "fastembed",
|
||||
model_profile: str = "code",
|
||||
force: bool = False,
|
||||
chunk_size: int = 2000,
|
||||
@@ -203,7 +204,9 @@ def generate_embeddings(
|
||||
|
||||
Args:
|
||||
index_path: Path to _index.db file
|
||||
model_profile: Model profile (fast, code, multilingual, balanced)
|
||||
embedding_backend: Embedding backend to use (fastembed or litellm)
|
||||
model_profile: Model profile for fastembed (fast, code, multilingual, balanced)
|
||||
or model name for litellm (e.g., text-embedding-3-small)
|
||||
force: If True, regenerate even if embeddings exist
|
||||
chunk_size: Maximum chunk size in characters
|
||||
progress_callback: Optional callback for progress updates
|
||||
@@ -253,8 +256,22 @@ def generate_embeddings(
|
||||
|
||||
# Initialize components
|
||||
try:
|
||||
# Initialize embedder (singleton, reused throughout the function)
|
||||
embedder = get_embedder(profile=model_profile)
|
||||
# Import factory function to support both backends
|
||||
from codexlens.semantic.factory import get_embedder as get_embedder_factory
|
||||
|
||||
# Initialize embedder using factory (supports both fastembed and litellm)
|
||||
# For fastembed: model_profile is a profile name (fast/code/multilingual/balanced)
|
||||
# For litellm: model_profile is a model name (e.g., text-embedding-3-small)
|
||||
if embedding_backend == "fastembed":
|
||||
embedder = get_embedder_factory(backend="fastembed", profile=model_profile, use_gpu=True)
|
||||
elif embedding_backend == "litellm":
|
||||
embedder = get_embedder_factory(backend="litellm", model=model_profile)
|
||||
else:
|
||||
return {
|
||||
"success": False,
|
||||
"error": f"Invalid embedding backend: {embedding_backend}. Must be 'fastembed' or 'litellm'.",
|
||||
}
|
||||
|
||||
# skip_token_count=True: Use fast estimation (len/4) instead of expensive tiktoken
|
||||
# This significantly reduces CPU usage with minimal impact on metadata accuracy
|
||||
chunker = Chunker(config=ChunkConfig(max_chunk_size=chunk_size, skip_token_count=True))
|
||||
@@ -428,6 +445,7 @@ def find_all_indexes(scan_dir: Path) -> List[Path]:
|
||||
|
||||
def generate_embeddings_recursive(
|
||||
index_root: Path,
|
||||
embedding_backend: str = "fastembed",
|
||||
model_profile: str = "code",
|
||||
force: bool = False,
|
||||
chunk_size: int = 2000,
|
||||
@@ -437,7 +455,9 @@ def generate_embeddings_recursive(
|
||||
|
||||
Args:
|
||||
index_root: Root index directory containing _index.db files
|
||||
model_profile: Model profile (fast, code, multilingual, balanced)
|
||||
embedding_backend: Embedding backend to use (fastembed or litellm)
|
||||
model_profile: Model profile for fastembed (fast, code, multilingual, balanced)
|
||||
or model name for litellm (e.g., text-embedding-3-small)
|
||||
force: If True, regenerate even if embeddings exist
|
||||
chunk_size: Maximum chunk size in characters
|
||||
progress_callback: Optional callback for progress updates
|
||||
@@ -474,6 +494,7 @@ def generate_embeddings_recursive(
|
||||
|
||||
result = generate_embeddings(
|
||||
index_path,
|
||||
embedding_backend=embedding_backend,
|
||||
model_profile=model_profile,
|
||||
force=force,
|
||||
chunk_size=chunk_size,
|
||||
|
||||
@@ -67,10 +67,29 @@ def check_gpu_available() -> tuple[bool, str]:
|
||||
return False, "GPU support module not available"
|
||||
|
||||
|
||||
# Export embedder components
|
||||
# BaseEmbedder is always available (abstract base class)
|
||||
from .base import BaseEmbedder
|
||||
|
||||
# Factory function for creating embedders
|
||||
from .factory import get_embedder as get_embedder_factory
|
||||
|
||||
# Optional: LiteLLMEmbedderWrapper (only if ccw-litellm is installed)
|
||||
try:
|
||||
from .litellm_embedder import LiteLLMEmbedderWrapper
|
||||
_LITELLM_AVAILABLE = True
|
||||
except ImportError:
|
||||
LiteLLMEmbedderWrapper = None
|
||||
_LITELLM_AVAILABLE = False
|
||||
|
||||
|
||||
__all__ = [
|
||||
"SEMANTIC_AVAILABLE",
|
||||
"SEMANTIC_BACKEND",
|
||||
"GPU_AVAILABLE",
|
||||
"check_semantic_available",
|
||||
"check_gpu_available",
|
||||
"BaseEmbedder",
|
||||
"get_embedder_factory",
|
||||
"LiteLLMEmbedderWrapper",
|
||||
]
|
||||
|
||||
51
codex-lens/src/codexlens/semantic/base.py
Normal file
@@ -0,0 +1,51 @@
|
||||
"""Base class for embedders.
|
||||
|
||||
Defines the interface that all embedders must implement.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Iterable
|
||||
|
||||
import numpy as np
|
||||
|
||||
|
||||
class BaseEmbedder(ABC):
|
||||
"""Base class for all embedders.
|
||||
|
||||
All embedder implementations must inherit from this class and implement
|
||||
the abstract methods to ensure a consistent interface.
|
||||
"""
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def embedding_dim(self) -> int:
|
||||
"""Return embedding dimensions.
|
||||
|
||||
Returns:
|
||||
int: Dimension of the embedding vectors.
|
||||
"""
|
||||
...
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def model_name(self) -> str:
|
||||
"""Return model name.
|
||||
|
||||
Returns:
|
||||
str: Name or identifier of the underlying model.
|
||||
"""
|
||||
...
|
||||
|
||||
@abstractmethod
|
||||
def embed_to_numpy(self, texts: str | Iterable[str]) -> np.ndarray:
|
||||
"""Embed texts to numpy array.
|
||||
|
||||
Args:
|
||||
texts: Single text or iterable of texts to embed.
|
||||
|
||||
Returns:
|
||||
numpy.ndarray: Array of shape (n_texts, embedding_dim) containing embeddings.
|
||||
"""
|
||||
...
@@ -14,6 +14,7 @@ from typing import Dict, Iterable, List, Optional
 import numpy as np
 
 from . import SEMANTIC_AVAILABLE
+from .base import BaseEmbedder
 from .gpu_support import get_optimal_providers, is_gpu_available, get_gpu_summary, get_selected_device_id
 
 logger = logging.getLogger(__name__)
@@ -84,7 +85,7 @@ def clear_embedder_cache() -> None:
     gc.collect()
 
 
-class Embedder:
+class Embedder(BaseEmbedder):
     """Generate embeddings for code chunks using fastembed (ONNX-based).
 
     Supported Model Profiles:
@@ -138,11 +139,11 @@ class Embedder:
 
         # Resolve model name from profile or use explicit name
         if model_name:
-            self.model_name = model_name
+            self._model_name = model_name
         elif profile and profile in self.MODELS:
-            self.model_name = self.MODELS[profile]
+            self._model_name = self.MODELS[profile]
         else:
-            self.model_name = self.DEFAULT_MODEL
+            self._model_name = self.DEFAULT_MODEL
 
         # Configure ONNX execution providers with device_id options for GPU selection
         # Using with_device_options=True ensures DirectML/CUDA device_id is passed correctly
@@ -154,10 +155,15 @@ class Embedder:
         self._use_gpu = use_gpu
         self._model = None
 
+    @property
+    def model_name(self) -> str:
+        """Get model name."""
+        return self._model_name
+
     @property
     def embedding_dim(self) -> int:
         """Get embedding dimension for current model."""
-        return self.MODEL_DIMS.get(self.model_name, 768)  # Default to 768 if unknown
+        return self.MODEL_DIMS.get(self._model_name, 768)  # Default to 768 if unknown
 
     @property
     def providers(self) -> List[str]:
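With this refactor the existing fastembed `Embedder` stores its model in `_model_name` and exposes `model_name` as a read-only property, so it satisfies the `BaseEmbedder` contract. A short sketch of what callers see after the change (the constructor arguments match the factory call in this commit; the printed values depend on the MODELS/MODEL_DIMS tables not shown here):

```python
from codexlens.semantic.embedder import Embedder

embedder = Embedder(profile="code", use_gpu=False)
print(embedder.model_name)     # resolved from the "code" profile
print(embedder.embedding_dim)  # looked up in MODEL_DIMS, 768 if unknown

# model_name is now a property without a setter, so assigning to it raises AttributeError.
```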
61
codex-lens/src/codexlens/semantic/factory.py
Normal file
@@ -0,0 +1,61 @@
"""Factory for creating embedders.

Provides a unified interface for instantiating different embedder backends.
"""

from __future__ import annotations

from typing import Any

from .base import BaseEmbedder


def get_embedder(
    backend: str = "fastembed",
    profile: str = "code",
    model: str = "default",
    use_gpu: bool = True,
    **kwargs: Any,
) -> BaseEmbedder:
    """Factory function to create embedder based on backend.

    Args:
        backend: Embedder backend to use. Options:
            - "fastembed": Use fastembed (ONNX-based) embedder (default)
            - "litellm": Use ccw-litellm embedder
        profile: Model profile for fastembed backend ("fast", "code", "multilingual", "balanced")
            Used only when backend="fastembed". Default: "code"
        model: Model identifier for litellm backend.
            Used only when backend="litellm". Default: "default"
        use_gpu: Whether to use GPU acceleration when available (default: True).
            Used only when backend="fastembed".
        **kwargs: Additional backend-specific arguments

    Returns:
        BaseEmbedder: Configured embedder instance

    Raises:
        ValueError: If backend is not recognized
        ImportError: If required backend dependencies are not installed

    Examples:
        Create fastembed embedder with code profile:
        >>> embedder = get_embedder(backend="fastembed", profile="code")

        Create fastembed embedder with fast profile and CPU only:
        >>> embedder = get_embedder(backend="fastembed", profile="fast", use_gpu=False)

        Create litellm embedder:
        >>> embedder = get_embedder(backend="litellm", model="text-embedding-3-small")
    """
    if backend == "fastembed":
        from .embedder import Embedder
        return Embedder(profile=profile, use_gpu=use_gpu, **kwargs)
    elif backend == "litellm":
        from .litellm_embedder import LiteLLMEmbedderWrapper
        return LiteLLMEmbedderWrapper(model=model, **kwargs)
    else:
        raise ValueError(
            f"Unknown backend: {backend}. "
            f"Supported backends: 'fastembed', 'litellm'"
        )
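The factory imports each backend lazily inside its branch, so ccw-litellm stays an optional dependency until `backend="litellm"` is actually requested. A sketch, beyond the doctest examples above, of how indexing code can stay backend-agnostic because the return type is `BaseEmbedder`; the function name and chunk list are illustrative only:

```python
from codexlens.semantic.factory import get_embedder


def embed_chunks(chunks: list[str], backend: str = "fastembed"):
    # Works the same whether the backend is fastembed or litellm.
    embedder = get_embedder(backend=backend)
    vectors = embedder.embed_to_numpy(chunks)
    assert vectors.shape == (len(chunks), embedder.embedding_dim)
    return vectors
```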
79
codex-lens/src/codexlens/semantic/litellm_embedder.py
Normal file
@@ -0,0 +1,79 @@
"""LiteLLM embedder wrapper for CodexLens.

Provides integration with ccw-litellm's LiteLLMEmbedder for embedding generation.
"""

from __future__ import annotations

from typing import Iterable

import numpy as np

from .base import BaseEmbedder


class LiteLLMEmbedderWrapper(BaseEmbedder):
    """Wrapper for ccw-litellm LiteLLMEmbedder.

    This wrapper adapts the ccw-litellm LiteLLMEmbedder to the CodexLens
    BaseEmbedder interface, enabling seamless integration with CodexLens
    semantic search functionality.

    Args:
        model: Model identifier for LiteLLM (default: "default")
        **kwargs: Additional arguments passed to LiteLLMEmbedder

    Raises:
        ImportError: If ccw-litellm package is not installed
    """

    def __init__(self, model: str = "default", **kwargs) -> None:
        """Initialize LiteLLM embedder wrapper.

        Args:
            model: Model identifier for LiteLLM (default: "default")
            **kwargs: Additional arguments passed to LiteLLMEmbedder

        Raises:
            ImportError: If ccw-litellm package is not installed
        """
        try:
            from ccw_litellm import LiteLLMEmbedder
            self._embedder = LiteLLMEmbedder(model=model, **kwargs)
        except ImportError as e:
            raise ImportError(
                "ccw-litellm not installed. Install with: pip install ccw-litellm"
            ) from e

    @property
    def embedding_dim(self) -> int:
        """Return embedding dimensions from LiteLLMEmbedder.

        Returns:
            int: Dimension of the embedding vectors.
        """
        return self._embedder.dimensions

    @property
    def model_name(self) -> str:
        """Return model name from LiteLLMEmbedder.

        Returns:
            str: Name or identifier of the underlying model.
        """
        return self._embedder.model_name

    def embed_to_numpy(self, texts: str | Iterable[str]) -> np.ndarray:
        """Embed texts to numpy array using LiteLLMEmbedder.

        Args:
            texts: Single text or iterable of texts to embed.

        Returns:
            numpy.ndarray: Array of shape (n_texts, embedding_dim) containing embeddings.
        """
        if isinstance(texts, str):
            texts = [texts]
        else:
            texts = list(texts)
        return self._embedder.embed(texts)
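A minimal usage sketch, assuming ccw-litellm is installed and a default embedding model is configured for it; the sample chunks are illustrative:

```python
from codexlens.semantic.litellm_embedder import LiteLLMEmbedderWrapper

embedder = LiteLLMEmbedderWrapper(model="default")
vectors = embedder.embed_to_numpy(["def add(a, b):", "    return a + b"])

# The wrapper normalizes single strings and iterables before delegating to LiteLLMEmbedder.embed.
print(embedder.model_name, embedder.embedding_dim, vectors.shape)
```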