feat: Add unified LiteLLM API management with dashboard UI and CLI integration

- Create ccw-litellm Python package with AbstractEmbedder and AbstractLLMClient interfaces
- Add BaseEmbedder abstraction and factory pattern to codex-lens for pluggable backends
- Implement API Settings dashboard page for provider credentials and custom endpoints
- Add REST API routes for CRUD operations on providers and endpoints
- Extend CLI with --model parameter for custom endpoint routing
- Integrate existing context-cache for @pattern file resolution
- Add provider model registry with predefined models per provider type
- Include i18n translations (en/zh) for all new UI elements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
catlog22 committed 2025-12-23 20:36:32 +08:00
commit bf66b095c7 (parent 5228581324)
44 changed files with 4948 additions and 19 deletions

ccw-litellm/README.md

@@ -0,0 +1,180 @@
# ccw-litellm
Unified LiteLLM interface layer shared by ccw and codex-lens projects.
## Features
- **Unified LLM Interface**: Abstract interface for LLM operations (chat, completion)
- **Unified Embedding Interface**: Abstract interface for text embeddings
- **Multi-Provider Support**: OpenAI, Anthropic, Azure, and more via LiteLLM
- **Configuration Management**: YAML-based configuration with environment variable substitution
- **Type Safety**: Full type annotations with Pydantic models
## Installation
```bash
pip install -e .
```
## Quick Start
### Configuration
Create a configuration file at `~/.ccw/config/litellm-config.yaml`:
```yaml
version: 1
default_provider: openai
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1
llm_models:
  default:
    provider: openai
    model: gpt-4
embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
```
### Usage
#### LLM Client
```python
from ccw_litellm import LiteLLMClient, ChatMessage
# Initialize client with default model
client = LiteLLMClient(model="default")
# Chat completion
messages = [
    ChatMessage(role="user", content="Hello, how are you?")
]
response = client.chat(messages)
print(response.content)
# Text completion
response = client.complete("Once upon a time")
print(response.content)
```
#### Embedder
```python
from ccw_litellm import LiteLLMEmbedder
# Initialize embedder with default model
embedder = LiteLLMEmbedder(model="default")
# Embed single text
vector = embedder.embed("Hello world")
print(vector.shape) # (1, 1536)
# Embed multiple texts
vectors = embedder.embed(["Text 1", "Text 2", "Text 3"])
print(vectors.shape) # (3, 1536)
```
#### Custom Configuration
```python
from ccw_litellm import LiteLLMClient, load_config
# Load custom configuration
config = load_config("/path/to/custom-config.yaml")
# Use custom configuration
client = LiteLLMClient(model="fast", config=config)
```
## Configuration Reference
### Provider Configuration
```yaml
providers:
  <provider_name>:
    api_key: <api_key_or_${ENV_VAR}>
    api_base: <base_url>
```
Supported providers: `openai`, `anthropic`, `azure`, `vertex_ai`, `bedrock`, etc.
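Providers that need only an API key or only a base URL may omit the other field. For example, the bundled example configuration in this commit defines:
```yaml
providers:
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
  ollama:
    api_base: http://localhost:11434
```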
### LLM Model Configuration
```yaml
llm_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
```
### Embedding Model Configuration
```yaml
embedding_models:
  <model_name>:
    provider: <provider_name>
    model: <model_identifier>
    dimensions: <embedding_dimensions>
```
## Environment Variables
The configuration supports environment variable substitution using the `${VAR}` or `${VAR:-default}` syntax:
```yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}                               # Required
    api_base: ${OPENAI_API_BASE:-https://api.openai.com/v1}  # With default
```
## API Reference
### Interfaces
- `AbstractLLMClient`: Abstract base class for LLM clients
- `AbstractEmbedder`: Abstract base class for embedders
- `ChatMessage`: Message data class (role, content)
- `LLMResponse`: Response data class (content, raw)
### Implementations
- `LiteLLMClient`: LiteLLM implementation of AbstractLLMClient
- `LiteLLMEmbedder`: LiteLLM implementation of AbstractEmbedder
### Configuration
- `LiteLLMConfig`: Root configuration model
- `ProviderConfig`: Provider configuration model
- `LLMModelConfig`: LLM model configuration model
- `EmbeddingModelConfig`: Embedding model configuration model
- `load_config(path)`: Load configuration from YAML file
- `get_config(path, reload)`: Get global configuration singleton
- `reset_config()`: Reset global configuration (for testing)
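### Implementing the Interfaces
The abstract base classes can be subclassed directly, for example to stub out LLM calls in tests or to plug in a non-LiteLLM backend. A minimal sketch using only the public names listed above (the class names here are illustrative):
```python
import numpy as np

from ccw_litellm import AbstractEmbedder, AbstractLLMClient, ChatMessage, LLMResponse


class EchoLLM(AbstractLLMClient):
    """Toy client that echoes the last message back."""

    def chat(self, messages, **kwargs):
        return LLMResponse(content=messages[-1].content)

    def complete(self, prompt, **kwargs):
        return LLMResponse(content=prompt)


class ZeroEmbedder(AbstractEmbedder):
    """Toy embedder that returns zero vectors."""

    @property
    def dimensions(self):
        return 8

    def embed(self, texts, *, batch_size=None, **kwargs):
        texts = [texts] if isinstance(texts, str) else list(texts)
        return np.zeros((len(texts), self.dimensions), dtype=np.float32)


print(EchoLLM().chat([ChatMessage(role="user", content="hi")]).content)  # hi
print(ZeroEmbedder().embed(["a", "b"]).shape)                            # (2, 8)
```
The async wrappers (`achat`, `acomplete`, `aembed`) are inherited from the base classes, so subclasses only need to provide the synchronous methods.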
## Development
### Running Tests
```bash
pytest tests/ -v
```
### Type Checking
```bash
mypy src/ccw_litellm
```
## License
MIT


@@ -0,0 +1,53 @@
# LiteLLM Unified Configuration
# Copy to ~/.ccw/config/litellm-config.yaml
version: 1

# Default provider for LLM calls
default_provider: openai

# Provider configurations
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    api_base: https://api.openai.com/v1
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
  ollama:
    api_base: http://localhost:11434
  azure:
    api_key: ${AZURE_API_KEY}
    api_base: ${AZURE_API_BASE}

# LLM model configurations
llm_models:
  default:
    provider: openai
    model: gpt-4o
  fast:
    provider: openai
    model: gpt-4o-mini
  claude:
    provider: anthropic
    model: claude-sonnet-4-20250514
  local:
    provider: ollama
    model: llama3.2

# Embedding model configurations
embedding_models:
  default:
    provider: openai
    model: text-embedding-3-small
    dimensions: 1536
  large:
    provider: openai
    model: text-embedding-3-large
    dimensions: 3072
  ada:
    provider: openai
    model: text-embedding-ada-002
    dimensions: 1536


@@ -0,0 +1,35 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "ccw-litellm"
version = "0.1.0"
description = "Unified LiteLLM interface layer shared by ccw and codex-lens"
requires-python = ">=3.10"
authors = [{ name = "ccw-litellm contributors" }]
dependencies = [
"litellm>=1.0.0",
"pyyaml",
"numpy",
"pydantic>=2.0",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0",
]
[project.scripts]
ccw-litellm = "ccw_litellm.cli:main"
[tool.setuptools]
package-dir = { "" = "src" }
[tool.setuptools.packages.find]
where = ["src"]
include = ["ccw_litellm*"]
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q"


@@ -0,0 +1,12 @@
Metadata-Version: 2.4
Name: ccw-litellm
Version: 0.1.0
Summary: Unified LiteLLM interface layer shared by ccw and codex-lens
Author: ccw-litellm contributors
Requires-Python: >=3.10
Requires-Dist: litellm>=1.0.0
Requires-Dist: pyyaml
Requires-Dist: numpy
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"


@@ -0,0 +1,17 @@
pyproject.toml
src/ccw_litellm/__init__.py
src/ccw_litellm.egg-info/PKG-INFO
src/ccw_litellm.egg-info/SOURCES.txt
src/ccw_litellm.egg-info/dependency_links.txt
src/ccw_litellm.egg-info/requires.txt
src/ccw_litellm.egg-info/top_level.txt
src/ccw_litellm/clients/__init__.py
src/ccw_litellm/clients/litellm_embedder.py
src/ccw_litellm/clients/litellm_llm.py
src/ccw_litellm/config/__init__.py
src/ccw_litellm/config/loader.py
src/ccw_litellm/config/models.py
src/ccw_litellm/interfaces/__init__.py
src/ccw_litellm/interfaces/embedder.py
src/ccw_litellm/interfaces/llm.py
tests/test_interfaces.py


@@ -0,0 +1 @@


@@ -0,0 +1,7 @@
litellm>=1.0.0
pyyaml
numpy
pydantic>=2.0
[dev]
pytest>=7.0


@@ -0,0 +1 @@
ccw_litellm


@@ -0,0 +1,47 @@
"""ccw-litellm package.
This package provides a small, stable interface layer around LiteLLM to share
between the ccw and codex-lens projects.
"""
from __future__ import annotations
from .clients import LiteLLMClient, LiteLLMEmbedder
from .config import (
EmbeddingModelConfig,
LiteLLMConfig,
LLMModelConfig,
ProviderConfig,
get_config,
load_config,
reset_config,
)
from .interfaces import (
AbstractEmbedder,
AbstractLLMClient,
ChatMessage,
LLMResponse,
)
__version__ = "0.1.0"
__all__ = [
"__version__",
# Abstract interfaces
"AbstractEmbedder",
"AbstractLLMClient",
"ChatMessage",
"LLMResponse",
# Client implementations
"LiteLLMClient",
"LiteLLMEmbedder",
# Configuration
"LiteLLMConfig",
"ProviderConfig",
"LLMModelConfig",
"EmbeddingModelConfig",
"load_config",
"get_config",
"reset_config",
]


@@ -0,0 +1,108 @@
"""CLI entry point for ccw-litellm."""
from __future__ import annotations
import argparse
import json
import sys
from pathlib import Path
def main() -> int:
"""Main CLI entry point."""
parser = argparse.ArgumentParser(
prog="ccw-litellm",
description="Unified LiteLLM interface for ccw and codex-lens",
)
subparsers = parser.add_subparsers(dest="command", help="Available commands")
# config command
config_parser = subparsers.add_parser("config", help="Show configuration")
config_parser.add_argument(
"--path",
type=Path,
help="Configuration file path",
)
# embed command
embed_parser = subparsers.add_parser("embed", help="Generate embeddings")
embed_parser.add_argument("texts", nargs="+", help="Texts to embed")
embed_parser.add_argument(
"--model",
default="default",
help="Embedding model name (default: default)",
)
embed_parser.add_argument(
"--output",
choices=["json", "shape"],
default="shape",
help="Output format (default: shape)",
)
# chat command
chat_parser = subparsers.add_parser("chat", help="Chat with LLM")
chat_parser.add_argument("message", help="Message to send")
chat_parser.add_argument(
"--model",
default="default",
help="LLM model name (default: default)",
)
# version command
subparsers.add_parser("version", help="Show version")
args = parser.parse_args()
if args.command == "version":
from . import __version__
print(f"ccw-litellm {__version__}")
return 0
if args.command == "config":
from .config import get_config
try:
config = get_config(config_path=args.path if hasattr(args, "path") else None)
print(config.model_dump_json(indent=2))
except Exception as e:
print(f"Error loading config: {e}", file=sys.stderr)
return 1
return 0
if args.command == "embed":
from .clients import LiteLLMEmbedder
try:
embedder = LiteLLMEmbedder(model=args.model)
vectors = embedder.embed(args.texts)
if args.output == "json":
print(json.dumps(vectors.tolist()))
else:
print(f"Shape: {vectors.shape}")
print(f"Dimensions: {embedder.dimensions}")
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
return 1
return 0
if args.command == "chat":
from .clients import LiteLLMClient
from .interfaces import ChatMessage
try:
client = LiteLLMClient(model=args.model)
response = client.chat([ChatMessage(role="user", content=args.message)])
print(response.content)
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
return 1
return 0
parser.print_help()
return 0
if __name__ == "__main__":
sys.exit(main())
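For reference, a quick usage sketch of the subcommands defined above (assuming the package is installed and the named models exist in the active configuration):
```bash
ccw-litellm version
ccw-litellm config --path ~/.ccw/config/litellm-config.yaml
ccw-litellm embed "Hello world" "Another text" --output shape
ccw-litellm chat "Say hello" --model fast
```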


@@ -0,0 +1,12 @@
"""Client implementations for ccw-litellm."""
from __future__ import annotations
from .litellm_embedder import LiteLLMEmbedder
from .litellm_llm import LiteLLMClient
__all__ = [
"LiteLLMClient",
"LiteLLMEmbedder",
]


@@ -0,0 +1,170 @@
"""LiteLLM embedder implementation for text embeddings."""
from __future__ import annotations
import logging
from typing import Any, Sequence
import litellm
import numpy as np
from numpy.typing import NDArray
from ..config import LiteLLMConfig, get_config
from ..interfaces.embedder import AbstractEmbedder
logger = logging.getLogger(__name__)
class LiteLLMEmbedder(AbstractEmbedder):
"""LiteLLM embedder implementation.
Supports multiple embedding providers (OpenAI, etc.) through LiteLLM's unified interface.
Example:
embedder = LiteLLMEmbedder(model="default")
vectors = embedder.embed(["Hello world", "Another text"])
print(vectors.shape) # (2, 1536)
"""
def __init__(
self,
model: str = "default",
config: LiteLLMConfig | None = None,
**litellm_kwargs: Any,
) -> None:
"""Initialize LiteLLM embedder.
Args:
model: Model name from configuration (default: "default")
config: Configuration instance (default: use global config)
**litellm_kwargs: Additional arguments to pass to litellm.embedding()
"""
self._config = config or get_config()
self._model_name = model
self._litellm_kwargs = litellm_kwargs
# Get embedding model configuration
try:
self._model_config = self._config.get_embedding_model(model)
except ValueError as e:
logger.error(f"Failed to get embedding model configuration: {e}")
raise
# Get provider configuration
try:
self._provider_config = self._config.get_provider(self._model_config.provider)
except ValueError as e:
logger.error(f"Failed to get provider configuration: {e}")
raise
# Set up LiteLLM environment
self._setup_litellm()
def _setup_litellm(self) -> None:
"""Configure LiteLLM with provider settings."""
provider = self._model_config.provider
# Set API key
if self._provider_config.api_key:
litellm.api_key = self._provider_config.api_key
# Also set environment-specific keys
if provider == "openai":
litellm.openai_key = self._provider_config.api_key
elif provider == "anthropic":
litellm.anthropic_key = self._provider_config.api_key
# Set API base
if self._provider_config.api_base:
litellm.api_base = self._provider_config.api_base
def _format_model_name(self) -> str:
"""Format model name for LiteLLM.
Returns:
Formatted model name (e.g., "text-embedding-3-small")
"""
provider = self._model_config.provider
model = self._model_config.model
# For some providers, LiteLLM expects explicit prefix
if provider in ["azure", "vertex_ai", "bedrock"]:
return f"{provider}/{model}"
return model
@property
def dimensions(self) -> int:
"""Embedding vector size."""
return self._model_config.dimensions
def embed(
self,
texts: str | Sequence[str],
*,
batch_size: int | None = None,
**kwargs: Any,
) -> NDArray[np.floating]:
"""Embed one or more texts.
Args:
texts: Single text or sequence of texts
batch_size: Batch size for processing (currently unused, LiteLLM handles batching)
**kwargs: Additional arguments for litellm.embedding()
Returns:
A numpy array of shape (n_texts, dimensions).
Raises:
Exception: If LiteLLM embedding fails
"""
# Normalize input to list
if isinstance(texts, str):
text_list = [texts]
single_input = True
else:
text_list = list(texts)
single_input = False
if not text_list:
# Return empty array with correct shape
return np.empty((0, self.dimensions), dtype=np.float32)
# Merge kwargs
embedding_kwargs = {**self._litellm_kwargs, **kwargs}
try:
# Call LiteLLM embedding
response = litellm.embedding(
model=self._format_model_name(),
input=text_list,
**embedding_kwargs,
)
# Extract embeddings
embeddings = [item["embedding"] for item in response.data]
# Convert to numpy array
result = np.array(embeddings, dtype=np.float32)
# Validate dimensions
if result.shape[1] != self.dimensions:
logger.warning(
f"Expected {self.dimensions} dimensions, got {result.shape[1]}. "
f"Configuration may be incorrect."
)
return result
except Exception as e:
logger.error(f"LiteLLM embedding failed: {e}")
raise
@property
def model_name(self) -> str:
"""Get configured model name."""
return self._model_name
@property
def provider(self) -> str:
"""Get configured provider name."""
return self._model_config.provider


@@ -0,0 +1,165 @@
"""LiteLLM client implementation for LLM operations."""
from __future__ import annotations
import logging
from typing import Any, Sequence
import litellm
from ..config import LiteLLMConfig, get_config
from ..interfaces.llm import AbstractLLMClient, ChatMessage, LLMResponse
logger = logging.getLogger(__name__)
class LiteLLMClient(AbstractLLMClient):
"""LiteLLM client implementation.
Supports multiple providers (OpenAI, Anthropic, etc.) through LiteLLM's unified interface.
Example:
client = LiteLLMClient(model="default")
response = client.chat([
ChatMessage(role="user", content="Hello!")
])
print(response.content)
"""
def __init__(
self,
model: str = "default",
config: LiteLLMConfig | None = None,
**litellm_kwargs: Any,
) -> None:
"""Initialize LiteLLM client.
Args:
model: Model name from configuration (default: "default")
config: Configuration instance (default: use global config)
**litellm_kwargs: Additional arguments to pass to litellm.completion()
"""
self._config = config or get_config()
self._model_name = model
self._litellm_kwargs = litellm_kwargs
# Get model configuration
try:
self._model_config = self._config.get_llm_model(model)
except ValueError as e:
logger.error(f"Failed to get model configuration: {e}")
raise
# Get provider configuration
try:
self._provider_config = self._config.get_provider(self._model_config.provider)
except ValueError as e:
logger.error(f"Failed to get provider configuration: {e}")
raise
# Set up LiteLLM environment
self._setup_litellm()
def _setup_litellm(self) -> None:
"""Configure LiteLLM with provider settings."""
provider = self._model_config.provider
# Set API key
if self._provider_config.api_key:
litellm.api_key = self._provider_config.api_key
# Also set environment-specific keys
if provider == "openai":
litellm.openai_key = self._provider_config.api_key
elif provider == "anthropic":
litellm.anthropic_key = self._provider_config.api_key
# Set API base
if self._provider_config.api_base:
litellm.api_base = self._provider_config.api_base
def _format_model_name(self) -> str:
"""Format model name for LiteLLM.
Returns:
Formatted model name (e.g., "gpt-4", "claude-3-opus-20240229")
"""
# LiteLLM expects model names in format: "provider/model" or just "model"
# If provider is explicit, use provider/model format
provider = self._model_config.provider
model = self._model_config.model
# For some providers, LiteLLM expects explicit prefix
if provider in ["anthropic", "azure", "vertex_ai", "bedrock"]:
return f"{provider}/{model}"
return model
def chat(
self,
messages: Sequence[ChatMessage],
**kwargs: Any,
) -> LLMResponse:
"""Chat completion for a sequence of messages.
Args:
messages: Sequence of chat messages
**kwargs: Additional arguments for litellm.completion()
Returns:
LLM response with content and raw response
Raises:
Exception: If LiteLLM completion fails
"""
# Convert messages to LiteLLM format
litellm_messages = [
{"role": msg.role, "content": msg.content} for msg in messages
]
# Merge kwargs
completion_kwargs = {**self._litellm_kwargs, **kwargs}
try:
# Call LiteLLM
response = litellm.completion(
model=self._format_model_name(),
messages=litellm_messages,
**completion_kwargs,
)
# Extract content
content = response.choices[0].message.content or ""
return LLMResponse(content=content, raw=response)
except Exception as e:
logger.error(f"LiteLLM completion failed: {e}")
raise
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
"""Text completion for a prompt.
Args:
prompt: Input prompt
**kwargs: Additional arguments for litellm.completion()
Returns:
LLM response with content and raw response
Raises:
Exception: If LiteLLM completion fails
"""
# Convert to chat format (most modern models use chat interface)
messages = [ChatMessage(role="user", content=prompt)]
return self.chat(messages, **kwargs)
@property
def model_name(self) -> str:
"""Get configured model name."""
return self._model_name
@property
def provider(self) -> str:
"""Get configured provider name."""
return self._model_config.provider


@@ -0,0 +1,22 @@
"""Configuration management for LiteLLM integration."""
from __future__ import annotations
from .loader import get_config, load_config, reset_config
from .models import (
EmbeddingModelConfig,
LiteLLMConfig,
LLMModelConfig,
ProviderConfig,
)
__all__ = [
"LiteLLMConfig",
"ProviderConfig",
"LLMModelConfig",
"EmbeddingModelConfig",
"load_config",
"get_config",
"reset_config",
]


@@ -0,0 +1,150 @@
"""Configuration loader with environment variable substitution."""
from __future__ import annotations
import os
import re
from pathlib import Path
from typing import Any
import yaml
from .models import LiteLLMConfig
# Default configuration path
DEFAULT_CONFIG_PATH = Path.home() / ".ccw" / "config" / "litellm-config.yaml"
# Global configuration singleton
_config_instance: LiteLLMConfig | None = None
def _substitute_env_vars(value: Any) -> Any:
"""Recursively substitute environment variables in configuration values.
Supports ${ENV_VAR} and ${ENV_VAR:-default} syntax.
Args:
value: Configuration value (str, dict, list, or primitive)
Returns:
Value with environment variables substituted
"""
if isinstance(value, str):
# Pattern: ${VAR} or ${VAR:-default}
pattern = r"\$\{([^:}]+)(?::-(.*?))?\}"
def replace_var(match: re.Match) -> str:
var_name = match.group(1)
default_value = match.group(2) if match.group(2) is not None else ""
return os.environ.get(var_name, default_value)
return re.sub(pattern, replace_var, value)
if isinstance(value, dict):
return {k: _substitute_env_vars(v) for k, v in value.items()}
if isinstance(value, list):
return [_substitute_env_vars(item) for item in value]
return value
def _get_default_config() -> dict[str, Any]:
"""Get default configuration when no config file exists.
Returns:
Default configuration dictionary
"""
return {
"version": 1,
"default_provider": "openai",
"providers": {
"openai": {
"api_key": "${OPENAI_API_KEY}",
"api_base": "https://api.openai.com/v1",
},
},
"llm_models": {
"default": {
"provider": "openai",
"model": "gpt-4",
},
"fast": {
"provider": "openai",
"model": "gpt-3.5-turbo",
},
},
"embedding_models": {
"default": {
"provider": "openai",
"model": "text-embedding-3-small",
"dimensions": 1536,
},
},
}
def load_config(config_path: Path | str | None = None) -> LiteLLMConfig:
"""Load LiteLLM configuration from YAML file.
Args:
config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)
Returns:
Parsed and validated configuration
Raises:
FileNotFoundError: If config file not found and no default available
ValueError: If configuration is invalid
"""
if config_path is None:
config_path = DEFAULT_CONFIG_PATH
else:
config_path = Path(config_path)
# Load configuration
if config_path.exists():
try:
with open(config_path, "r", encoding="utf-8") as f:
raw_config = yaml.safe_load(f)
except Exception as e:
raise ValueError(f"Failed to load configuration from {config_path}: {e}") from e
else:
# Use default configuration
raw_config = _get_default_config()
# Substitute environment variables
config_data = _substitute_env_vars(raw_config)
# Validate and parse with Pydantic
try:
return LiteLLMConfig.model_validate(config_data)
except Exception as e:
raise ValueError(f"Invalid configuration: {e}") from e
def get_config(config_path: Path | str | None = None, reload: bool = False) -> LiteLLMConfig:
"""Get global configuration singleton.
Args:
config_path: Path to configuration file (default: ~/.ccw/config/litellm-config.yaml)
reload: Force reload configuration from disk
Returns:
Global configuration instance
"""
global _config_instance
if _config_instance is None or reload:
_config_instance = load_config(config_path)
return _config_instance
def reset_config() -> None:
"""Reset global configuration singleton.
Useful for testing.
"""
global _config_instance
_config_instance = None
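A small sketch of the loading behaviour implemented above; the key value is illustrative, and with no config file on disk `load_config` falls back to the built-in defaults:
```python
import os

from ccw_litellm.config import load_config

os.environ["OPENAI_API_KEY"] = "sk-test"   # illustrative value
cfg = load_config()                        # reads ~/.ccw/config/litellm-config.yaml if present
print(cfg.providers["openai"].api_key)     # "sk-test" after ${OPENAI_API_KEY} substitution
print(cfg.get_llm_model("default").model)  # "gpt-4" with the built-in default configuration
```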


@@ -0,0 +1,130 @@
"""Pydantic configuration models for LiteLLM integration."""
from __future__ import annotations
from typing import Any
from pydantic import BaseModel, Field
class ProviderConfig(BaseModel):
"""Provider API configuration.
Supports environment variable substitution in the format ${ENV_VAR}.
"""
api_key: str | None = None
api_base: str | None = None
model_config = {"extra": "allow"}
class LLMModelConfig(BaseModel):
"""LLM model configuration."""
provider: str
model: str
model_config = {"extra": "allow"}
class EmbeddingModelConfig(BaseModel):
"""Embedding model configuration."""
provider: str # "openai", "fastembed", "ollama", etc.
model: str
dimensions: int
model_config = {"extra": "allow"}
class LiteLLMConfig(BaseModel):
"""Root configuration for LiteLLM integration.
Example YAML:
version: 1
default_provider: openai
providers:
openai:
api_key: ${OPENAI_API_KEY}
api_base: https://api.openai.com/v1
anthropic:
api_key: ${ANTHROPIC_API_KEY}
llm_models:
default:
provider: openai
model: gpt-4
fast:
provider: openai
model: gpt-3.5-turbo
embedding_models:
default:
provider: openai
model: text-embedding-3-small
dimensions: 1536
"""
version: int = 1
default_provider: str = "openai"
providers: dict[str, ProviderConfig] = Field(default_factory=dict)
llm_models: dict[str, LLMModelConfig] = Field(default_factory=dict)
embedding_models: dict[str, EmbeddingModelConfig] = Field(default_factory=dict)
model_config = {"extra": "allow"}
def get_llm_model(self, model: str = "default") -> LLMModelConfig:
"""Get LLM model configuration by name.
Args:
model: Model name or "default"
Returns:
LLM model configuration
Raises:
ValueError: If model not found
"""
if model not in self.llm_models:
raise ValueError(
f"LLM model '{model}' not found in configuration. "
f"Available models: {list(self.llm_models.keys())}"
)
return self.llm_models[model]
def get_embedding_model(self, model: str = "default") -> EmbeddingModelConfig:
"""Get embedding model configuration by name.
Args:
model: Model name or "default"
Returns:
Embedding model configuration
Raises:
ValueError: If model not found
"""
if model not in self.embedding_models:
raise ValueError(
f"Embedding model '{model}' not found in configuration. "
f"Available models: {list(self.embedding_models.keys())}"
)
return self.embedding_models[model]
def get_provider(self, provider: str) -> ProviderConfig:
"""Get provider configuration by name.
Args:
provider: Provider name
Returns:
Provider configuration
Raises:
ValueError: If provider not found
"""
if provider not in self.providers:
raise ValueError(
f"Provider '{provider}' not found in configuration. "
f"Available providers: {list(self.providers.keys())}"
)
return self.providers[provider]
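As a sketch of how these models are exercised, they validate plain dictionaries and the lookup helpers raise `ValueError` for unknown names (values here are illustrative):
```python
from ccw_litellm.config.models import LiteLLMConfig

cfg = LiteLLMConfig.model_validate({
    "providers": {"openai": {"api_key": "sk-test"}},
    "llm_models": {"default": {"provider": "openai", "model": "gpt-4o"}},
    "embedding_models": {
        "default": {"provider": "openai", "model": "text-embedding-3-small", "dimensions": 1536},
    },
})

print(cfg.get_llm_model("default").model)    # gpt-4o
print(cfg.get_embedding_model().dimensions)  # 1536
# cfg.get_provider("anthropic") would raise ValueError listing the configured providers
```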


@@ -0,0 +1,14 @@
"""Abstract interfaces for ccw-litellm."""
from __future__ import annotations
from .embedder import AbstractEmbedder
from .llm import AbstractLLMClient, ChatMessage, LLMResponse
__all__ = [
"AbstractEmbedder",
"AbstractLLMClient",
"ChatMessage",
"LLMResponse",
]


@@ -0,0 +1,52 @@
from __future__ import annotations
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Sequence
import numpy as np
from numpy.typing import NDArray
class AbstractEmbedder(ABC):
"""Embedding interface compatible with fastembed-style embedders.
Implementers only need to provide the synchronous `embed` method; an
asynchronous `aembed` wrapper is provided for convenience.
"""
@property
@abstractmethod
def dimensions(self) -> int:
"""Embedding vector size."""
@abstractmethod
def embed(
self,
texts: str | Sequence[str],
*,
batch_size: int | None = None,
**kwargs: Any,
) -> NDArray[np.floating]:
"""Embed one or more texts.
Returns:
A numpy array of shape (n_texts, dimensions).
"""
async def aembed(
self,
texts: str | Sequence[str],
*,
batch_size: int | None = None,
**kwargs: Any,
) -> NDArray[np.floating]:
"""Async wrapper around `embed` using a worker thread by default."""
return await asyncio.to_thread(
self.embed,
texts,
batch_size=batch_size,
**kwargs,
)


@@ -0,0 +1,45 @@
from __future__ import annotations
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Literal, Sequence
@dataclass(frozen=True, slots=True)
class ChatMessage:
role: Literal["system", "user", "assistant", "tool"]
content: str
@dataclass(frozen=True, slots=True)
class LLMResponse:
content: str
raw: Any | None = None
class AbstractLLMClient(ABC):
"""LiteLLM-like client interface.
Implementers only need to provide synchronous methods; async wrappers are
provided via `asyncio.to_thread`.
"""
@abstractmethod
def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
"""Chat completion for a sequence of messages."""
@abstractmethod
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
"""Text completion for a prompt."""
async def achat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
"""Async wrapper around `chat` using a worker thread by default."""
return await asyncio.to_thread(self.chat, messages, **kwargs)
async def acomplete(self, prompt: str, **kwargs: Any) -> LLMResponse:
"""Async wrapper around `complete` using a worker thread by default."""
return await asyncio.to_thread(self.complete, prompt, **kwargs)


@@ -0,0 +1,11 @@
from __future__ import annotations
import sys
from pathlib import Path
def pytest_configure() -> None:
project_root = Path(__file__).resolve().parents[1]
src_dir = project_root / "src"
sys.path.insert(0, str(src_dir))


@@ -0,0 +1,64 @@
from __future__ import annotations
import asyncio
from typing import Any, Sequence
import numpy as np
from ccw_litellm.interfaces import AbstractEmbedder, AbstractLLMClient, ChatMessage, LLMResponse
class _DummyEmbedder(AbstractEmbedder):
@property
def dimensions(self) -> int:
return 3
def embed(
self,
texts: str | Sequence[str],
*,
batch_size: int | None = None,
**kwargs: Any,
) -> np.ndarray:
if isinstance(texts, str):
texts = [texts]
_ = batch_size
_ = kwargs
return np.zeros((len(texts), self.dimensions), dtype=np.float32)
class _DummyLLM(AbstractLLMClient):
def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> LLMResponse:
_ = kwargs
return LLMResponse(content="".join(m.content for m in messages))
def complete(self, prompt: str, **kwargs: Any) -> LLMResponse:
_ = kwargs
return LLMResponse(content=prompt)
def test_embed_sync_shape_and_dtype() -> None:
emb = _DummyEmbedder()
out = emb.embed(["a", "b"])
assert out.shape == (2, 3)
assert out.dtype == np.float32
def test_embed_async_wrapper() -> None:
emb = _DummyEmbedder()
out = asyncio.run(emb.aembed("x"))
assert out.shape == (1, 3)
def test_llm_sync() -> None:
llm = _DummyLLM()
out = llm.chat([ChatMessage(role="user", content="hi")])
assert out == LLMResponse(content="hi")
def test_llm_async_wrappers() -> None:
llm = _DummyLLM()
out1 = asyncio.run(llm.achat([ChatMessage(role="user", content="a")]))
out2 = asyncio.run(llm.acomplete("b"))
assert out1.content == "a"
assert out2.content == "b"