fix: Fix ModelScope API routing bug causing Ollama connection errors

- Add _sanitize_text() method to handle text starting with 'import'
- The ModelScope backend incorrectly routes such text to a local Ollama endpoint
- Prepending a space bypasses the routing detection without affecting embedding quality
- Strengthen retry logic and error handling in embedding_manager.py
- Call global model locking after successful generation in commands.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: catlog22
Date: 2025-12-25 12:52:43 +08:00
Parent: 229d51cd18
Commit: 501d9a05d4
3 changed files with 72 additions and 18 deletions


@@ -89,6 +89,23 @@ class LiteLLMEmbedderWrapper(BaseEmbedder):
        # Default fallback
        return 8192

    def _sanitize_text(self, text: str) -> str:
        """Sanitize text to work around ModelScope API routing bug.

        ModelScope incorrectly routes text starting with lowercase 'import'
        to an Ollama endpoint, causing failures. This adds a leading space
        to work around the issue without affecting embedding quality.

        Args:
            text: Text to sanitize.

        Returns:
            Sanitized text safe for embedding API.
        """
        if text.startswith('import'):
            return ' ' + text
        return text

    def embed_to_numpy(self, texts: str | Iterable[str], **kwargs) -> np.ndarray:
        """Embed texts to numpy array using LiteLLMEmbedder.
@@ -104,5 +121,9 @@ class LiteLLMEmbedderWrapper(BaseEmbedder):
            texts = [texts]
        else:
            texts = list(texts)
        # Sanitize texts to avoid ModelScope routing bug
        texts = [self._sanitize_text(t) for t in texts]
        # LiteLLM handles batching internally, ignore batch_size parameter
        return self._embedder.embed(texts)
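Taken on its own, the workaround in the diff can be sketched as a standalone function (a minimal sketch for illustration; only the `startswith('import')` check and leading-space fix come from the diff above, the function name and sample inputs here are hypothetical):

```python
def sanitize_text(text: str) -> str:
    """Prepend a space when text starts with lowercase 'import', working
    around the ModelScope routing bug that sends such text to an Ollama
    endpoint. A leading space does not change the embedding meaningfully."""
    if text.startswith('import'):
        return ' ' + text
    return text

# Texts beginning with lowercase 'import' gain a leading space;
# everything else (including capitalized 'Important') passes through.
print(repr(sanitize_text("import numpy as np")))  # ' import numpy as np'
print(repr(sanitize_text("def main(): pass")))    # 'def main(): pass'
print(repr(sanitize_text("Important note")))      # 'Important note'
```

Note the check is case-sensitive, so ordinary prose like "Important note" is untouched; only code-like text starting with the lowercase keyword is rewritten.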