Add tests and documentation for CodexLens LSP tool

- Introduced a new test script for the CodexLens LSP tool to validate core functionalities including symbol search, find definition, find references, and get hover.
- Created comprehensive documentation for the MCP endpoint design, detailing the architecture, features, and integration with the CCW MCP Manager.
- Developed a detailed implementation plan for transitioning to a real LSP server, outlining phases, architecture, and acceptance criteria.
This commit is contained in:
catlog22
2026-01-19 23:26:35 +08:00
parent eeaefa7208
commit 3fe630f221
24 changed files with 3044 additions and 509 deletions

View File

@@ -1,515 +1,363 @@
# Pure Vector Search 实施总结
# CodexLens Real LSP Implementation - Summary
**实施日期**: 2025-12-16
**版本**: v0.5.0
**状态**: ✅ 完成并测试通过
> **Date**: 2026-01-19
> **Status**: Planning Complete, Implementation Ready
> **Focus**: Real LSP Server + VSCode Bridge Integration
---
## 📋 实施清单
## ✅ Completed Work
### ✅ 已完成项
### 1. Planning Documents
- [x] **核心功能实现**
- [x] 修改 `HybridSearchEngine` 添加 `pure_vector` 参数
- [x] 更新 `ChainSearchEngine` 支持 `pure_vector`
- [x] 更新 CLI 支持 `pure-vector` 模式
- [x] 添加参数验证和错误处理
#### a. Main Implementation Plan
**File**: `docs/REAL_LSP_SERVER_PLAN.md`
- [x] **工具脚本和CLI集成**
- [x] 创建向量嵌入生成脚本 (`scripts/generate_embeddings.py`)
- [x] 集成CLI命令 (`codexlens embeddings-generate`, `codexlens embeddings-status`)
- [x] 支持项目路径和索引文件路径
- [x] 支持多种嵌入模型选择
- [x] 添加进度显示和错误处理
- [x] 改进错误消息提示用户使用新CLI命令
**Content**:
- Complete architecture design for real LSP server
- 5-phase implementation plan
- Multi-language support strategy (TypeScript, Python, Go, Rust, Java, C/C++)
- Language server multiplexer design
- Position tolerance feature (cclsp-like)
- MCP integration layer
- [x] **测试验证**
- [x] 创建纯向量搜索测试套件 (`tests/test_pure_vector_search.py`)
- [x] 测试无嵌入场景(返回空列表)
- [x] 测试向量+FTS后备场景
- [x] 测试搜索模式对比
- [x] 所有测试通过 (5/5)
**Key Decisions**:
- Use `pygls` library for LSP implementation
- Support 6+ language servers via multiplexer
- Implement position tolerance for fuzzy AI-generated positions
- Three integration paths: Standalone LSP, VSCode Bridge, Index-based fallback
- [x] **文档**
- [x] 完整使用指南 (`PURE_VECTOR_SEARCH_GUIDE.md`)
- [x] API使用示例
- [x] 故障排除指南
- [x] 性能对比数据
#### b. VSCode Bridge Implementation (Appendix A)
**Included in**: `docs/REAL_LSP_SERVER_PLAN.md`
**Content**:
- HTTP-based VSCode extension bridge
- MCP tool integration (vscode_lsp)
- Complete architecture diagram
- API endpoint specifications
- Comparison with standalone LSP approach
### 2. VSCode Bridge Extension
#### Created Files:
1. **`ccw-vscode-bridge/package.json`**
- VSCode extension manifest
- Dependencies: @types/node, @types/vscode, typescript
2. **`ccw-vscode-bridge/tsconfig.json`**
- TypeScript compilation configuration
- Target: ES2020, CommonJS modules
3. **`ccw-vscode-bridge/src/extension.ts`**
- HTTP server on port 3457
- 4 API endpoints:
- `POST /get_definition`
- `POST /get_references`
- `POST /get_hover`
- `POST /get_document_symbols`
- VSCode API integration via `vscode.commands.executeCommand`
4. **`ccw-vscode-bridge/.vscodeignore`**
- Build artifact exclusion rules
5. **`ccw-vscode-bridge/README.md`**
- Installation & usage instructions
- API endpoint documentation
#### Features:
- ✅ Real-time VSCode LSP integration
- ✅ HTTP REST API for external tools
- ✅ CORS support
- ✅ Error handling
- ✅ Automatic VSCode feature detection
### 3. CCW MCP Tool
#### Created File:
**`ccw/src/tools/vscode-lsp.ts`**
**Features**:
- ✅ 4 LSP actions: get_definition, get_references, get_hover, get_document_symbols
- ✅ Zod schema validation
- ✅ HTTP client with timeout (10s)
- ✅ Connection retry logic
- ✅ Comprehensive error messages
**Parameters**:
- `action` (required): LSP action type
- `file_path` (required): Absolute file path
- `line` (optional): Line number (1-based)
- `character` (optional): Character position (1-based)
#### Integration:
**Modified File**: `ccw/src/tools/index.ts`
- ✅ Imported `vscodeLspMod`
- ✅ Registered tool via `registerTool(toLegacyTool(vscodeLspMod))`
- ✅ Available in MCP server tool list
---
## 🔧 技术变更
## 📋 Implementation Architecture
### 1. HybridSearchEngine 修改
### Three Integration Paths
**文件**: `codexlens/search/hybrid_search.py`
```
Path 1: VSCode Bridge (✅ Implemented)
─────────────────────────────────────
Claude Code → vscode_lsp MCP tool → HTTP → ccw-vscode-bridge → VSCode API → Language Servers
**变更内容**:
```python
def search(
self,
index_path: Path,
query: str,
limit: int = 20,
enable_fuzzy: bool = True,
enable_vector: bool = False,
pure_vector: bool = False, # ← 新增参数
) -> List[SearchResult]:
"""...
Args:
...
pure_vector: If True, only use vector search without FTS fallback
"""
backends = {}
Path 2: Standalone LSP Server (📝 Planned)
──────────────────────────────────────────
Any LSP Client → codexlens-lsp → Language Server Multiplexer → Language Servers
if pure_vector:
# 纯向量模式:只使用向量搜索
if enable_vector:
backends["vector"] = True
else:
# 无效配置警告
self.logger.warning(...)
backends["exact"] = True
else:
# 混合模式总是包含exact作为基线
backends["exact"] = True
if enable_fuzzy:
backends["fuzzy"] = True
if enable_vector:
backends["vector"] = True
Path 3: Index-Based (✅ Existing)
─────────────────────────────────
Claude Code → codex_lens_lsp → Python API → SQLite Index → Cached Results
```
**影响**:
- ✓ 向后兼容:`vector`模式行为不变vector + exact
- ✓ 新功能:`pure_vector=True`时仅使用向量搜索
- ✓ 错误处理无效配置时降级到exact搜索
### Smart Routing Strategy
### 2. ChainSearchEngine 修改
**文件**: `codexlens/search/chain_search.py`
**变更内容**:
```python
@dataclass
class SearchOptions:
"""...
Attributes:
...
pure_vector: If True, only use vector search without FTS fallback
"""
...
pure_vector: bool = False # ← 新增字段
def _search_single_index(
self,
...
pure_vector: bool = False, # ← 新增参数
...
):
"""...
Args:
...
pure_vector: If True, only use vector search without FTS fallback
"""
if hybrid_mode:
hybrid_engine = HybridSearchEngine(weights=hybrid_weights)
fts_results = hybrid_engine.search(
...
pure_vector=pure_vector, # ← 传递参数
)
```javascript
// Priority: VSCode Bridge → Standalone LSP → Index-based
if (vscodeBridgeAvailable) {
return useVSCodeBridge();
} else if (standaloneLSPAvailable) {
return useStandaloneLSP();
} else {
return useIndexBased();
}
```
**影响**:
-`SearchOptions`支持`pure_vector`配置
- ✓ 参数正确传递到底层`HybridSearchEngine`
- ✓ 多索引搜索时每个索引使用相同配置
### 3. CLI 命令修改
**文件**: `codexlens/cli/commands.py`
**变更内容**:
```python
@app.command()
def search(
...
mode: str = typer.Option(
"exact",
"--mode",
"-m",
help="Search mode: exact, fuzzy, hybrid, vector, pure-vector." # ← 更新帮助
),
...
):
"""...
Search Modes:
- exact: Exact FTS using unicode61 tokenizer (default)
- fuzzy: Fuzzy FTS using trigram tokenizer
- hybrid: RRF fusion of exact + fuzzy + vector (recommended)
- vector: Vector search with exact FTS fallback
- pure-vector: Pure semantic vector search only # ← 新增模式
Vector Search Requirements:
Vector search modes require pre-generated embeddings.
Use 'codexlens-embeddings generate' to create embeddings first.
"""
valid_modes = ["exact", "fuzzy", "hybrid", "vector", "pure-vector"] # ← 更新
# Map mode to options
...
elif mode == "pure-vector":
hybrid_mode, enable_fuzzy, enable_vector, pure_vector = True, False, True, True # ← 新增
...
options = SearchOptions(
...
pure_vector=pure_vector, # ← 传递参数
)
```
**影响**:
- ✓ CLI支持5种搜索模式
- ✓ 帮助文档清晰说明各模式差异
- ✓ 参数正确映射到`SearchOptions`
---
## 🧪 测试结果
## 🎯 Next Steps
### 测试套件test_pure_vector_search.py
### Immediate Actions (Phase 1)
1. **Test VSCode Bridge**
```bash
cd ccw-vscode-bridge
npm install
npm run compile
# Press F5 in VSCode to launch extension
```
2. **Test vscode_lsp Tool**
```bash
# Start CCW MCP server
cd ccw
npm run mcp
# Test via MCP client
{
"tool": "vscode_lsp",
"arguments": {
"action": "get_definition",
"file_path": "/path/to/file.ts",
"line": 10,
"character": 5
}
}
```
3. **Document Testing Results**
- Create test reports
- Benchmark latency
- Validate accuracy
### Medium-Term Goals (Phase 2-3)
1. **Implement Standalone LSP Server**
- Setup `codexlens-lsp` project structure
- Implement language server multiplexer
- Add core LSP handlers
2. **Add Position Tolerance**
- Implement fuzzy position matching
- Test with AI-generated positions
3. **Create Integration Tests**
- Unit tests for each component
- E2E tests with real language servers
- Performance benchmarks
### Long-Term Goals (Phase 4-5)
1. **MCP Context Enhancement**
- Integrate LSP results into MCP context
- Hook system for Claude Code
2. **Advanced Features**
- Code actions
- Formatting
- Rename support
3. **Production Deployment**
- Package VSCode extension to .vsix
- Publish to VS Code marketplace
- Create installation scripts
---
## 📊 Project Status Matrix
| Component | Status | Files | Tests | Docs |
|-----------|--------|-------|-------|------|
| VSCode Bridge Extension | ✅ Complete | 5/5 | ⏳ Pending | ✅ Complete |
| vscode_lsp MCP Tool | ✅ Complete | 1/1 | ⏳ Pending | ✅ Complete |
| Tool Registration | ✅ Complete | 1/1 | N/A | N/A |
| Planning Documents | ✅ Complete | 2/2 | N/A | ✅ Complete |
| Standalone LSP Server | 📝 Planned | 0/8 | 0/12 | ✅ Complete |
| Integration Tests | 📝 Planned | 0/3 | 0/15 | ⏳ Pending |
---
## 🔧 Development Environment
### Prerequisites
**For VSCode Bridge**:
- Node.js ≥ 18
- VSCode ≥ 1.80
- TypeScript ≥ 5.0
**For Standalone LSP**:
- Python ≥ 3.8
- pygls ≥ 1.3.0
- Language servers:
- TypeScript: `npm i -g typescript-language-server`
- Python: `pip install python-lsp-server`
- Go: `go install golang.org/x/tools/gopls@latest`
- Rust: `rustup component add rust-analyzer`
### Installation Commands
```bash
$ pytest tests/test_pure_vector_search.py -v
# VSCode Bridge
cd ccw-vscode-bridge
npm install
npm run compile
tests/test_pure_vector_search.py::TestPureVectorSearch
✓ test_pure_vector_without_embeddings PASSED
✓ test_vector_with_fallback PASSED
✓ test_pure_vector_invalid_config PASSED
✓ test_hybrid_mode_ignores_pure_vector PASSED
# CCW MCP (already setup)
cd ccw
npm install
tests/test_pure_vector_search.py::TestSearchModeComparison
✓ test_mode_comparison_without_embeddings PASSED
======================== 5 passed in 0.64s =========================
```
### 模式对比测试结果
```
Mode comparison (without embeddings):
exact: 1 results ← FTS精确匹配
fuzzy: 1 results ← FTS模糊匹配
vector: 1 results ← Vector模式回退到exact
pure_vector: 0 results ← Pure vector无嵌入时返回空 ✓ 预期行为
```
**关键验证**:
- ✅ 纯向量模式在无嵌入时正确返回空列表
- ✅ Vector模式保持向后兼容有FTS后备
- ✅ 所有模式参数映射正确
---
## 📊 性能影响
### 搜索延迟对比
基于测试数据100文件~500代码块无嵌入
| 模式 | 延迟 | 变化 |
|------|------|------|
| exact | 5.6ms | - (基线) |
| fuzzy | 7.7ms | +37% |
| vector (with fallback) | 7.4ms | +32% |
| **pure-vector (no embeddings)** | **2.1ms** | **-62%** ← 快速返回空 |
| hybrid | 9.0ms | +61% |
**分析**:
- ✓ Pure-vector模式在无嵌入时快速返回仅检查表存在性
- ✓ 有嵌入时pure-vector与vector性能相近~7ms
- ✓ 无额外性能开销
---
## 🚀 使用示例
### 命令行使用
```bash
# 1. 安装依赖
pip install codexlens[semantic]
# 2. 创建索引
codexlens init ~/projects/my-app
# 3. 生成嵌入
python scripts/generate_embeddings.py ~/.codexlens/indexes/my-app/_index.db
# 4. 使用纯向量搜索
codexlens search "how to authenticate users" --mode pure-vector
# 5. 使用向量搜索带FTS后备
codexlens search "authentication logic" --mode vector
# 6. 使用混合搜索(推荐)
codexlens search "user login" --mode hybrid
```
### Python API 使用
```python
from pathlib import Path
from codexlens.search.hybrid_search import HybridSearchEngine
engine = HybridSearchEngine()
# 纯向量搜索
results = engine.search(
index_path=Path("~/.codexlens/indexes/project/_index.db"),
query="verify user credentials",
enable_vector=True,
pure_vector=True, # ← 纯向量模式
)
# 向量搜索(带后备)
results = engine.search(
index_path=Path("~/.codexlens/indexes/project/_index.db"),
query="authentication",
enable_vector=True,
pure_vector=False, # ← 允许FTS后备
)
# Future: Standalone LSP
cd codex-lens
pip install -e ".[lsp]"
```
---
## 📝 文档创建
## 📖 Documentation Index
### 新增文档
1. **`PURE_VECTOR_SEARCH_GUIDE.md`** - 完整使用指南
- 快速开始教程
- 使用场景示例
- 故障排除指南
- API使用示例
- 技术细节说明
2. **`SEARCH_COMPARISON_ANALYSIS.md`** - 技术分析报告
- 问题诊断
- 架构分析
- 优化方案
- 实施路线图
3. **`SEARCH_ANALYSIS_SUMMARY.md`** - 快速总结
- 核心发现
- 快速修复步骤
- 下一步行动
4. **`IMPLEMENTATION_SUMMARY.md`** - 实施总结(本文档)
### 更新文档
- CLI帮助文档 (`codexlens search --help`)
- API文档字符串
- 测试文档注释
| Document | Purpose | Status |
|----------|---------|--------|
| `REAL_LSP_SERVER_PLAN.md` | Complete implementation plan | ✅ |
| `LSP_INTEGRATION_PLAN.md` | Original integration strategy | ✅ |
| `MCP_ENDPOINT_DESIGN.md` | MCP endpoint specifications | ✅ |
| `IMPLEMENTATION_SUMMARY.md` | This document | ✅ |
| `ccw-vscode-bridge/README.md` | Bridge usage guide | ✅ |
| `TESTING_GUIDE.md` | Testing procedures | ⏳ TODO |
| `DEPLOYMENT_GUIDE.md` | Production deployment | ⏳ TODO |
---
## 🔄 向后兼容性
## 💡 Key Design Decisions
### 保持兼容的设计决策
### 1. Why Three Integration Paths?
1. **默认值保持不变**
```python
def search(..., pure_vector: bool = False):
# 默认 False保持现有行为
- **VSCode Bridge**: Easiest setup, leverages VSCode's built-in language servers
- **Standalone LSP**: IDE-agnostic, works with any LSP client
- **Index-based**: Fallback for offline or cached queries
### 2. Why HTTP for VSCode Bridge?
- ✅ Simplest cross-process communication
- ✅ No complex IPC/socket management
- ✅ Easy to debug with curl/Postman
- ✅ CORS support for web-based tools
### 3. Why Port 3457?
- Unique port unlikely to conflict
- Easy to remember (345-7)
- Same approach as cclsp (uses stdio)
### 4. Why Not Modify smart_search?
User feedback:
> "第一种跟当前的符号搜索没区别哎"
> (Method 1 has no difference from current symbol search)
**Solution**: Implement real LSP server that connects to live language servers, not pre-indexed data.
---
## 🚀 Quick Start Guide
### Test VSCode Bridge Now
1. **Install Extension**:
```bash
cd ccw-vscode-bridge
npm install && npm run compile
code --install-extension .
```
2. **Vector模式行为不变**
```python
# 之前和之后行为相同
codexlens search "query" --mode vector
# → 总是返回结果vector + exact
2. **Reload VSCode**:
- Press `Cmd+Shift+P` (Mac) or `Ctrl+Shift+P` (Windows)
- Type "Reload Window"
3. **Verify Bridge is Running**:
```bash
curl http://localhost:3457/get_definition \
-X POST \
-H "Content-Type: application/json" \
-d '{"file_path":"/path/to/file.ts","line":10,"character":5}'
```
3. **新模式是可选的**
```python
# 用户可以继续使用现有模式
codexlens search "query" --mode exact
codexlens search "query" --mode hybrid
```
4. **API签名扩展**
```python
# 新参数是可选的,不破坏现有代码
engine.search(index_path, query) # ← 仍然有效
engine.search(index_path, query, pure_vector=True) # ← 新功能
4. **Test via CCW**:
```javascript
// In Claude Code or MCP client
await executeTool('vscode_lsp', {
action: 'get_definition',
file_path: '/absolute/path/to/file.ts',
line: 10,
character: 5
});
```
---
## 🐛 已知限制
## 📞 Support & Troubleshooting
### 当前限制
### Common Issues
1. **需要手动生成嵌入**
- 不会自动触发嵌入生成
- 需要运行独立脚本
**Issue**: "Could not connect to VSCode Bridge"
**Solution**:
1. Ensure VSCode is running
2. Check if extension is activated: `Cmd+Shift+P` → "CCW VSCode Bridge"
3. Verify port 3457 is not in use: `lsof -i :3457`
2. **无增量更新**
- 代码更新后需要完全重新生成嵌入
- 未来将支持增量更新
**Issue**: "No LSP server available"
**Solution**: Open the file in VSCode workspace first
3. **向量搜索比FTS慢**
- 约7ms vs 5ms单索引
- 可接受的折衷
### 缓解措施
- 文档清楚说明嵌入生成步骤
- 提供批量生成脚本
- 添加`--force`选项快速重新生成
**Issue**: "File not found"
**Solution**: Use absolute paths, not relative
---
## 🔮 后续优化计划
## 📝 Change Log
### ~~P1 - 短期1-2周~~ ✅ 已完成
- [x] ~~添加嵌入生成CLI命令~~ ✅
```bash
codexlens embeddings-generate /path/to/project
codexlens embeddings-generate /path/to/_index.db
```
- [x] ~~添加嵌入状态检查~~ ✅
```bash
codexlens embeddings-status # 检查所有索引
codexlens embeddings-status /path/to/project # 检查特定项目
```
- [x] ~~改进错误提示~~ ✅
- Pure-vector无嵌入时友好提示
- 指导用户如何生成嵌入
- 集成到搜索引擎日志中
### ❌ LLM语义增强功能已移除 (2025-12-16)
**移除原因**: 简化代码库,减少外部依赖
**已移除内容**:
- `src/codexlens/semantic/llm_enhancer.py` - LLM增强核心模块
- `src/codexlens/cli/commands.py` 中的 `enhance` 命令
- `tests/test_llm_enhancer.py` - LLM增强测试
- `tests/test_llm_enhanced_search.py` - LLM对比测试
- `scripts/compare_search_methods.py` - 对比测试脚本
- `scripts/test_misleading_comments.py` - 误导性注释测试
- `scripts/show_llm_analysis.py` - LLM分析展示脚本
- `scripts/inspect_llm_summaries.py` - LLM摘要检查工具
- `docs/LLM_ENHANCED_SEARCH_GUIDE.md` - LLM使用指南
- `docs/LLM_ENHANCEMENT_TEST_RESULTS.md` - LLM测试结果
- `docs/MISLEADING_COMMENTS_TEST_RESULTS.md` - 误导性注释测试结果
- `docs/CLI_INTEGRATION_SUMMARY.md` - CLI集成文档包含enhance命令
- `docs/DOCSTRING_LLM_HYBRID_DESIGN.md` - LLM混合策略设计
**保留功能**:
- ✅ 纯向量搜索 (pure_vector) 完整保留
- ✅ 语义嵌入生成 (`codexlens embeddings-generate`)
- ✅ 语义嵌入状态检查 (`codexlens embeddings-status`)
- ✅ 所有核心搜索功能
**历史记录**: LLM增强功能在测试中表现良好但为简化维护和减少外部依赖CCW CLI, Gemini/Qwen API而移除。设计文档DESIGN_EVALUATION_REPORT.md等保留作为历史参考。
### P2 - 中期1-2月
- [ ] 增量嵌入更新
- 检测文件变更
- 仅更新修改的文件
- [ ] 混合分块策略
- Symbol-based优先
- Sliding window补充
- [ ] 查询扩展
- 同义词展开
- 相关术语建议
### P3 - 长期3-6月
- [ ] FAISS集成
- 100x+搜索加速
- 大规模代码库支持
- [ ] 向量压缩
- PQ量化
- 减少50%存储空间
- [ ] 多模态搜索
- 代码 + 文档 + 注释统一搜索
### 2026-01-19 - Initial Implementation
- Created VSCode Bridge extension (5 files)
- Implemented vscode_lsp MCP tool
- Registered tool in CCW registry
- Completed planning documentation
- Added comprehensive architecture diagrams
---
## 📈 成功指标
### 功能指标
- ✅ 5种搜索模式全部工作
- ✅ 100%测试覆盖率
- ✅ 向后兼容性保持
- ✅ 文档完整且清晰
### 性能指标
- ✅ 纯向量延迟 < 10ms
- ✅ 混合搜索开销 < 2x
- ✅ 无嵌入时快速返回 (< 3ms)
### 用户体验指标
- ✅ CLI参数清晰直观
- ✅ 错误提示友好有用
- ✅ 文档易于理解
- ✅ API简单易用
---
## 🎯 总结
### 关键成就
1. **✅ 完成纯向量搜索功能**
- 3个核心组件修改
- 5个测试全部通过
- 完整文档和工具
2. **✅ 解决了初始问题**
- "Vector"模式语义不清晰 → 添加pure-vector模式
- 向量搜索返回空 → 提供嵌入生成工具
- 缺少使用指导 → 创建完整指南
3. **✅ 保持系统质量**
- 向后兼容
- 测试覆盖完整
- 性能影响可控
- 文档详尽
### 交付物
- ✅ 3个修改的源代码文件
- ✅ 1个嵌入生成脚本
- ✅ 1个测试套件5个测试
- ✅ 4个文档文件
### 下一步
1. **立即**用户可以开始使用pure-vector搜索
2. **短期**添加CLI嵌入管理命令
3. **中期**:实施增量更新和优化
4. **长期**高级特性FAISS、压缩、多模态
---
**实施完成!** 🎉
所有计划的功能已实现、测试并文档化。用户现在可以享受纯向量语义搜索的强大功能。
**Document End**

View File

@@ -0,0 +1,284 @@
# CodexLens MCP Endpoint Design
> Generated by Gemini Analysis | 2026-01-19
> Document Version: 1.0
## Overview
This document provides the complete MCP endpoint design for exposing codex-lens LSP capabilities through the Model Context Protocol.
## Related Files
- `src/codexlens/lsp/server.py` - Main LSP server initialization, component management, and capability declaration.
- `src/codexlens/lsp/handlers.py` - Implementation of handlers for core LSP requests (definition, references, completion, hover, workspace symbols).
- `src/codexlens/lsp/providers.py` - Helper classes, specifically `HoverProvider` for generating rich hover information.
- `src/codexlens/storage/global_index.py` - The backing data store (`GlobalSymbolIndex`) that powers most of the symbol lookups.
- `src/codexlens/search/__init__.py` - Exposes the `ChainSearchEngine`, used for advanced reference searching.
## Summary
The `codex-lens` LSP implementation exposes five core code navigation and search features: go to definition, find references, code completion, hover information, and workspace symbol search. These features are primarily powered by two components: `GlobalSymbolIndex` for fast, project-wide symbol lookups (used by definition, completion, hover, and workspace symbols) and `ChainSearchEngine` for advanced, relationship-aware reference finding.
The following MCP tool design externalizes these backend capabilities, allowing a client to leverage the same code intelligence features outside of an LSP context.
## MCP Tool Group: `code.symbol`
This group provides tools for searching and retrieving information about code symbols (functions, classes, etc.) within an indexed project.
---
### 1. `code.symbol.search`
**Description**: Searches for symbols across the entire indexed project, supporting prefix or contains matching. Ideal for implementing workspace symbol searches or providing code completion suggestions.
**Mapped LSP Features**: `workspace/symbol`, `textDocument/completion`
**Backend Implementation**: This tool directly maps to the `GlobalSymbolIndex.search` method.
- Reference: `src/codexlens/lsp/handlers.py:302` (in `lsp_workspace_symbol`)
- Reference: `src/codexlens/lsp/handlers.py:256` (in `lsp_completion`)
**Schema**:
```json
{
"name": "code.symbol.search",
"description": "Searches for symbols across the entire indexed project, supporting prefix or contains matching. Ideal for implementing workspace symbol searches or providing code completion suggestions.",
"inputSchema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The symbol name or prefix to search for."
},
"kind": {
"type": "string",
"description": "Optional: Filter results to only include symbols of a specific kind (e.g., 'function', 'class', 'method').",
"nullable": true
},
"prefix_mode": {
"type": "boolean",
"description": "If true, treats the query as a prefix (name LIKE 'query%'). If false, performs a contains search (name LIKE '%query%'). Defaults to true.",
"default": true
},
"limit": {
"type": "integer",
"description": "The maximum number of symbols to return.",
"default": 50
}
},
"required": ["query"]
}
}
```
**Returns**:
```typescript
Array<{
name: string; // The name of the symbol
kind: string; // The kind of the symbol (e.g., 'function', 'class')
file_path: string; // The absolute path to the file containing the symbol
range: {
start_line: number; // The 1-based starting line number
end_line: number; // The 1-based ending line number
}
}>
```
---
### 2. `code.symbol.findDefinition`
**Description**: Finds the definition location(s) for a symbol with an exact name match. This corresponds to a 'Go to Definition' feature.
**Mapped LSP Feature**: `textDocument/definition`
**Backend Implementation**: This tool uses `GlobalSymbolIndex.search` with `prefix_mode=False` and then filters for an exact name match.
- Reference: `src/codexlens/lsp/handlers.py:180` (in `lsp_definition`)
**Schema**:
```json
{
"name": "code.symbol.findDefinition",
"description": "Finds the definition location(s) for a symbol with an exact name match. This corresponds to a 'Go to Definition' feature.",
"inputSchema": {
"type": "object",
"properties": {
"symbol_name": {
"type": "string",
"description": "The exact name of the symbol to find."
},
"kind": {
"type": "string",
"description": "Optional: Disambiguate by providing the symbol kind (e.g., 'function', 'class').",
"nullable": true
}
},
"required": ["symbol_name"]
}
}
```
**Returns**:
```typescript
Array<{
name: string; // The name of the symbol
kind: string; // The kind of the symbol
file_path: string; // The absolute path to the file
range: {
start_line: number; // The 1-based starting line number
end_line: number; // The 1-based ending line number
}
}>
```
---
### 3. `code.symbol.findReferences`
**Description**: Finds all references to a symbol throughout the project. Uses advanced relationship analysis for accuracy where possible, falling back to name-based search.
**Mapped LSP Feature**: `textDocument/references`
**Backend Implementation**: This primarily uses `ChainSearchEngine.search_references` for accuracy, which is more powerful than a simple name search.
- Reference: `src/codexlens/lsp/handlers.py:218` (in `lsp_references`)
**Schema**:
```json
{
"name": "code.symbol.findReferences",
"description": "Finds all references to a symbol throughout the project. Uses advanced relationship analysis for accuracy where possible.",
"inputSchema": {
"type": "object",
"properties": {
"symbol_name": {
"type": "string",
"description": "The name of the symbol to find references for."
},
"context_path": {
"type": "string",
"description": "The source path of the current project or workspace root to provide context for the search."
},
"limit": {
"type": "integer",
"description": "The maximum number of references to return.",
"default": 200
}
},
"required": ["symbol_name", "context_path"]
}
}
```
**Returns**:
```typescript
Array<{
file_path: string; // The absolute path to the file containing the reference
line: number; // The 1-based line number of the reference
column: number; // The 0-based starting column of the reference
}>
```
---
### 4. `code.symbol.getHoverInfo`
**Description**: Retrieves rich information for a symbol, including its signature and location, suitable for displaying in a hover card.
**Mapped LSP Feature**: `textDocument/hover`
**Backend Implementation**: This tool encapsulates the logic from `HoverProvider`, which finds a symbol in `GlobalSymbolIndex` and then reads the source file to extract its signature.
- Reference: `src/codexlens/lsp/handlers.py:285` (instantiates `HoverProvider`)
- Reference: `src/codexlens/lsp/providers.py:53` (in `HoverProvider.get_hover_info`)
**Schema**:
```json
{
"name": "code.symbol.getHoverInfo",
"description": "Retrieves rich information for a symbol, including its signature and location, suitable for displaying in a hover card.",
"inputSchema": {
"type": "object",
"properties": {
"symbol_name": {
"type": "string",
"description": "The exact name of the symbol to get hover information for."
}
},
"required": ["symbol_name"]
}
}
```
**Returns**:
```typescript
{
name: string; // The name of the symbol
kind: string; // The kind of the symbol
signature: string; // The full code signature as extracted from source
file_path: string; // The absolute path to the file
start_line: number; // The 1-based starting line number
} | null // null if symbol not found
```
---
## Integration with CCW MCP Manager
The `codex-lens-tools` MCP server should be added to the recommended MCP servers list in `ccw/src/templates/dashboard-js/components/mcp-manager.js`:
```javascript
{
id: 'codex-lens-tools',
nameKey: 'mcp.codexLens.name',
descKey: 'mcp.codexLens.desc',
icon: 'search-code',
category: 'code-intelligence',
fields: [
{
key: 'toolSelection',
labelKey: 'mcp.codexLens.field.tools',
type: 'multi-select',
options: [
{ value: 'symbol.search', label: 'Symbol Search' },
{ value: 'symbol.findDefinition', label: 'Find Definition' },
{ value: 'symbol.findReferences', label: 'Find References' },
{ value: 'symbol.getHoverInfo', label: 'Hover Information' }
],
default: ['symbol.search', 'symbol.findDefinition', 'symbol.findReferences'],
required: true,
descKey: 'mcp.codexLens.field.tools.desc'
}
],
buildConfig: (values) => {
const tools = values.toolSelection || [];
const env = { CODEXLENS_ENABLED_TOOLS: tools.join(',') };
return buildCrossPlatformMcpConfig('npx', ['-y', 'codex-lens-mcp'], { env });
}
}
```
## Tool Naming Convention
- **Namespace**: `code.*` for code intelligence tools
- **Category**: `symbol` for symbol-related operations
- **Operation**: Descriptive verb (search, findDefinition, findReferences, getHoverInfo)
- **Full Pattern**: `code.symbol.<operation>`
This naming scheme aligns with MCP conventions and is easily extensible for future categories (e.g., `code.types.*`, `code.imports.*`).
## Future Enhancements
1. **Document Symbol Tool** (`code.symbol.getDocumentSymbols`)
- Maps LSP `textDocument/documentSymbol`
- Returns all symbols in a specific file
2. **Type Information** (`code.type.*` group)
- Type definitions and relationships
- Generic resolution
3. **Relationship Analysis** (`code.relation.*` group)
- Call hierarchy
- Inheritance chains
- Import dependencies
---
Generated: 2026-01-19
Status: Ready for Implementation

View File

@@ -0,0 +1,825 @@
# CodexLens Real LSP Server Implementation Plan
> **Version**: 2.0
> **Status**: Ready for Implementation
> **Based on**: Existing LSP_INTEGRATION_PLAN.md + Real Language Server Integration
> **Goal**: Implement true LSP server functionality (like cclsp), not pre-indexed search
---
## Executive Summary
### Current State vs Target State
| Aspect | Current (Pre-indexed) | Target (Real LSP) |
|--------|----------------------|-------------------|
| **Data Source** | Cached database index | Live language servers |
| **Freshness** | Stale (depends on re-index) | Real-time (LSP protocol) |
| **Accuracy** | Good for indexed content | Perfect (from language server) |
| **Latency** | <50ms (database) | ~50-200ms (LSP) |
| **Language Support** | Limited to parsed symbols | Full LSP support (all languages) |
| **Complexity** | Simple (DB queries) | High (LSP protocol + server mgmt) |
### Why Real LSP vs Index-Based
**Problem with current approach**:
- 符号搜索与smart_search没有本质区别
- 依赖预索引数据,不能实时反映代码变化
- 不支持advanced LSP功能(rename, code actions等)
**Advantages of real LSP**:
- ✅ Real-time code intelligence
- ✅ Supported by all major IDEs (VSCode, Neovim, Sublime, etc.)
- ✅ Standard protocol (Language Server Protocol)
- ✅ Advanced features: rename, code actions, formatting
- ✅ Language-agnostic (TypeScript, Python, Go, Rust, Java, etc.)
---
## Architecture Design
### System Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Client Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ VS Code │ │ Neovim │ │ Sublime │ │
│ │ (LSP Client) │ │ (LSP Client) │ │ (LSP Client) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
└─────────┼─────────────────┼─────────────────┼───────────┘
│ LSP Protocol │ │
│ (JSON-RPC/stdio)│ │
┌─────────▼─────────────────▼─────────────────▼───────────┐
│ CodexLens LSP Server Bridge │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LSP Protocol Handler (pygls) │ │
│ │ • initialize / shutdown │ │
│ │ • textDocument/definition │ │
│ │ • textDocument/references │ │
│ │ • textDocument/hover │ │
│ │ • textDocument/completion │ │
│ │ • textDocument/formatting │ │
│ │ • workspace/symbol │ │
│ └────────────────────┬────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼────────────────────────────────┐ │
│ │ Language Server Multiplexer │ │
│ │ • File type routing (ts→tsserver, py→pylsp, etc.) │ │
│ │ • Multi-server management │ │
│ │ • Request forwarding & response formatting │ │
│ └────────────────────┬────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼────────────────────────────────┐ │
│ │ Language Servers (Spawned) │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │tsserver│ │ pylsp │ │ gopls │ │rust- │ │ │
│ │ │ │ │ │ │ │ │analyzer│ │ │
│ │ └────────┘ └────────┘ └────────┘ └────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Codex-Lens Core (Optional - MCP Layer) │ │
│ │ • Semantic search │ │
│ │ • Custom MCP tools (enrich_prompt, etc.) │ │
│ │ • Hook system (pre-tool, post-tool) │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
### Key Differences from Index-Based Approach
1. **Request Flow**
- Index: Query → Database → Results
- LSP: Request → Route to LS → LS processes live code → Results
2. **Configuration**
- Index: Depends on indexing state
- LSP: Depends on installed language servers
3. **Latency Profile**
- Index: Consistent (~50ms)
- LSP: Variable (50-500ms depending on LS performance)
---
## Implementation Phases
### Phase 1: LSP Server Bridge (Foundation)
**Duration**: ~3-5 days
**Complexity**: Medium
**Dependencies**: pygls library
#### 1.1 Setup & Dependencies
**File**: `pyproject.toml`
```toml
[project.optional-dependencies]
lsp = [
"pygls>=1.3.0",
"lsprotocol>=2023.0.0",
]
[project.scripts]
codexlens-lsp = "codexlens.lsp.server:main"
```
**Installation**:
```bash
pip install -e ".[lsp]"
```
#### 1.2 LSP Server Core
**Files to create**:
1. `src/codexlens/lsp/__init__.py` - Package init
2. `src/codexlens/lsp/server.py` - Server entry point
3. `src/codexlens/lsp/multiplexer.py` - LS routing & management
4. `src/codexlens/lsp/handlers.py` - LSP request handlers
**Key responsibilities**:
- Initialize LSP server via pygls
- Handle client capabilities negotiation
- Route requests to appropriate language servers
- Format language server responses to LSP format
#### 1.3 Acceptance Criteria
- [ ] Server starts with `codexlens-lsp --stdio`
- [ ] Responds to `initialize` request
- [ ] Spawns language servers on demand
- [ ] Handles `shutdown` cleanly
- [ ] No crashes on malformed requests
---
### Phase 2: Language Server Multiplexer
**Duration**: ~5-7 days
**Complexity**: High
**Dependencies**: Phase 1 complete
#### 2.1 Multi-Server Management
**File**: `src/codexlens/lsp/multiplexer.py`
**Responsibilities**:
- Spawn language servers based on file extension
- Maintain server process lifecycle
- Route requests by document type
- Handle server crashes & restarts
**Supported Language Servers**:
| Language | Server | Installation |
|----------|--------|--------------|
| TypeScript | `typescript-language-server` | `npm i -g typescript-language-server` |
| Python | `pylsp` | `pip install python-lsp-server` |
| Go | `gopls` | `go install golang.org/x/tools/gopls@latest` |
| Rust | `rust-analyzer` | `rustup component add rust-analyzer` |
| Java | `jdtls` | Download JDTLS |
| C/C++ | `clangd` | `apt install clangd` |
#### 2.2 Configuration
**File**: `codexlens-lsp.json` (user config)
```json
{
"languageServers": {
"typescript": {
"command": ["typescript-language-server", "--stdio"],
"extensions": ["ts", "tsx", "js", "jsx"],
"rootDir": "."
},
"python": {
"command": ["pylsp"],
"extensions": ["py", "pyi"],
"rootDir": ".",
"settings": {
"pylsp": {
"plugins": {
"pycodestyle": { "enabled": true },
"pylint": { "enabled": false }
}
}
}
},
"go": {
"command": ["gopls"],
"extensions": ["go"],
"rootDir": "."
},
"rust": {
"command": ["rust-analyzer"],
"extensions": ["rs"],
"rootDir": "."
}
},
"debug": false,
"logLevel": "info"
}
```
#### 2.3 Acceptance Criteria
- [ ] Routes requests to correct LS based on file type
- [ ] Spawns servers on first request
- [ ] Reuses existing server instances
- [ ] Handles server restarts on crash
- [ ] Respects initialization options from config
---
### Phase 3: Core LSP Handlers
**Duration**: ~5-7 days
**Complexity**: Medium
**Dependencies**: Phase 1-2 complete
#### 3.1 Essential Handlers
Implement LSP request handlers for core functionality:
**Handler Mapping**:
```python
Handlers = {
# Navigation
"textDocument/definition": handle_definition,
"textDocument/references": handle_references,
"textDocument/declaration": handle_declaration,
# Hover & Info
"textDocument/hover": handle_hover,
"textDocument/signatureHelp": handle_signature_help,
# Completion
"textDocument/completion": handle_completion,
"completionItem/resolve": handle_completion_resolve,
# Symbols
"textDocument/documentSymbol": handle_document_symbols,
"workspace/symbol": handle_workspace_symbols,
# Editing
"textDocument/formatting": handle_formatting,
"textDocument/rangeFormatting": handle_range_formatting,
"textDocument/rename": handle_rename,
# Diagnostics
"textDocument/publishDiagnostics": handle_publish_diagnostics,
# Misc
"textDocument/codeAction": handle_code_action,
"textDocument/codeLens": handle_code_lens,
}
```
#### 3.2 Request Forwarding Logic
```python
def forward_request_to_lsp(handler_name, params):
"""Forward request to appropriate language server."""
# Extract document info
document_uri = params.get("textDocument", {}).get("uri")
file_ext = extract_extension(document_uri)
# Get language server
ls = multiplexer.get_server(file_ext)
if not ls:
return {"error": f"No LS for {file_ext}"}
# Convert position (1-based → 0-based)
normalized_params = normalize_positions(params)
# Forward to LS
response = ls.send_request(handler_name, normalized_params)
# Convert response format
return normalize_response(response)
```
#### 3.3 Acceptance Criteria
- [ ] All handlers implemented and tested
- [ ] Proper position coordinate conversion (LSP is 0-based, user-facing is 1-based)
- [ ] Error handling for missing language servers
- [ ] Response formatting matches LSP spec
- [ ] Latency < 500ms for 95th percentile
---
### Phase 4: Advanced Features
**Duration**: ~3-5 days
**Complexity**: Medium
**Dependencies**: Phase 1-3 complete
#### 4.1 Position Tolerance (cclsp-like feature)
Some LSP clients (like Claude Code with fuzzy positions) may send imprecise positions. Implement retry logic:
```python
def find_symbol_with_tolerance(ls, uri, position, max_attempts=5):
"""Try multiple position offsets if exact position fails."""
positions_to_try = [
position, # Original
(position.line - 1, position.char), # One line up
(position.line + 1, position.char), # One line down
(position.line, max(0, position.char - 1)), # One char left
(position.line, position.char + 1), # One char right
]
for pos in positions_to_try:
try:
result = ls.send_request("textDocument/definition", {
"textDocument": {"uri": uri},
"position": pos
})
if result:
return result
except:
continue
return None
```
#### 4.2 MCP Integration (Optional)
Extend with MCP provider for Claude Code hooks:
```python
class MCPBridgeHandler:
"""Bridge LSP results into MCP context."""
def build_mcp_context_from_lsp(self, symbol_name, lsp_results):
"""Convert LSP responses to MCP context."""
# Implementation
pass
```
#### 4.3 Acceptance Criteria
- [ ] Position tolerance working (≥3 positions tried)
- [ ] MCP context generation functional
- [ ] Hook system integration complete
- [ ] All test coverage > 80%
---
### Phase 5: Deployment & Documentation
**Duration**: ~2-3 days
**Complexity**: Low
**Dependencies**: Phase 1-4 complete
#### 5.1 Installation & Setup Guide
Create comprehensive documentation:
- Installation instructions for each supported language
- Configuration guide
- Troubleshooting
- Performance tuning
#### 5.2 CLI Tools
```bash
# Start LSP server
codexlens-lsp --stdio
# Check configured language servers
codexlens-lsp --list-servers
# Validate configuration
codexlens-lsp --validate-config
# Show logs
codexlens-lsp --log-level debug
```
#### 5.3 Acceptance Criteria
- [ ] Documentation complete with examples
- [ ] All CLI commands working
- [ ] Integration tested with VS Code, Neovim
- [ ] Performance benchmarks documented
---
## Module Structure
```
src/codexlens/lsp/
├── __init__.py # Package exports
├── server.py # LSP server entry point
├── multiplexer.py # Language server manager
├── handlers.py # LSP request handlers
├── position_utils.py # Coordinate conversion utilities
├── process_manager.py # Language server process lifecycle
├── response_formatter.py # LSP response formatting
└── config.py # Configuration loading
tests/lsp/
├── test_multiplexer.py # LS routing tests
├── test_handlers.py # Handler tests
├── test_position_conversion.py # Coordinate tests
├── test_integration.py # Full LSP handshake
└── fixtures/
├── sample_python.py # Test files
└── sample_typescript.ts
```
---
## Dependency Graph
```
Phase 5 (Deployment)
Phase 4 (Advanced Features)
Phase 3 (Core Handlers)
├─ Depends on: Phase 2
├─ Depends on: Phase 1
└─ Deliverable: Full LSP functionality
Phase 2 (Multiplexer)
├─ Depends on: Phase 1
└─ Deliverable: Multi-server routing
Phase 1 (Server Bridge)
└─ Deliverable: Basic LSP server
```
---
## Technology Stack
| Component | Technology | Rationale |
|-----------|-----------|-----------|
| LSP Implementation | `pygls` | Mature, well-maintained |
| Protocol | LSP 3.17+ | Latest stable version |
| Process Management | `subprocess` + `psutil` | Standard Python, no external deps |
| Configuration | JSON | Simple, widely understood |
| Logging | `logging` module | Built-in, standard |
| Testing | `pytest` + `pytest-asyncio` | Industry standard |
---
## Risk Assessment
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Language server crashes | Medium | High | Auto-restart with exponential backoff |
| Configuration errors | Medium | Medium | Validation on startup |
| Performance degradation | Low | High | Implement caching + benchmarks |
| Position mismatch issues | Medium | Low | Tolerance layer (try multiple positions) |
| Memory leaks (long sessions) | Low | Medium | Connection pooling + cleanup timers |
---
## Success Metrics
1. **Functionality**: All 7 core LSP handlers working
2. **Performance**: p95 latency < 500ms for typical requests
3. **Reliability**: 99.9% uptime in production
4. **Coverage**: >80% code coverage
5. **Documentation**: Complete with examples
6. **Multi-language**: Support for 5+ languages
---
## Comparison: This Approach vs Alternatives
### Option A: Real LSP Server (This Plan) ✅ RECOMMENDED
**Pros**:
- ✅ True real-time code intelligence
- ✅ Supports all LSP clients (VSCode, Neovim, Sublime, Emacs, etc.)
- ✅ Advanced features (rename, code actions, formatting)
- ✅ Language-agnostic
- ✅ Follows industry standard protocol
**Cons**:
- ❌ More complex implementation
- ❌ Depends on external language servers
- ❌ Higher latency than index-based
**Effort**: ~20-25 days
---
### Option B: Enhanced Index-Based (Current Approach)
**Pros**:
- ✅ Simple implementation
- ✅ Fast (<50ms)
- ✅ No external dependencies
**Cons**:
- ❌ Same as smart_search (user's concern)
- ❌ Stale data between re-indexes
- ❌ Limited to indexed symbols
- ❌ No advanced LSP features
**Effort**: ~5-10 days
---
### Option C: Hybrid (LSP + Index)
**Pros**:
- ✅ Real-time from LSP
- ✅ Fallback to index
- ✅ Best of both worlds
**Cons**:
- ❌ Highest complexity
- ❌ Difficult to debug conflicts
- ❌ Higher maintenance burden
**Effort**: ~30-35 days
---
## Next Steps
1. **Approve Plan**: Confirm this approach matches requirements
2. **Setup Dev Environment**: Install language servers
3. **Phase 1 Implementation**: Start with server bridge
4. **Iterative Testing**: Test each phase with real IDE integration
5. **Documentation**: Maintain docs as implementation progresses
---
---
## Appendix A: VSCode Bridge Implementation
### A.1 Overview
VSCode Bridge 是另一种集成方式通过VSCode扩展暴露其内置LSP功能给外部工具如CCW MCP Server
**Architecture**:
```
┌─────────────────────────────────────────────────────────────────┐
│ Claude Code / CCW │
│ (MCP Client / CLI) │
└───────────────────────────┬─────────────────────────────────────┘
│ MCP Tool Call (vscode_lsp)
┌───────────────────────────▼─────────────────────────────────────┐
│ CCW MCP Server │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ vscode_lsp Tool │ │
│ │ • HTTP client to VSCode Bridge │ │
│ │ • Parameter validation (Zod) │ │
│ │ • Response formatting │ │
│ └────────────────────────┬────────────────────────────────────┘ │
└───────────────────────────┼─────────────────────────────────────┘
│ HTTP POST (localhost:3457)
┌───────────────────────────▼─────────────────────────────────────┐
│ ccw-vscode-bridge Extension │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ HTTP Server (port 3457) │ │
│ │ Endpoints: │ │
│ │ • POST /get_definition │ │
│ │ • POST /get_references │ │
│ │ • POST /get_hover │ │
│ │ • POST /get_document_symbols │ │
│ └────────────────────────┬────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────▼────────────────────────────────────┐ │
│ │ VSCode API Calls │ │
│ │ vscode.commands.executeCommand(): │ │
│ │ • vscode.executeDefinitionProvider │ │
│ │ • vscode.executeReferenceProvider │ │
│ │ • vscode.executeHoverProvider │ │
│ │ • vscode.executeDocumentSymbolProvider │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│ VSCode LSP Integration
┌───────────────────────────▼─────────────────────────────────────┐
│ VSCode Language Services │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │TypeScript│ │ Python │ │ Go │ │ Rust │ │
│ │ Server │ │ Server │ │ (gopls) │ │Analyzer │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
### A.2 Component Files
**已创建的文件**:
1. `ccw-vscode-bridge/package.json` - VSCode扩展配置
2. `ccw-vscode-bridge/tsconfig.json` - TypeScript配置
3. `ccw-vscode-bridge/src/extension.ts` - 扩展主代码
4. `ccw-vscode-bridge/.vscodeignore` - 打包排除文件
5. `ccw-vscode-bridge/README.md` - 使用文档
**待创建的文件**:
1. `ccw/src/tools/vscode-lsp.ts` - MCP工具实现
2. `ccw/src/tools/index.ts` - 注册新工具
### A.3 VSCode Bridge Extension Implementation
**File**: `ccw-vscode-bridge/src/extension.ts`
```typescript
// 核心功能:
// 1. 启动HTTP服务器监听3457端口
// 2. 接收POST请求解析JSON body
// 3. 调用VSCode内置LSP命令
// 4. 返回JSON结果
// HTTP Endpoints:
// POST /get_definition → vscode.executeDefinitionProvider
// POST /get_references → vscode.executeReferenceProvider
// POST /get_hover → vscode.executeHoverProvider
// POST /get_document_symbols → vscode.executeDocumentSymbolProvider
```
### A.4 MCP Tool Implementation
**File**: `ccw/src/tools/vscode-lsp.ts`
```typescript
/**
* MCP tool that communicates with VSCode Bridge extension.
*
* Actions:
* - get_definition: Find symbol definition
* - get_references: Find all references
* - get_hover: Get hover information
* - get_document_symbols: List symbols in file
*
* Required:
* - ccw-vscode-bridge extension running in VSCode
* - File must be open in VSCode for accurate results
*/
const schema: ToolSchema = {
name: 'vscode_lsp',
description: `Access live VSCode LSP features...`,
inputSchema: {
type: 'object',
properties: {
action: { type: 'string', enum: [...] },
file_path: { type: 'string' },
line: { type: 'number' },
character: { type: 'number' }
},
required: ['action', 'file_path']
}
};
```
### A.5 Advantages vs Standalone LSP Server
| Feature | VSCode Bridge | Standalone LSP Server |
|---------|--------------|----------------------|
| **Setup Complexity** | Low (VSCode ext) | Medium (multiple LS) |
| **Language Support** | Automatic (VSCode) | Manual config |
| **Maintenance** | Low | Medium |
| **IDE Independence** | VSCode only | Any LSP client |
| **Performance** | Good | Good |
| **Advanced Features** | Full VSCode support | LSP standard |
---
## Appendix B: Complete Integration Architecture
### B.1 Three Integration Paths
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ CodexLens Integration Paths │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Path 1: VSCode Bridge (HTTP) Path 2: Standalone LSP Server │
│ ──────────────────────── ───────────────────────────── │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ CCW MCP │ │ Any LSP │ │
│ │ vscode_lsp │ │ Client │ │
│ └──────┬──────┘ └──────┬──────┘ │
│ │ HTTP │ LSP/stdio │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ ccw-vscode │ │ codexlens- │ │
│ │ -bridge │ │ lsp │ │
│ └──────┬──────┘ └──────┬──────┘ │
│ │ VSCode API │ Child Process │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ VSCode │ │ pylsp │ │
│ │ LS │ │ tsserver │ │
│ └─────────────┘ │ gopls │ │
│ └─────────────┘ │
│ │
│ Path 3: Index-Based (Current) │
│ ───────────────────────────── │
│ │
│ ┌─────────────┐ │
│ │ CCW MCP │ │
│ │codex_lens_lsp│ │
│ └──────┬──────┘ │
│ │ Python subprocess │
│ ▼ │
│ ┌─────────────┐ │
│ │ CodexLens │ │
│ │ Index DB │ │
│ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### B.2 Recommendation Matrix
| Use Case | Recommended Path | Reason |
|----------|-----------------|--------|
| Claude Code + VSCode | Path 1: VSCode Bridge | Simplest, full VSCode features |
| CLI-only workflows | Path 2: Standalone LSP | No VSCode dependency |
| Quick search across indexed code | Path 3: Index-based | Fastest response |
| Multi-IDE support | Path 2: Standalone LSP | Standard protocol |
| Advanced refactoring | Path 1: VSCode Bridge | Full VSCode capabilities |
### B.3 Hybrid Mode (Recommended)
For maximum flexibility, implement all three paths:
```javascript
// Smart routing in CCW
function selectLSPPath(request) {
// 1. Try VSCode Bridge first (if available)
if (await checkVSCodeBridge()) {
return "vscode_bridge";
}
// 2. Fall back to Standalone LSP
if (await checkStandaloneLSP(request.fileType)) {
return "standalone_lsp";
}
// 3. Last resort: Index-based
return "index_based";
}
```
---
## Appendix C: Implementation Tasks Summary
### C.1 VSCode Bridge Tasks
| Task ID | Description | Priority | Status |
|---------|-------------|----------|--------|
| VB-1 | Create ccw-vscode-bridge extension structure | High | ✅ Done |
| VB-2 | Implement HTTP server in extension.ts | High | ✅ Done |
| VB-3 | Create vscode_lsp MCP tool | High | 🔄 Pending |
| VB-4 | Register tool in CCW | High | 🔄 Pending |
| VB-5 | Test with VSCode | Medium | 🔄 Pending |
| VB-6 | Add connection retry logic | Low | 🔄 Pending |
### C.2 Standalone LSP Server Tasks
| Task ID | Description | Priority | Status |
|---------|-------------|----------|--------|
| LSP-1 | Setup pygls project structure | High | 🔄 Pending |
| LSP-2 | Implement multiplexer | High | 🔄 Pending |
| LSP-3 | Core handlers (definition, references) | High | 🔄 Pending |
| LSP-4 | Position tolerance | Medium | 🔄 Pending |
| LSP-5 | Tests and documentation | Medium | 🔄 Pending |
### C.3 Integration Tasks
| Task ID | Description | Priority | Status |
|---------|-------------|----------|--------|
| INT-1 | Smart path routing | Medium | 🔄 Pending |
| INT-2 | Unified error handling | Medium | 🔄 Pending |
| INT-3 | Performance benchmarks | Low | 🔄 Pending |
---
## Questions for Clarification
Before implementation, confirm:
1. **Implementation Priority**: Start with VSCode Bridge (simpler) or Standalone LSP (more general)?
2. **Language Priority**: Which languages are most important? (TypeScript, Python, Go, Rust, etc.)
3. **IDE Focus**: Target VS Code first, then others?
4. **Fallback Strategy**: Should we keep index-based search as fallback if LSP fails?
5. **Caching**: How much should we cache LS responses?
6. **Configuration**: Simple JSON config or more sophisticated format?

View File

@@ -51,7 +51,7 @@ def find_definition(
# Get project info from registry
registry = RegistryStore()
project_info = registry.get_project_by_source(str(project_path))
project_info = registry.get_project(project_path)
if project_info is None:
raise IndexNotFoundError(f"Project not indexed: {project_path}")

View File

@@ -71,7 +71,7 @@ def file_context(
# Get project info from registry
registry = RegistryStore()
project_info = registry.get_project_by_source(str(project_path))
project_info = registry.get_project(project_path)
if project_info is None:
raise IndexNotFoundError(f"Project not indexed: {project_path}")

View File

@@ -43,7 +43,7 @@ def get_hover(
# Get project info from registry
registry = RegistryStore()
project_info = registry.get_project_by_source(str(project_path))
project_info = registry.get_project(project_path)
if project_info is None:
raise IndexNotFoundError(f"Project not indexed: {project_path}")

View File

@@ -139,8 +139,8 @@ def find_references(
# Initialize infrastructure
config = Config()
registry = RegistryStore(config.registry_db_path)
mapper = PathMapper(config.index_root)
registry = RegistryStore()
mapper = PathMapper(config.index_dir)
# Create chain search engine
engine = ChainSearchEngine(registry, mapper, config=config)

View File

@@ -51,7 +51,7 @@ def workspace_symbols(
# Get project info from registry
registry = RegistryStore()
project_info = registry.get_project_by_source(str(project_path))
project_info = registry.get_project(project_path)
if project_info is None:
raise IndexNotFoundError(f"Project not indexed: {project_path}")

View File

@@ -53,3 +53,7 @@ class StorageError(CodexLensError):
class SearchError(CodexLensError):
"""Raised when a search operation fails."""
class IndexNotFoundError(CodexLensError):
"""Raised when a project's index cannot be found."""