Memory Embedder
Bridges CCW to CodexLens semantic search by generating and searching embeddings for memory chunks.
Features
- Generate embeddings for memory chunks using CodexLens's jina-embeddings-v2-base-code (768 dim)
- Semantic search across all memory types (core_memory, workflow, cli_history)
- Status tracking to monitor embedding progress
- Batch processing for efficient embedding generation
- Restore commands included in search results
Requirements
pip install numpy "codex-lens[semantic]"
Usage
1. Check Status
python scripts/memory_embedder.py status <db_path>
Example output:
{
"total_chunks": 150,
"embedded_chunks": 100,
"pending_chunks": 50,
"by_type": {
"core_memory": {"total": 80, "embedded": 60, "pending": 20},
"workflow": {"total": 50, "embedded": 30, "pending": 20},
"cli_history": {"total": 20, "embedded": 10, "pending": 10}
}
}
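For programmatic use, the same information can be read by parsing the command's JSON output. A minimal sketch, assuming the status JSON shown above is written to stdout (the helper name and db path are illustrative):

import json
import subprocess

def memory_status(db_path: str) -> dict:
    """Run the status subcommand and parse its JSON report."""
    out = subprocess.run(
        ["python", "scripts/memory_embedder.py", "status", db_path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

status = memory_status("core_memory.db")
print(f"{status['pending_chunks']} of {status['total_chunks']} chunks still need embedding")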
2. Generate Embeddings
Embed all unembedded chunks:
python scripts/memory_embedder.py embed <db_path>
Embed specific source:
python scripts/memory_embedder.py embed <db_path> --source-id CMEM-20250101-120000
Re-embed all chunks (force):
python scripts/memory_embedder.py embed <db_path> --force
Adjust batch size (default 8):
python scripts/memory_embedder.py embed <db_path> --batch-size 16
Example output:
{
"success": true,
"chunks_processed": 50,
"chunks_failed": 0,
"elapsed_time": 12.34
}
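The summary JSON is handy for sanity-checking a run, for example to measure throughput while tuning --batch-size. A minimal sketch, again assuming the summary is printed to stdout (helper name and paths are illustrative):

import json
import subprocess

def embed_chunks(db_path: str, batch_size: int = 8) -> dict:
    """Embed all pending chunks and return the run summary."""
    out = subprocess.run(
        ["python", "scripts/memory_embedder.py", "embed", db_path,
         "--batch-size", str(batch_size)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

summary = embed_chunks("core_memory.db", batch_size=16)
if summary["success"] and summary["chunks_processed"]:
    print(f"{summary['chunks_processed'] / summary['elapsed_time']:.1f} chunks/second, "
          f"{summary['chunks_failed']} failed")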
3. Semantic Search
Basic search:
python scripts/memory_embedder.py search <db_path> "authentication flow"
Advanced search:
python scripts/memory_embedder.py search <db_path> "rate limiting" \
--top-k 5 \
--min-score 0.5 \
--type workflow
Example output:
{
"success": true,
"matches": [
{
"source_id": "WFS-20250101-auth",
"source_type": "workflow",
"chunk_index": 2,
"content": "Implemented JWT-based authentication...",
"score": 0.8542,
"restore_command": "ccw session resume WFS-20250101-auth"
}
]
}
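Each match carries everything needed to present a result, including its restore command (see Restore Commands below). A minimal sketch that runs a search and prints the matches, under the same stdout assumption as above; passing the query as an argument-list element also sidesteps shell quoting:

import json
import subprocess

def search_memory(db_path: str, query: str, top_k: int = 10) -> list:
    """Run a semantic search and return the matches array."""
    out = subprocess.run(
        ["python", "scripts/memory_embedder.py", "search", db_path, query,
         "--top-k", str(top_k)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)["matches"]

for m in search_memory("core_memory.db", "authentication flow"):
    print(f"[{m['score']:.2f}] {m['source_id']} ({m['source_type']})")
    print(f"  restore: {m['restore_command']}")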
Database Path
The database is located in CCW's storage directory:
- Windows: %USERPROFILE%\.ccw\projects\<project-id>\core-memory\core_memory.db
- Linux/Mac: ~/.ccw/projects/<project-id>/core-memory/core_memory.db
Find your project's database:
ccw memory list # Shows project path
# Then look in: ~/.ccw/projects/<hashed-path>/core-memory/core_memory.db
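Because <project-id> is a hashed path, it is often easier to enumerate the databases that actually exist. A minimal sketch based on the layout above (Path.home() resolves to %USERPROFILE% on Windows, so the same code works on all platforms):

from pathlib import Path

# List every CCW core-memory database under the user's home directory.
projects = Path.home() / ".ccw" / "projects"
for db in sorted(projects.glob("*/core-memory/core_memory.db")):
    print(db)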
Integration with CCW
This script is designed to be called from CCW's TypeScript code:
import { execSync } from 'child_process';

// dbPath and query are provided by the caller.
// Embed pending chunks; the script's JSON summary is read from stdout.
const result = execSync(
  `python scripts/memory_embedder.py embed ${dbPath}`,
  { encoding: 'utf-8' }
);
const { success, chunks_processed } = JSON.parse(result);

// Search. Interpolating the query into a shell string breaks on quotes and
// shell metacharacters; for untrusted input, prefer execFileSync with an
// argument array instead.
const searchResult = execSync(
  `python scripts/memory_embedder.py search ${dbPath} "${query}" --top-k 10`,
  { encoding: 'utf-8' }
);
const { matches } = JSON.parse(searchResult);
Performance
- Embedding speed: ~8 chunks/second (batch size 8)
- Search speed: ~0.1-0.5 seconds for 1000 chunks
- Model loading: ~0.8 seconds (cached after first use)
Source Types
- core_memory: Strategic architectural context
- workflow: Session-based development history
- cli_history: Command execution logs
Restore Commands
Search results include restore commands:
- core_memory / cli_history: ccw memory export <source_id>
- workflow: ccw session resume <source_id>
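When a search match is available, prefer its restore_command field directly; the mapping can also be rebuilt from source_type. A minimal sketch of that mapping:

def restore_command(source_type: str, source_id: str) -> str:
    """Rebuild the restore command for a memory source, as documented above."""
    if source_type == "workflow":
        return f"ccw session resume {source_id}"
    return f"ccw memory export {source_id}"  # core_memory and cli_history

print(restore_command("workflow", "WFS-20250101-auth"))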