Commit Graph

48 Commits

Author SHA1 Message Date
catlog22
09d99abee6 Issue Queue: issue-exec-20260106-160325 (#52)
* feat(security): Secure dashboard server by default

## Solution Summary
- Solution-ID: SOL-DSC-002-1
- Issue-ID: DSC-002

## Tasks Completed
- [T1] JWT token manager (24h expiry, persisted secret/token)
- [T2] API auth middleware + localhost token endpoint
- [T3] Default bind 127.0.0.1, add --host with warning
- [T4] Localhost-only CORS with credentials + Vary
- [T5] SECURITY.md documentation + README link

## Verification
- npm run build
- npm test -- ccw/tests/token-manager.test.ts ccw/tests/middleware.test.ts ccw/tests/server-auth.integration.test.ts ccw/tests/server.test.ts ccw/tests/cors.test.ts

* fix(security): Prevent command injection in Windows spawn()

## Solution Summary
- **Solution-ID**: SOL-DSC-001-1
- **Issue-ID**: DSC-001
- **Risk/Impact/Complexity**: high/high/medium

## Tasks Completed
- [T1] Create Windows shell escape utility
- [T2] Escape cli-executor spawn() args on Windows
- [T3] Add command injection regression tests

## Files Modified
- ccw/src/utils/shell-escape.ts
- ccw/src/tools/cli-executor.ts
- ccw/tests/shell-escape.test.ts
- ccw/tests/security/command-injection.test.ts

## Verification
- npm run build
- npm test -- ccw/tests/shell-escape.test.ts ccw/tests/security/command-injection.test.ts

* fix(security): Harden path validation (DSC-005)

## Solution Summary
- Solution-ID: SOL-DSC-005-1
- Issue-ID: DSC-005

## Tasks Completed
- T1: Refactor path validation to pre-resolution checking
- T2: Implement allowlist-based path validation
- T3: Add path validation to API routes
- T4: Add path security regression tests

## Files Modified
- ccw/src/utils/path-resolver.ts
- ccw/src/utils/path-validator.ts
- ccw/src/core/routes/graph-routes.ts
- ccw/src/core/routes/files-routes.ts
- ccw/src/core/routes/skills-routes.ts
- ccw/tests/path-resolver.test.ts
- ccw/tests/graph-routes.test.ts
- ccw/tests/files-routes.test.ts
- ccw/tests/skills-routes.test.ts
- ccw/tests/security/path-traversal.test.ts

## Verification
- npm run build
- npm test -- path-resolver.test.ts
- npm test -- path-validator.test.ts
- npm test -- graph-routes.test.ts
- npm test -- files-routes.test.ts
- npm test -- skills-routes.test.ts
- npm test -- ccw/tests/security/path-traversal.test.ts

* fix(security): Prevent credential leakage (DSC-004)

## Solution Summary
- Solution-ID: SOL-DSC-004-1
- Issue-ID: DSC-004

## Tasks Completed
- T1: Create credential handling security tests
- T2: Add log sanitization tests
- T3: Add env var leakage prevention tests
- T4: Add secure storage tests

## Files Modified
- ccw/src/config/litellm-api-config-manager.ts
- ccw/src/core/routes/litellm-api-routes.ts
- ccw/tests/security/credential-handling.test.ts

## Verification
- npm run build
- node --experimental-strip-types --test ccw/tests/security/credential-handling.test.ts

* test(ranking): expand normalize_weights edge case coverage (ISS-1766920108814-0)

## Solution Summary
- Solution-ID: SOL-20251228113607
- Issue-ID: ISS-1766920108814-0

## Tasks Completed
- T1: Fix NaN and invalid total handling in normalize_weights
- T2: Add unit tests for NaN edge cases in normalize_weights

## Files Modified
- codex-lens/tests/test_rrf_fusion.py

## Verification
- python -m pytest codex-lens/tests/test_rrf_fusion.py::TestNormalizeBM25Score -v
- python -m pytest codex-lens/tests/test_rrf_fusion.py -v -k normalize
- python -m pytest codex-lens/tests/test_rrf_fusion.py::TestReciprocalRankFusion::test_weight_normalization codex-lens/tests/test_cli_hybrid_search.py::TestCLIHybridSearch::test_weights_normalization -v

* feat(security): Add CSRF protection and tighten CORS (DSC-006)

## Solution Summary
- Solution-ID: SOL-DSC-006-1
- Issue-ID: DSC-006
- Risk/Impact/Complexity: high/high/medium

## Tasks Completed
- T1: Create CSRF token generation system
- T2: Add CSRF token endpoints
- T3: Implement CSRF validation middleware
- T4: Restrict CORS to trusted origins
- T5: Add CSRF security tests

## Files Modified
- ccw/src/core/auth/csrf-manager.ts
- ccw/src/core/auth/csrf-middleware.ts
- ccw/src/core/routes/auth-routes.ts
- ccw/src/core/server.ts
- ccw/tests/csrf-manager.test.ts
- ccw/tests/auth-routes.test.ts
- ccw/tests/csrf-middleware.test.ts
- ccw/tests/security/csrf.test.ts

## Verification
- npm run build
- node --experimental-strip-types --test ccw/tests/csrf-manager.test.ts
- node --experimental-strip-types --test ccw/tests/auth-routes.test.ts
- node --experimental-strip-types --test ccw/tests/csrf-middleware.test.ts
- node --experimental-strip-types --test ccw/tests/cors.test.ts
- node --experimental-strip-types --test ccw/tests/security/csrf.test.ts

* fix(cli-executor): prevent stale SIGKILL timeouts

## Solution Summary

- Solution-ID: SOL-DSC-007-1

- Issue-ID: DSC-007

- Risk/Impact/Complexity: low/low/low

## Tasks Completed

- [T1] Store timeout handle in killCurrentCliProcess

## Files Modified

- ccw/src/tools/cli-executor.ts

- ccw/tests/cli-executor-kill.test.ts

## Verification

- node --experimental-strip-types --test ccw/tests/cli-executor-kill.test.ts

* fix(cli-executor): enhance merge validation guards

## Solution Summary

- Solution-ID: SOL-DSC-008-1

- Issue-ID: DSC-008

- Risk/Impact/Complexity: low/low/low

## Tasks Completed

- [T1] Enhance sourceConversations array validation

## Files Modified

- ccw/src/tools/cli-executor.ts

- ccw/tests/cli-executor-merge-validation.test.ts

## Verification

- node --experimental-strip-types --test ccw/tests/cli-executor-merge-validation.test.ts

* refactor(core): remove @ts-nocheck from core routes

## Solution Summary
- Solution-ID: SOL-DSC-003-1
- Issue-ID: DSC-003
- Queue-ID: QUE-20260106-164500
- Item-ID: S-9

## Tasks Completed
- T1: Create shared RouteContext type definition
- T2: Remove @ts-nocheck from small route files
- T3: Remove @ts-nocheck from medium route files
- T4: Remove @ts-nocheck from large route files
- T5: Remove @ts-nocheck from remaining core files

## Files Modified
- ccw/src/core/dashboard-generator-patch.ts
- ccw/src/core/dashboard-generator.ts
- ccw/src/core/routes/ccw-routes.ts
- ccw/src/core/routes/claude-routes.ts
- ccw/src/core/routes/cli-routes.ts
- ccw/src/core/routes/codexlens-routes.ts
- ccw/src/core/routes/discovery-routes.ts
- ccw/src/core/routes/files-routes.ts
- ccw/src/core/routes/graph-routes.ts
- ccw/src/core/routes/help-routes.ts
- ccw/src/core/routes/hooks-routes.ts
- ccw/src/core/routes/issue-routes.ts
- ccw/src/core/routes/litellm-api-routes.ts
- ccw/src/core/routes/litellm-routes.ts
- ccw/src/core/routes/mcp-routes.ts
- ccw/src/core/routes/mcp-routes.ts.backup
- ccw/src/core/routes/mcp-templates-db.ts
- ccw/src/core/routes/nav-status-routes.ts
- ccw/src/core/routes/rules-routes.ts
- ccw/src/core/routes/session-routes.ts
- ccw/src/core/routes/skills-routes.ts
- ccw/src/core/routes/status-routes.ts
- ccw/src/core/routes/system-routes.ts
- ccw/src/core/routes/types.ts
- ccw/src/core/server.ts
- ccw/src/core/websocket.ts

## Verification
- npm run build
- npm test

* refactor: split cli-executor and codexlens routes into modules

## Solution Summary
- Solution-ID: SOL-DSC-012-1
- Issue-ID: DSC-012
- Risk/Impact/Complexity: medium/medium/high

## Tasks Completed
- [T1] Extract execution orchestration from cli-executor.ts (Refactor ccw/src/tools)
- [T2] Extract route handlers from codexlens-routes.ts (Refactor ccw/src/core/routes)
- [T3] Extract prompt concatenation logic from cli-executor (Refactor ccw/src/tools)
- [T4] Document refactored module architecture (Docs)

## Files Modified
- ccw/src/tools/cli-executor.ts
- ccw/src/tools/cli-executor-core.ts
- ccw/src/tools/cli-executor-utils.ts
- ccw/src/tools/cli-executor-state.ts
- ccw/src/tools/cli-prompt-builder.ts
- ccw/src/tools/README.md
- ccw/src/core/routes/codexlens-routes.ts
- ccw/src/core/routes/codexlens/config-handlers.ts
- ccw/src/core/routes/codexlens/index-handlers.ts
- ccw/src/core/routes/codexlens/semantic-handlers.ts
- ccw/src/core/routes/codexlens/watcher-handlers.ts
- ccw/src/core/routes/codexlens/utils.ts
- ccw/src/core/routes/codexlens/README.md

## Verification
- npm run build
- npm test

* test(issue): Add comprehensive issue command tests

## Solution Summary
- **Solution-ID**: SOL-DSC-009-1
- **Issue-ID**: DSC-009
- **Risk/Impact/Complexity**: low/high/medium

## Tasks Completed
- [T1] Create issue command test file structure: Create isolated test harness
- [T2] Add JSONL read/write operation tests: Verify JSONL correctness and errors
- [T3] Add issue lifecycle tests: Verify status transitions and timestamps
- [T4] Add solution binding tests: Verify binding flows and error cases
- [T5] Add queue formation tests: Verify queue creation, IDs, and DAG behavior
- [T6] Add queue execution tests: Verify next/done/retry and status sync

## Files Modified
- ccw/src/commands/issue.ts
- ccw/tests/issue-command.test.ts

## Verification
- node --experimental-strip-types --test ccw/tests/issue-command.test.ts

* test(routes): Add integration tests for route modules

## Solution Summary
- Solution-ID: SOL-DSC-010-1
- Issue-ID: DSC-010
- Queue-ID: QUE-20260106-164500

## Tasks Completed
- [T1] Add tests for ccw-routes.ts
- [T2] Add tests for files-routes.ts
- [T3] Add tests for claude-routes.ts (includes Windows path fix for create)
- [T4] Add tests for issue-routes.ts
- [T5] Add tests for help-routes.ts (avoid hanging watchers)
- [T6] Add tests for nav-status-routes.ts
- [T7] Add tests for hooks/graph/rules/skills/litellm-api routes

## Files Modified
- ccw/src/core/routes/claude-routes.ts
- ccw/src/core/routes/help-routes.ts
- ccw/tests/integration/ccw-routes.test.ts
- ccw/tests/integration/claude-routes.test.ts
- ccw/tests/integration/files-routes.test.ts
- ccw/tests/integration/issue-routes.test.ts
- ccw/tests/integration/help-routes.test.ts
- ccw/tests/integration/nav-status-routes.test.ts
- ccw/tests/integration/hooks-routes.test.ts
- ccw/tests/integration/graph-routes.test.ts
- ccw/tests/integration/rules-routes.test.ts
- ccw/tests/integration/skills-routes.test.ts
- ccw/tests/integration/litellm-api-routes.test.ts

## Verification
- node --experimental-strip-types --test ccw/tests/integration/ccw-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/files-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/claude-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/issue-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/help-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/nav-status-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/hooks-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/graph-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/rules-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/skills-routes.test.ts
- node --experimental-strip-types --test ccw/tests/integration/litellm-api-routes.test.ts

* refactor(core): Switch cache and lite scanning to async fs

## Solution Summary
- Solution-ID: SOL-DSC-013-1
- Issue-ID: DSC-013
- Queue-ID: QUE-20260106-164500

## Tasks Completed
- [T1] Convert cache-manager.ts to async file operations
- [T2] Convert lite-scanner.ts to async file operations
- [T3] Update cache-manager call sites to await async API
- [T4] Update lite-scanner call sites to await async API

## Files Modified
- ccw/src/core/cache-manager.ts
- ccw/src/core/lite-scanner.ts
- ccw/src/core/data-aggregator.ts

## Verification
- npm run build
- npm test

* fix(exec): Add timeout protection for execSync

## Solution Summary
- Solution-ID: SOL-DSC-014-1
- Issue-ID: DSC-014
- Queue-ID: QUE-20260106-164500

## Tasks Completed
- [T1] Add timeout to execSync calls in python-utils.ts
- [T2] Add timeout to execSync calls in detect-changed-modules.ts
- [T3] Add timeout to execSync calls in claude-freshness.ts
- [T4] Add timeout to execSync calls in issue.ts
- [T5] Consolidate execSync timeout constants and audit coverage

## Files Modified
- ccw/src/utils/exec-constants.ts
- ccw/src/utils/python-utils.ts
- ccw/src/tools/detect-changed-modules.ts
- ccw/src/core/claude-freshness.ts
- ccw/src/commands/issue.ts
- ccw/src/tools/smart-search.ts
- ccw/src/tools/codex-lens.ts
- ccw/src/core/routes/codexlens/config-handlers.ts

## Verification
- npm run build
- npm test
- node --experimental-strip-types --test ccw/tests/issue-command.test.ts

* feat(cli): Add progress spinner with elapsed time for long-running operations

## Solution Summary
- Solution-ID: SOL-DSC-015-1
- Issue-ID: DSC-015
- Queue-Item: S-15
- Risk/Impact/Complexity: low/medium/low

## Tasks Completed
- [T1] Add progress spinner to CLI execution: Update ccw/src/commands/cli.ts

## Files Modified
- ccw/src/commands/cli.ts
- ccw/tests/cli-command.test.ts

## Verification
- node --experimental-strip-types --test ccw/tests/cli-command.test.ts
- node --experimental-strip-types --test ccw/tests/cli-executor-kill.test.ts
- node --experimental-strip-types --test ccw/tests/cli-executor-merge-validation.test.ts

* fix(cli): Move full output hint immediately after truncation notice

## Solution Summary
- Solution-ID: SOL-DSC-016-1
- Issue-ID: DSC-016
- Queue-Item: S-16
- Risk/Impact/Complexity: low/high/low

## Tasks Completed
- [T1] Relocate output hint after truncation: Update ccw/src/commands/cli.ts

## Files Modified
- ccw/src/commands/cli.ts
- ccw/tests/cli-command.test.ts

## Verification
- npm run build
- node --experimental-strip-types --test ccw/tests/cli-command.test.ts

* feat(cli): Add confirmation prompts for destructive operations

## Solution Summary
- Solution-ID: SOL-DSC-017-1
- Issue-ID: DSC-017
- Queue-Item: S-17
- Risk/Impact/Complexity: low/high/low

## Tasks Completed
- [T1] Add confirmation to storage clean operations: Update ccw/src/commands/cli.ts
- [T2] Add confirmation to issue queue delete: Update ccw/src/commands/issue.ts

## Files Modified
- ccw/src/commands/cli.ts
- ccw/src/commands/issue.ts
- ccw/tests/cli-command.test.ts
- ccw/tests/issue-command.test.ts

## Verification
- npm run build
- node --experimental-strip-types --test ccw/tests/cli-command.test.ts
- node --experimental-strip-types --test ccw/tests/issue-command.test.ts

* feat(cli): Improve multi-line prompt guidance

## Solution Summary
- Solution-ID: SOL-DSC-018-1
- Issue-ID: DSC-018
- Queue-Item: S-18
- Risk/Impact/Complexity: low/medium/low

## Tasks Completed
- [T1] Update CLI help to emphasize --file option: Update ccw/src/commands/cli.ts
- [T2] Add inline hint for multi-line detection: Update ccw/src/commands/cli.ts

## Files Modified
- ccw/src/commands/cli.ts
- ccw/tests/cli-command.test.ts

## Verification
- npm run build
- node --experimental-strip-types --test ccw/tests/cli-command.test.ts

---------

Co-authored-by: catlog22 <catlog22@github.com>
2026-01-07 22:35:46 +08:00
catlog22
e21d801523 feat: Add multi-type embedding backends for cascade retrieval
- Implemented BinaryEmbeddingBackend for fast coarse filtering using 256-dimensional binary vectors.
- Developed DenseEmbeddingBackend for high-precision dense vectors (2048 dimensions) for reranking.
- Created CascadeEmbeddingBackend to combine binary and dense embeddings for two-stage retrieval.
- Introduced utility functions for embedding conversion and distance computation.

chore: Migration 010 - Add multi-vector storage support

- Added 'chunks' table to support multi-vector embeddings for cascade retrieval.
- Included new columns: embedding_binary (256-dim) and embedding_dense (2048-dim) for efficient storage.
- Implemented upgrade and downgrade functions to manage schema changes and data migration.
2026-01-02 10:52:43 +08:00
catlog22
520f2d26f2 feat(codex-lens): add unified reranker architecture and file watcher
Unified Reranker Architecture:
- Add BaseReranker ABC with factory pattern
- Implement 4 backends: ONNX (default), API, LiteLLM, Legacy
- Add .env configuration parsing for API credentials
- Migrate from sentence-transformers to optimum+onnxruntime

File Watcher Module:
- Add real-time file system monitoring with watchdog
- Implement IncrementalIndexer for single-file updates
- Add WatcherManager with signal handling and graceful shutdown
- Add 'codexlens watch' CLI command
- Event filtering, debouncing, and deduplication
- Thread-safe design with proper resource cleanup

Tests: 16 watcher tests + 5 reranker test files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 13:23:52 +08:00
catlog22
31a45f1f30 Add graph expansion and cross-encoder reranking features
- Implemented GraphExpander to enhance search results with related symbols using precomputed neighbors.
- Added CrossEncoderReranker for second-stage search ranking, allowing for improved result scoring.
- Created migrations to establish necessary database tables for relationships and graph neighbors.
- Developed tests for graph expansion functionality, ensuring related results are populated correctly.
- Enhanced performance benchmarks for cross-encoder reranking latency and graph expansion overhead.
- Updated schema cleanup tests to reflect changes in versioning and deprecated fields.
- Added new test cases for Treesitter parser to validate relationship extraction with alias resolution.
2025-12-31 16:58:59 +08:00
catlog22
70f8b14eaa refactor(vector_store): use safer SQL query construction pattern
Replaces f-string interpolation with safer string formatting.
Adds documentation on SQL injection prevention.
No functional changes - parameterized queries still used.

Fixes: ISS-1766921318981-9

Solution-ID: SOL-1735386000-9
Issue-ID: ISS-1766921318981-9
Task-ID: T1
2025-12-29 20:09:49 +08:00
catlog22
0c8b2f2ec9 fix(vector_store): add bounds checking for chunk ID generation
Prevents potential integer overflow when start_id is near sys.maxsize.
Adds validation before range() calculation in batch insert methods.

Fixes: ISS-1766921318981-6

Solution-ID: SOL-1735386000-6
Issue-ID: ISS-1766921318981-6
Task-ID: T1
2025-12-29 20:02:19 +08:00
catlog22
c56104c082 fix(vector_store): add null check for ANN search results before filtering
Prevents errors when HNSW search returns null/empty results due to race conditions.
Adds validation for ids and distances before zip operation.

Fixes: ISS-1766921318981-5

Solution-ID: SOL-1735386000-5
Issue-ID: ISS-1766921318981-5
Task-ID: T1
2025-12-29 19:53:32 +08:00
catlog22
7f4433e449 fix(vector_store): add parameter validation for min_score range
Validates min_score is within [0.0, 1.0] for cosine similarity.
Raises ValueError for out-of-range values to prevent unexpected filtering.

Fixes: ISS-1766921318981-14

Solution-ID: SOL-1735386000-14
Issue-ID: ISS-1766921318981-14
Task-ID: T1
2025-12-29 19:46:26 +08:00
catlog22
60fbb4177c fix(config): add specific exception handling for path operations
Replaces generic Exception handling with specific PermissionError and OSError
handling in __post_init__ and ensure_runtime_dirs(). Provides clear diagnostic
messages to distinguish permission issues from other filesystem errors.

Solution-ID: SOL-1735385400008
Issue-ID: ISS-1766921318981-8
Task-ID: T1
2025-12-29 19:34:27 +08:00
catlog22
5914b1c5fc fix(vector-store): protect bulk insert mode transitions with lock
Ensure begin_bulk_insert() and end_bulk_insert() are fully
lock-protected to prevent TOCTOU race conditions.

Solution-ID: SOL-1735392000003
Issue-ID: ISS-1766921318981-12
Task-ID: T2
2025-12-29 19:20:02 +08:00
catlog22
d8be23fa83 fix(vector-store): add lock protection for bulk insert mode flag
Protect _bulk_insert_mode flag and accumulation lists with
_ann_write_lock to prevent corruption during concurrent access.

Solution-ID: SOL-1735392000003
Issue-ID: ISS-1766921318981-12
Task-ID: T1
2025-12-29 19:16:30 +08:00
catlog22
3fdd52742b fix(storage): handle rollback failures in batch operations
Adds nested exception handling in add_files() and _migrate_fts_to_external()
to catch and log rollback failures. Uses exception chaining to preserve both
transaction and rollback errors, preventing silent database inconsistency.

Solution-ID: SOL-1735385400010
Issue-ID: ISS-1766921318981-10
Task-ID: T1
2025-12-29 19:08:49 +08:00
catlog22
76ab4d67fe test(entities): add zero vector validation tests
Add comprehensive test coverage for zero and near-zero vector
detection in SemanticChunk embedding validation.

Solution-ID: SOL-20251228113612
Issue-ID: ISS-1766921318981-7
Task-ID: T2
2025-12-29 19:03:20 +08:00
catlog22
6a73d3c379 fix(search): handle path operation failures in symbol filtering
Adds robust exception handling for os.path.commonpath() in search_symbols()
to prevent crashes on malformed paths and Windows cross-drive scenarios.
Invalid symbols are skipped with debug logging, search continues.

Solution-ID: SOL-1735385400004
Issue-ID: ISS-1766921318981-4
Task-ID: T1
2025-12-29 18:59:10 +08:00
catlog22
5d5652c2c5 fix(sqlite-store): improve thread tracking in connection cleanup
Add fallback validation to detect dead threads missed by
threading.enumerate(), ensuring all stale connections are cleaned.

Solution-ID: SOL-1735392000002
Issue-ID: ISS-1766921318981-3
Task-ID: T2
2025-12-29 18:50:22 +08:00
catlog22
b958a1ea96 fix(sqlite-store): add periodic cleanup timer for connection pool
Implement background timer to proactively clean stale connections
every 5 minutes, preventing indefinite accumulation.

Solution-ID: SOL-1735392000002
Issue-ID: ISS-1766921318981-3
Task-ID: T1
2025-12-29 18:43:55 +08:00
catlog22
9a45732a39 test(codex-lens): add connection pool stress tests
Solution-ID: SOL-1735410004
Issue-ID: ISS-1766921318981-24
Task-ID: T3
2025-12-29 18:16:03 +08:00
catlog22
015b46e58b test(codex-lens): add concurrent write operation tests
Solution-ID: SOL-1735410004
Issue-ID: ISS-1766921318981-24
Task-ID: T2
2025-12-29 18:12:09 +08:00
catlog22
042a99dbe3 test(codex-lens): add concurrent read operation tests
Solution-ID: SOL-1735410004

Issue-ID: ISS-1766921318981-24

Task-ID: T1
2025-12-29 17:59:08 +08:00
catlog22
1396010437 fix(embedder): add lock protection for cache read operations
Protect fast path cache read in get_embedder() to prevent KeyError

during concurrent access and cache clearing operations.

Solution-ID: SOL-1735392000001

Issue-ID: ISS-1766921318981-2

Task-ID: T1
2025-12-29 12:33:23 +08:00
catlog22
84d06f4273 fix(registry): normalize path case for comparison on Windows
Adds case normalization for path comparison on Windows to handle
case-insensitive filesystem behavior. Preserves case-sensitivity on Unix.

Fixes: ISS-1766921318981-13

Solution-ID: SOL-1735386000-13
Issue-ID: ISS-1766921318981-13
Task-ID: T1
2025-12-28 21:51:23 +08:00
catlog22
af2ff54cb7 test(vector-store): add epsilon tolerance edge case tests
Add comprehensive test coverage for near-zero norms, product
underflow, and floating point precision edge cases in
_cosine_similarity function.

Solution-ID: SOL-20251228113619
Issue-ID: ISS-1766921318981-11
Task-ID: T2
2025-12-28 21:37:59 +08:00
catlog22
93dcdd2293 fix(config): log configuration loading errors instead of silently ignoring
Replaces bare exception handler in load_settings() with logging.warning()
to help users debug configuration file issues (syntax errors, permissions).
Maintains backward compatibility - errors do not break initialization.

Solution-ID: SOL-1735385400001
Issue-ID: ISS-1766921318981-1
Task-ID: T1
2025-12-28 21:06:23 +08:00
catlog22
58caccb250 test(ranking): add edge case tests for normalize_weights
Add comprehensive test coverage for NaN, infinity, and all-None
edge cases in weight normalization to prevent regression.

Solution-ID: SOL-20251228113631
Issue-ID: ISS-1766921318981-0
Task-ID: T2
2025-12-28 20:59:08 +08:00
catlog22
4061ae48c4 feat: Implement adaptive RRF weights and query intent detection
- Added integration tests for adaptive RRF weights in hybrid search.
- Enhanced query intent detection with new classifications: keyword, semantic, and mixed.
- Introduced symbol boosting in search results based on explicit symbol matches.
- Implemented embedding-based reranking with configurable options.
- Added global symbol index for efficient symbol lookups across projects.
- Improved file deletion handling on Windows to avoid permission errors.
- Updated chunk configuration to increase overlap for better context.
- Modified package.json test script to target specific test files.
- Created comprehensive writing style guidelines for documentation.
- Added TypeScript tests for query intent detection and adaptive weights.
- Established performance benchmarks for global symbol indexing.
2025-12-26 15:08:47 +08:00
catlog22
3b842ed290 feat(cli-executor): add streaming option and enhance output handling
- Introduced a `stream` parameter to control output streaming vs. caching.
- Enhanced status determination logic to prioritize valid output over exit codes.
- Updated output structure to include full stdout and stderr when not streaming.

feat(cli-history-store): extend conversation turn schema and migration

- Added `cached`, `stdout_full`, and `stderr_full` fields to the conversation turn schema.
- Implemented database migration to add new columns if they do not exist.
- Updated upsert logic to handle new fields.

feat(codex-lens): implement global symbol index for fast lookups

- Created `GlobalSymbolIndex` class to manage project-wide symbol indexing.
- Added methods for adding, updating, and deleting symbols in the global index.
- Integrated global index updates into directory indexing processes.

feat(codex-lens): optimize search functionality with global index

- Enhanced `ChainSearchEngine` to utilize the global symbol index for faster searches.
- Added configuration option to enable/disable global symbol indexing.
- Updated tests to validate global index functionality and performance.
2025-12-25 22:22:31 +08:00
catlog22
8e744597d1 feat: Implement CodexLens multi-provider embedding rotation management
- Added functions to get and update CodexLens embedding rotation configuration.
- Introduced functionality to retrieve enabled embedding providers for rotation.
- Created endpoints for managing rotation configuration via API.
- Enhanced dashboard UI to support multi-provider rotation configuration.
- Updated internationalization strings for new rotation features.
- Adjusted CLI commands and embedding manager to support increased concurrency limits.
- Modified hybrid search weights for improved ranking behavior.
2025-12-25 14:13:27 +08:00
catlog22
3c3ce55842 feat: 添加对 LiteLLM 嵌入后端的支持,增强并发 API 调用能力 2025-12-24 22:20:13 +08:00
catlog22
e671b45948 feat: Enhance configuration management and embedding capabilities
- Added JSON-based settings management in Config class for embedding and LLM configurations.
- Introduced methods to save and load settings from a JSON file.
- Updated BaseEmbedder and its subclasses to include max_tokens property for better token management.
- Enhanced chunking strategy to support recursive splitting of large symbols with improved overlap handling.
- Implemented comprehensive tests for recursive splitting and chunking behavior.
- Added CLI tools configuration management for better integration with external tools.
- Introduced a new command for compacting session memory into structured text for recovery.
2025-12-24 16:32:27 +08:00
catlog22
3e9a309079 refactor: 移除图索引功能,修复内存泄露,优化嵌入生成
主要更改:

1. 移除图索引功能 (graph indexing)
   - 删除 graph_analyzer.py 及相关迁移文件
   - 移除 CLI 的 graph 命令和 --enrich 标志
   - 清理 chain_search.py 中的图查询方法 (370行)
   - 删除相关测试文件

2. 修复嵌入生成内存问题
   - 重构 generate_embeddings.py 使用流式批处理
   - 改用 embedding_manager 的内存安全实现
   - 文件从 548 行精简到 259 行 (52.7% 减少)

3. 修复内存泄露
   - chain_search.py: quick_search 使用 with 语句管理 ChainSearchEngine
   - embedding_manager.py: 使用 with 语句管理 VectorStore
   - vector_store.py: 添加暴力搜索内存警告

4. 代码清理
   - 移除 Symbol 模型的 token_count 和 symbol_type 字段
   - 清理相关测试用例

测试: 760 passed, 7 skipped

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 16:22:03 +08:00
catlog22
b27d8a9570 feat: Add CLAUDE.md freshness tracking and update reminders
- Add SQLite table and CRUD methods for tracking update history
- Create freshness calculation service based on git file changes
- Add API endpoints for freshness data, marking updates, and history
- Display freshness badges in file tree (green/yellow/red indicators)
- Show freshness gauge and details in metadata panel
- Auto-mark files as updated after CLI sync
- Add English and Chinese i18n translations

Freshness algorithm: 100 - min((changedFilesCount / 20) * 100, 100)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 16:14:46 +08:00
catlog22
7adde91e9f feat: Add search result grouping by similarity score
Add functionality to group search results with similar content and scores
into a single representative result with additional locations.

Changes:
- Add AdditionalLocation entity model for storing grouped result locations
- Add additional_locations field to SearchResult for backward compatibility
- Implement group_similar_results() function in ranking.py with:
  - Content-based grouping (by excerpt or content field)
  - Score-based sub-grouping with configurable threshold
  - Metadata preservation with grouped_count tracking
- Add group_results and grouping_threshold options to SearchOptions
- Integrate grouping into ChainSearchEngine.search() after RRF fusion

Test coverage:
- 36 multi-level tests covering unit, boundary, integration, and performance
- Real-world scenario tests for RRF scores and duplicate code detection

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 16:33:44 +08:00
catlog22
2f0cce0089 feat: Enhance CodexLens indexing and search capabilities with new CLI options and improved error handling 2025-12-19 15:10:37 +08:00
catlog22
5e91ba6c60 Implement ANN index using HNSW algorithm and update related tests
- Added ANNIndex class for approximate nearest neighbor search using HNSW.
- Integrated ANN index with VectorStore for enhanced search capabilities.
- Updated test suite for ANN index, including tests for adding, searching, saving, and loading vectors.
- Modified existing tests to accommodate changes in search performance expectations.
- Improved error handling for file operations in tests to ensure compatibility with Windows file locks.
- Adjusted hybrid search performance assertions for increased stability in CI environments.
2025-12-19 10:35:29 +08:00
catlog22
17af615fe2 Add help view and core memory styles
- Introduced styles for the help view including tab transitions, accordion animations, search highlighting, and responsive design.
- Implemented core memory styles with modal base styles, memory card designs, and knowledge graph visualization.
- Enhanced dark mode support across various components.
- Added loading states and empty state designs for better user experience.
2025-12-18 18:29:45 +08:00
catlog22
b702791c2c Remove LLM enhancement features and related components as per user request. This includes the deletion of source code files, CLI commands, front-end components, tests, scripts, and documentation associated with LLM functionality. Simplified dependencies and reduced complexity while retaining core vector search capabilities. Validation of changes confirmed successful removal and functionality. 2025-12-16 21:38:27 +08:00
catlog22
d21066c282 Add scripts for inspecting LLM summaries and testing misleading comments
- Implement `inspect_llm_summaries.py` to display LLM-generated summaries from the semantic_chunks table in the database.
- Create `show_llm_analysis.py` to demonstrate LLM analysis of misleading code examples, highlighting discrepancies between comments and actual functionality.
- Develop `test_misleading_comments.py` to compare pure vector search with LLM-enhanced search, focusing on the impact of misleading or missing comments on search results.
- Introduce `test_llm_enhanced_search.py` to provide a test suite for evaluating the effectiveness of LLM-enhanced vector search against pure vector search.
- Ensure all new scripts are integrated with the existing codebase and follow the established coding standards.
2025-12-16 20:29:28 +08:00
catlog22
df23975a0b Add comprehensive tests for schema cleanup migration and search comparison
- Implement tests for migration 005 to verify removal of deprecated fields in the database schema.
- Ensure that new databases are created with a clean schema.
- Validate that keywords are correctly extracted from the normalized file_keywords table.
- Test symbol insertion without deprecated fields and subdir operations without direct_files.
- Create a detailed search comparison test to evaluate vector search vs hybrid search performance.
- Add a script for reindexing projects to extract code relationships and verify GraphAnalyzer functionality.
- Include a test script to check TreeSitter parser availability and relationship extraction from sample files.
2025-12-16 19:27:05 +08:00
catlog22
3da0ef2adb Add comprehensive tests for query parsing and Reciprocal Rank Fusion
- Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation.
- Added parameterized tests to validate expected token outputs for different query formats.
- Created edge case tests to ensure robustness against unusual input scenarios.
- Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources.
- Included tests for normalization of BM25 scores and tagging search results with source metadata.
2025-12-16 10:20:19 +08:00
catlog22
35485bbbb1 feat: Enhance navigation and cleanup for graph explorer view
- Added a cleanup function to reset the state when navigating away from the graph explorer.
- Updated navigation logic to call the cleanup function before switching views.
- Improved internationalization by adding new translations for graph-related terms.
- Adjusted icon sizes for better UI consistency in the graph explorer.
- Implemented impact analysis button functionality in the graph explorer.
- Refactored CLI tool configuration to use updated model names.
- Enhanced CLI executor to handle prompts correctly for codex commands.
- Introduced code relationship storage for better visualization in the index tree.
- Added support for parsing Markdown and plain text files in the symbol parser.
- Updated tests to reflect changes in language detection logic.
2025-12-15 23:11:01 +08:00
catlog22
97640a517a feat(storage): implement storage manager for centralized management and cleanup
- Added a new Storage Manager component to handle storage statistics, project cleanup, and configuration for CCW centralized storage.
- Introduced functions to calculate directory sizes, get project storage stats, and clean specific or all storage.
- Enhanced SQLiteStore with a public API for executing queries securely.
- Updated tests to utilize the new execute_query method and validate storage management functionalities.
- Improved performance by implementing connection pooling with idle timeout management in SQLiteStore.
- Added new fields (token_count, symbol_type) to the symbols table and adjusted related insertions.
- Enhanced error handling and logging for storage operations.
2025-12-15 17:39:38 +08:00
catlog22
0fe16963cd Add comprehensive tests for tokenizer, performance benchmarks, and TreeSitter parser functionality
- Implemented unit tests for the Tokenizer class, covering various text inputs, edge cases, and fallback mechanisms.
- Created performance benchmarks comparing tiktoken and pure Python implementations for token counting.
- Developed extensive tests for TreeSitterSymbolParser across Python, JavaScript, and TypeScript, ensuring accurate symbol extraction and parsing.
- Added configuration documentation for MCP integration and custom prompts, enhancing usability and flexibility.
- Introduced a refactor script for GraphAnalyzer to streamline future improvements.
2025-12-15 14:36:09 +08:00
catlog22
0529b57694 Implement database migration framework and performance optimizations
- Added active memory configuration for manual interval and Gemini tool.
- Created file modification rules for handling edits and writes.
- Implemented migration manager for managing database schema migrations.
- Added migration 001 to normalize keywords into separate tables.
- Developed tests for validating performance optimizations including keyword normalization, path lookup, and symbol search.
- Created validation script to manually verify optimization implementations.
2025-12-14 18:08:32 +08:00
catlog22
79a2953862 Add comprehensive tests for vector/semantic search functionality
- Implement full coverage tests for Embedder model loading and embedding generation
- Add CRUD operations and caching tests for VectorStore
- Include cosine similarity computation tests
- Validate semantic search accuracy and relevance through various queries
- Establish performance benchmarks for embedding and search operations
- Ensure edge cases and error handling are covered
- Test thread safety and concurrent access scenarios
- Verify availability of semantic search dependencies
2025-12-14 17:17:09 +08:00
catlog22
08dc0a0348 perf(codex-lens): optimize search performance with vectorized operations
Performance Optimizations:
- VectorStore: NumPy vectorized cosine similarity (100x+ faster)
  - Cached embedding matrix with pre-computed norms
  - Lazy content loading for top-k results only
  - Thread-safe cache invalidation
- SQLite: Added PRAGMA mmap_size=30GB for memory-mapped I/O
- FTS5: unicode61 tokenizer with tokenchars='_' for code identifiers
- ChainSearch: files_only fast path skipping snippet generation
- ThreadPoolExecutor: shared pool across searches

New Components:
- DirIndexStore: single-directory index with FTS5 and symbols
- RegistryStore: global project registry with path mappings
- PathMapper: source-to-index path conversion utility
- IndexTreeBuilder: hierarchical index tree construction
- ChainSearchEngine: parallel recursive directory search

Test Coverage:
- 36 comprehensive search functionality tests
- 14 performance benchmark tests
- 296 total tests passing (100% pass rate)

Benchmark Results:
- FTS5 search: 0.23-0.26ms avg (3900-4300 ops/sec)
- Vector search: 1.05-1.54ms avg (650-955 ops/sec)
- Full semantic: 4.56-6.38ms avg per query

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-14 11:06:24 +08:00
catlog22
4faa5f1c95 Add comprehensive tests for semantic chunking and search functionality
- Implemented tests for the ChunkConfig and Chunker classes, covering default and custom configurations.
- Added tests for symbol-based chunking, including single and multiple symbols, handling of empty symbols, and preservation of line numbers.
- Developed tests for sliding window chunking, ensuring correct chunking behavior with various content sizes and configurations.
- Created integration tests for semantic search, validating embedding generation, vector storage, and search accuracy across a complex codebase.
- Included performance tests for embedding generation and search operations.
- Established tests for chunking strategies, comparing symbol-based and sliding window approaches.
- Enhanced test coverage for edge cases, including handling of unicode characters and out-of-bounds symbol ranges.
2025-12-12 19:55:35 +08:00
catlog22
c42f91a7fe feat: Add support for Tree-Sitter parsing and enhance SQLite storage performance 2025-12-12 18:40:24 +08:00
catlog22
92d2085b64 Optimize SQLite FTS storage and pooling 2025-12-12 17:40:03 +08:00