Previously, embeddings were only generated for root directory files (1.6% coverage, 5/303 files).
This fix implements recursive processing across all subdirectory indexes, achieving 100% coverage
with 2,042 semantic chunks across all 303 files in 26 index databases.
Key improvements:
1. **Recursive embeddings generation** (embedding_manager.py):
- Add generate_embeddings_recursive() to process all _index.db files in directory tree
- Add get_embeddings_status() for comprehensive coverage statistics
- Add discover_all_index_dbs() helper for recursive file discovery
2. **Enhanced CLI commands** (commands.py):
- embeddings-generate: Add --recursive flag for full project coverage
- init: Use recursive generation by default for complete indexing
- status: Display embeddings coverage statistics with 50% threshold
3. **Smart search routing improvements** (smart-search.ts):
- Add 50% embeddings coverage threshold for hybrid mode routing
- Auto-fallback to exact mode when coverage is below the threshold
- Strip ANSI color codes from JSON output for correct parsing
- Add embeddings_coverage_percent to IndexStatus and SearchMetadata
- Provide clear warnings with actionable suggestions
4. **Documentation and analysis**:
- Add SMART_SEARCH_ANALYSIS.md with initial investigation
- Add SMART_SEARCH_CORRECTED_ANALYSIS.md revealing true extent of issue
- Add EMBEDDINGS_FIX_SUMMARY.md with complete fix summary
- Add check_embeddings.py script for coverage verification
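A minimal sketch of the pieces described above: recursive index discovery, the coverage calculation, the 50% routing threshold, and ANSI stripping. This is an illustration, not the code in this commit. All function names here (find_index_dbs, coverage_percent, choose_mode, strip_ansi) are hypothetical, the exact `_index.db` file name and the `>=` comparison at the threshold are assumptions, and the routing logic itself lives in smart-search.ts rather than in Python.

```python
import re
from pathlib import Path

INDEX_DB_NAME = "_index.db"   # assumed file name, taken from the text above
COVERAGE_THRESHOLD = 50.0     # percent, from the 50% threshold above

# Matches ANSI SGR (color/style) escape sequences such as "\x1b[32m".
ANSI_SGR_RE = re.compile(r"\x1b\[[0-9;]*m")


def find_index_dbs(root: Path) -> list[Path]:
    """Return every index database under root, including root itself."""
    return sorted(p for p in root.rglob(INDEX_DB_NAME) if p.is_file())


def coverage_percent(files_with_embeddings: int, total_files: int) -> float:
    """Share of indexed files that have embeddings, as a percentage."""
    if total_files == 0:
        return 0.0
    return 100.0 * files_with_embeddings / total_files


def choose_mode(coverage: float, threshold: float = COVERAGE_THRESHOLD) -> str:
    """Route to hybrid search only when enough files have embeddings.

    Assumption: coverage exactly at the threshold counts as sufficient.
    """
    return "hybrid" if coverage >= threshold else "exact"


def strip_ansi(text: str) -> str:
    """Remove color codes so CLI output can be parsed as JSON."""
    return ANSI_SGR_RE.sub("", text)
```

With the numbers from this commit, choose_mode(coverage_percent(5, 303)) selects exact mode and choose_mode(coverage_percent(303, 303)) selects hybrid mode.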
Results:
- Coverage improved from 1.6% (5/303 files) to 100% (303/303 files) - 60.6x increase
- Semantic chunks increased from 10 to 2,042 - 204x increase
- All 26 index databases now have embeddings, up from just 1
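The ratios can be rechecked from the raw counts with a few lines of arithmetic. The variable names are illustrative only. Note that the file counts give 303 / 5 = 60.6x, whereas dividing 100% by the rounded 1.6% would give 62.5x.

```python
# Counts taken from the results listed above.
files_before, files_after, total_files = 5, 303, 303
chunks_before, chunks_after = 10, 2042

coverage_before = 100 * files_before / total_files   # percent of files covered before
coverage_after = 100 * files_after / total_files     # percent of files covered after
file_ratio = files_after / files_before              # growth in covered files
chunk_ratio = chunks_after / chunks_before           # growth in semantic chunks

print(f"coverage: {coverage_before:.2f}% -> {coverage_after:.0f}%")
print(f"files: {file_ratio:.1f}x, chunks: {chunk_ratio:.1f}x")
```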
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>