Add comprehensive tests for query parsing and Reciprocal Rank Fusion

- Implemented tests for the QueryParser class, covering various identifier splitting methods (CamelCase, snake_case, kebab-case), OR expansion, and FTS5 operator preservation.
- Added parameterized tests to validate expected token outputs for different query formats.
- Created edge case tests to ensure robustness against unusual input scenarios.
- Developed tests for the Reciprocal Rank Fusion (RRF) algorithm, including score computation, weight handling, and result ranking across multiple sources.
- Included tests for normalization of BM25 scores and tagging search results with source metadata.
This commit is contained in:
catlog22
2025-12-16 10:20:19 +08:00
parent 35485bbbb1
commit 3da0ef2adb
39 changed files with 6171 additions and 240 deletions

View File

@@ -216,7 +216,7 @@ Before completion, verify:
{
"step": "analyze_module_structure",
"action": "Deep analysis of module structure and API",
"command": "ccw cli exec \"PURPOSE: Document module comprehensively\nTASK: Extract module purpose, architecture, public API, dependencies\nMODE: analysis\nCONTEXT: @**/* System: [system_context]\nEXPECTED: Complete module analysis for documentation\nRULES: $(cat ~/.claude/workflows/cli-templates/prompts/documentation/module-documentation.txt)\" --tool gemini --cd src/auth",
"command": "ccw cli exec \"PURPOSE: Document module comprehensively\nTASK: Extract module purpose, architecture, public API, dependencies\nMODE: analysis\nCONTEXT: @**/* System: [system_context]\nEXPECTED: Complete module analysis for documentation\nRULES: $(cat ~/.claude/workflows/cli-templates/prompts/documentation/module-documentation.txt)\" --tool gemini --mode analysis --cd src/auth",
"output_to": "module_analysis",
"on_error": "fail"
}

View File

@@ -364,7 +364,7 @@ api_id=$((group_count + 3))
},
{
"step": "analyze_project",
"command": "bash(gemini \"PURPOSE: Analyze project structure\\nTASK: Extract overview from modules\\nMODE: analysis\\nCONTEXT: [all_module_docs]\\nEXPECTED: Project outline\")",
"command": "bash(ccw cli exec \"PURPOSE: Analyze project structure\\nTASK: Extract overview from modules\\nMODE: analysis\\nCONTEXT: [all_module_docs]\\nEXPECTED: Project outline\" --tool gemini --mode analysis)",
"output_to": "project_outline"
}
],
@@ -404,7 +404,7 @@ api_id=$((group_count + 3))
"pre_analysis": [
{"step": "load_existing_docs", "command": "bash(cat .workflow/docs/${project_name}/{ARCHITECTURE,EXAMPLES}.md 2>/dev/null || echo 'No existing docs')", "output_to": "existing_arch_examples"},
{"step": "load_all_docs", "command": "bash(cat .workflow/docs/${project_name}/README.md && find .workflow/docs/${project_name} -type f -name '*.md' ! -path '*/README.md' ! -path '*/ARCHITECTURE.md' ! -path '*/EXAMPLES.md' ! -path '*/api/*' | xargs cat)", "output_to": "all_docs"},
{"step": "analyze_architecture", "command": "bash(gemini \"PURPOSE: Analyze system architecture\\nTASK: Synthesize architectural overview and examples\\nMODE: analysis\\nCONTEXT: [all_docs]\\nEXPECTED: Architecture + Examples outline\")", "output_to": "arch_examples_outline"}
{"step": "analyze_architecture", "command": "bash(ccw cli exec \"PURPOSE: Analyze system architecture\\nTASK: Synthesize architectural overview and examples\\nMODE: analysis\\nCONTEXT: [all_docs]\\nEXPECTED: Architecture + Examples outline\" --tool gemini --mode analysis)", "output_to": "arch_examples_outline"}
],
"implementation_approach": [
{
@@ -441,7 +441,7 @@ api_id=$((group_count + 3))
"pre_analysis": [
{"step": "discover_api", "command": "bash(rg 'router\\.| @(Get|Post)' -g '*.{ts,js}')", "output_to": "endpoint_discovery"},
{"step": "load_existing_api", "command": "bash(cat .workflow/docs/${project_name}/api/README.md 2>/dev/null || echo 'No existing API docs')", "output_to": "existing_api_docs"},
{"step": "analyze_api", "command": "bash(gemini \"PURPOSE: Document HTTP API\\nTASK: Analyze endpoints\\nMODE: analysis\\nCONTEXT: @src/api/**/* [endpoint_discovery]\\nEXPECTED: API outline\")", "output_to": "api_outline"}
{"step": "analyze_api", "command": "bash(ccw cli exec \"PURPOSE: Document HTTP API\\nTASK: Analyze endpoints\\nMODE: analysis\\nCONTEXT: @src/api/**/* [endpoint_discovery]\\nEXPECTED: API outline\" --tool gemini --mode analysis)", "output_to": "api_outline"}
],
"implementation_approach": [
{

View File

@@ -147,7 +147,7 @@ RULES:
- Identify key architecture patterns and technical constraints
- Extract integration points and development standards
- Output concise, structured format
" --tool ${tool}
" --tool ${tool} --mode analysis
\`\`\`
### Step 4: Generate Core Content Package

View File

@@ -198,7 +198,7 @@ Objectives:
CONTEXT: @IMPL_PLAN.md @workflow-session.json
EXPECTED: Structured lessons and conflicts in JSON format
RULES: Template reference from skill-aggregation.txt
" --tool gemini --cd .workflow/.archives/{session_id}
" --tool gemini --mode analysis --cd .workflow/.archives/{session_id}
3.5. **Generate SKILL.md Description** (CRITICAL for auto-loading):
@@ -345,7 +345,7 @@ Objectives:
CONTEXT: [Provide aggregated JSON data]
EXPECTED: Final aggregated structure for SKILL documents
RULES: Template reference from skill-aggregation.txt
" --tool gemini
" --tool gemini --mode analysis
3. Read templates for formatting (same 4 templates as single mode)

View File

@@ -574,11 +574,11 @@ RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/02-review-code-q
# - Report findings directly
# Method 2: Gemini Review (recommended)
ccw cli exec "[Shared Prompt Template with artifacts]" --tool gemini
ccw cli exec "[Shared Prompt Template with artifacts]" --tool gemini --mode analysis
# CONTEXT includes: @**/* @${plan.json} [@${exploration.json}]
# Method 3: Qwen Review (alternative)
ccw cli exec "[Shared Prompt Template with artifacts]" --tool qwen
ccw cli exec "[Shared Prompt Template with artifacts]" --tool qwen --mode analysis
# Same prompt as Gemini, different execution engine
# Method 4: Codex Review (autonomous)

View File

@@ -139,7 +139,7 @@ EXPECTED:
- Red-Green-Refactor cycle validation
- Best practices adherence assessment
RULES: Focus on TDD best practices and workflow adherence. Be specific about violations and improvements.
" --tool gemini --cd project-root > .workflow/active/{sessionId}/TDD_COMPLIANCE_REPORT.md
" --tool gemini --mode analysis --cd project-root > .workflow/active/{sessionId}/TDD_COMPLIANCE_REPORT.md
```
**Output**: TDD_COMPLIANCE_REPORT.md

View File

@@ -152,7 +152,7 @@ Task(subagent_type="cli-execution-agent", prompt=`
- ModuleOverlap conflicts with overlap_analysis
- Targeted clarification questions
RULES: $(cat ~/.claude/workflows/cli-templates/prompts/analysis/02-analyze-code-patterns.txt) | Focus on breaking changes, migration needs, and functional overlaps | Prioritize exploration-identified conflicts | analysis=READ-ONLY
" --tool gemini --cd {project_root}
" --tool gemini --mode analysis --cd {project_root}
Fallback: Qwen (same prompt) → Claude (manual analysis)

View File

@@ -187,7 +187,7 @@ Task(subagent_type="ui-design-agent",
CONTEXT: @**/*.css @**/*.scss @**/*.js @**/*.ts
EXPECTED: JSON report listing conflicts with file:line, values, semantic context
RULES: Focus on core tokens | Report ALL variants | analysis=READ-ONLY
\" --tool gemini --cd ${source}
\" --tool gemini --mode analysis --cd ${source}
\`\`\`
**Step 1: Load file list**
@@ -302,7 +302,7 @@ Task(subagent_type="ui-design-agent",
CONTEXT: @**/*.css @**/*.scss @**/*.js @**/*.ts
EXPECTED: JSON report listing frameworks, animation types, file locations
RULES: Focus on framework consistency | Map all animations | analysis=READ-ONLY
\" --tool gemini --cd ${source}
\" --tool gemini --mode analysis --cd ${source}
\`\`\`
**Step 1: Load file list**
@@ -381,7 +381,7 @@ Task(subagent_type="ui-design-agent",
CONTEXT: @**/*.css @**/*.scss @**/*.js @**/*.ts @**/*.html
EXPECTED: JSON report categorizing components, layout patterns, naming conventions
RULES: Focus on component reusability | Identify layout systems | analysis=READ-ONLY
\" --tool gemini --cd ${source}
\" --tool gemini --mode analysis --cd ${source}
\`\`\`
**Step 1: Load file list**

View File

@@ -61,10 +61,13 @@ RULES: $(cat ~/.claude/workflows/cli-templates/prompts/[category]/[template].txt
ccw cli exec "<PROMPT>" --tool <gemini|qwen|codex> --mode <analysis|write|auto>
```
**⚠️ CRITICAL**: `--mode` parameter is **MANDATORY** for all CLI executions. No defaults are assumed.
### Core Principles
- **Use tools early and often** - Tools are faster and more thorough
- **Unified CLI** - Always use `ccw cli exec` for consistent parameter handling
- **Mode is MANDATORY** - ALWAYS explicitly specify `--mode analysis|write|auto` (no implicit defaults)
- **One template required** - ALWAYS reference exactly ONE template in RULES (use universal fallback if no specific match)
- **Write protection** - Require EXPLICIT `--mode write` or `--mode auto`
- **No escape characters** - NEVER use `\$`, `\"`, `\'` in CLI commands
@@ -103,12 +106,12 @@ RULES: $(cat ~/.claude/workflows/cli-templates/protocols/write-protocol.md) $(ca
### Gemini & Qwen
**Via CCW**: `ccw cli exec "<prompt>" --tool gemini` or `--tool qwen`
**Via CCW**: `ccw cli exec "<prompt>" --tool gemini --mode analysis` or `--tool qwen --mode analysis`
**Characteristics**:
- Large context window, pattern recognition
- Best for: Analysis, documentation, code exploration, architecture review
- Default MODE: `analysis` (read-only)
- Recommended MODE: `analysis` (read-only) for analysis tasks, `write` for file creation
- Priority: Prefer Gemini; use Qwen as fallback
**Models** (override via `--model`):
@@ -133,8 +136,8 @@ RULES: $(cat ~/.claude/workflows/cli-templates/protocols/write-protocol.md) $(ca
**Resume via `--resume` parameter**:
```bash
ccw cli exec "Continue analyzing" --resume # Resume last session
ccw cli exec "Fix issues found" --resume <id> # Resume specific session
ccw cli exec "Continue analyzing" --tool gemini --mode analysis --resume # Resume last session
ccw cli exec "Fix issues found" --tool codex --mode auto --resume <id> # Resume specific session
```
| Value | Description |
@@ -213,7 +216,7 @@ rg "export.*Component" --files-with-matches --type ts
CONTEXT: @components/Auth.tsx @types/auth.d.ts | Memory: Previous type refactoring
# Step 3: Execute CLI
ccw cli exec "..." --tool gemini --cd src
ccw cli exec "..." --tool gemini --mode analysis --cd src
```
### RULES Configuration
@@ -289,7 +292,7 @@ RULES: $(cat ~/.claude/workflows/cli-templates/prompts/universal/00-universal-ri
| Option | Description | Default |
|--------|-------------|---------|
| `--tool <tool>` | gemini, qwen, codex | gemini |
| `--mode <mode>` | analysis, write, auto | analysis |
| `--mode <mode>` | **REQUIRED**: analysis, write, auto | **NONE** (must specify) |
| `--model <model>` | Model override | auto-select |
| `--cd <path>` | Working directory | current |
| `--includeDirs <dirs>` | Additional directories (comma-separated) | none |
@@ -314,10 +317,10 @@ When using `--cd`:
```bash
# Single directory
ccw cli exec "CONTEXT: @**/* @../shared/**/*" --cd src/auth --includeDirs ../shared
ccw cli exec "CONTEXT: @**/* @../shared/**/*" --tool gemini --mode analysis --cd src/auth --includeDirs ../shared
# Multiple directories
ccw cli exec "..." --cd src/auth --includeDirs ../shared,../types,../utils
ccw cli exec "..." --tool gemini --mode analysis --cd src/auth --includeDirs ../shared,../types,../utils
```
**Rule**: If CONTEXT contains `@../dir/**/*`, MUST include `--includeDirs ../dir`
@@ -404,8 +407,8 @@ RULES: $(cat ~/.claude/workflows/cli-templates/prompts/development/02-refactor-c
**Codex Multiplier**: 3x allocated time (minimum 15min / 900000ms)
```bash
ccw cli exec "<prompt>" --tool gemini --timeout 600000 # 10 min
ccw cli exec "<prompt>" --tool codex --timeout 1800000 # 30 min
ccw cli exec "<prompt>" --tool gemini --mode analysis --timeout 600000 # 10 min
ccw cli exec "<prompt>" --tool codex --mode auto --timeout 1800000 # 30 min
```
### Permission Framework
@@ -413,9 +416,9 @@ ccw cli exec "<prompt>" --tool codex --timeout 1800000 # 30 min
**Single-Use Authorization**: Each execution requires explicit user instruction. Previous authorization does NOT carry over.
**Mode Hierarchy**:
- `analysis` (default): Read-only, safe for auto-execution
- `write`: Requires explicit `--mode write`
- `auto`: Requires explicit `--mode auto`
- `analysis`: Read-only, safe for auto-execution
- `write`: Create/Modify/Delete files - requires explicit `--mode write`
- `auto`: Full operations - requires explicit `--mode auto`
- **Exception**: User provides clear instructions like "modify", "create", "implement"
---