fix: Add explicit JSON schema requirements to review cycle agent prompts

Move specific JSON structure requirements from cli-explore-agent (keep generic)
to review-*-cycle.md prompts. Key requirements now inline in prompts:
- Root must be array [{}] not object {}
- analysis_timestamp field (not timestamp/analyzed_at)
- Flat summary structure (not nested by_severity)
- Lowercase severity/id values
- Correct field names (snippet not code_snippet, impact not exploit_scenario)
catlog22
2025-11-26 15:15:30 +08:00
parent cf6a0f1bc0
commit 34a9a23d5b
5 changed files with 120 additions and 106 deletions

View File

@@ -29,10 +29,10 @@ Phase 1: Task Understanding
↓ Parse prompt for: analysis scope, output requirements, schema path
Phase 2: Analysis Execution
↓ Bash structural scan + Gemini semantic analysis (based on mode)
Phase 3: Schema Validation (if schema specified in prompt)
↓ Read schema → Validate structure → Check field names
Phase 3: Schema Validation (MANDATORY if schema specified)
↓ Read schema → Extract EXACT field names → Validate structure
Phase 4: Output Generation
↓ Agent report + File output (if required)
↓ Agent report + File output (strictly schema-compliant)
```
---
@@ -100,21 +100,34 @@ RULES: {from prompt, if template specified} | analysis=READ-ONLY
## Phase 3: Schema Validation
**⚠️ MANDATORY if schema file specified in prompt**
### ⚠️ CRITICAL: Schema Compliance Protocol
**Step 1**: `Read()` schema file specified in prompt
**This phase is MANDATORY when schema file is specified in prompt.**
**Step 2**: Extract requirements from schema:
- Required fields and types
- Field naming conventions
- Enum values and constraints
- Root structure (array vs object)
**Step 1: Read Schema FIRST**
```
Read(schema_file_path)
```
**Step 3**: Validate before output:
- Field names exactly match schema
- Data types correct
- All required fields present
- Root structure correct
**Step 2: Extract Schema Requirements**
Parse and memorize:
1. **Root structure** - Is it array `[...]` or object `{...}`?
2. **Required fields** - List all `"required": [...]` arrays
3. **Field names EXACTLY** - Copy character-by-character (case-sensitive)
4. **Enum values** - Copy exact strings (e.g., `"critical"` not `"Critical"`)
5. **Nested structures** - Note flat vs nested requirements
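For illustration only, a minimal hypothetical schema fragment — the field names `report_id`, `created_at`, and `severity` are invented for this sketch, not taken from any real schema file — showing where each of the five items above lives:
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "type": "object",
    "required": ["report_id", "created_at", "severity"],
    "properties": {
      "report_id": { "type": "string" },
      "created_at": { "type": "string", "format": "date-time" },
      "severity": { "type": "string", "enum": ["critical", "high", "medium", "low"] }
    }
  }
}
```
Here the root structure is an array, `required` names three fields, the exact field names are `report_id` / `created_at` / `severity`, the enum values are lowercase, and each item is a flat object.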
**Step 3: Pre-Output Validation Checklist**
Before writing ANY JSON output, verify:
- [ ] Root structure matches schema (array vs object)
- [ ] ALL required fields present at each level
- [ ] Field names EXACTLY match schema (character-by-character)
- [ ] Enum values EXACTLY match schema (case-sensitive)
- [ ] Nested structures follow schema pattern (flat vs nested)
- [ ] Data types correct (string, integer, array, object)
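Continuing the hypothetical schema sketched above, a fragment that would pass every item on this checklist (all values are placeholders): array root, all required fields present, exact field names, lowercase enum value, correct types.
```
[
  {
    "report_id": "rpt-001",
    "created_at": "2025-11-26T07:15:30Z",
    "severity": "high"
  }
]
```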
---
@@ -129,11 +142,13 @@ Brief summary:
### File Output (as specified in prompt)
**Extract from prompt**:
- Output file path
- Schema file path
**⚠️ MANDATORY WORKFLOW**:
**⚠️ MANDATORY**: If a schema is specified, `Read()` it before writing and strictly follow its field names and structure.
1. `Read()` schema file BEFORE generating output
2. Extract ALL field names from schema
3. Build JSON using ONLY schema field names
4. Validate against checklist before writing
5. Write file with validated content
---
@@ -150,23 +165,18 @@ Brief summary:
## Key Reminders
**ALWAYS**:
1. Understand task requirements from prompt
2. If schema specified, `Read()` schema before writing
3. Strictly follow schema field names and structure
4. Include file:line references
5. Attribute discovery source (bash/gemini)
1. Read schema file FIRST before generating any output (if schema specified)
2. Copy field names EXACTLY from schema (case-sensitive)
3. Verify root structure matches schema (array vs object)
4. Match nested/flat structures as schema requires
5. Use exact enum values from schema (case-sensitive)
6. Include ALL required fields at every level
7. Include file:line references in findings
8. Attribute discovery source (bash/gemini)
**NEVER**:
1. Modify any files (read-only agent)
2. Skip schema validation (if specified)
3. Use wrong field names
4. Mix case (e.g., severity must be lowercase)
**COMMON FIELD ERRORS**:
| Wrong | Correct |
|-------|---------|
| `"severity": "Critical"` | `"severity": "critical"` |
| `"code_snippet"` | `"snippet"` |
| `"timestamp"` | `"analysis_timestamp"` |
| `{ "dimension": ... }` | `[{ "dimension": ... }]` |
| `"by_severity": {...}` | `"critical": 2, "high": 4` |
2. Skip schema reading step when schema is specified
3. Guess field names - ALWAYS copy from schema
4. Assume structure - ALWAYS verify against schema
5. Omit required fields

View File

@@ -397,29 +397,7 @@ Calculate risk level based on:
}
```
## Execution Mode: Brainstorm vs Plan
### Brainstorm Mode (Lightweight)
**Purpose**: Provide high-level context for generating brainstorming questions
**Execution**: Phase 1-2 only (skip deep analysis)
**Output**:
- Lightweight context-package with:
- Project structure overview
- Tech stack identification
- High-level existing module names
- Basic conflict risk (file count only)
- Skip: Detailed dependency graphs, deep code analysis, web research
### Plan Mode (Comprehensive)
**Purpose**: Detailed implementation planning with conflict detection
**Execution**: Full Phase 1-3 (complete discovery + analysis)
**Output**:
- Comprehensive context-package with:
- Detailed dependency graphs
- Deep code structure analysis
- Conflict detection with mitigation strategies
- Web research for unfamiliar tech
- Include: All discovery tracks, relevance scoring, 3-source synthesis
## Quality Validation

View File

@@ -397,23 +397,3 @@ function detect_framework_from_config() {
- ✅ All missing tests catalogued with priority
- ✅ Execution time < 30 seconds (< 60s for large codebases)
## Integration Points
### Called By
- `/workflow:tools:test-context-gather` - Orchestrator command
### Calls
- Code-Index MCP tools (preferred)
- ripgrep/find (fallback)
- Bash file operations
### Followed By
- `/workflow:tools:test-concept-enhanced` - Test generation analysis
## Notes
- **Detection-first**: Always check for existing test-context-package before analysis
- **Code-Index priority**: Use MCP tools when available, fallback to CLI
- **Framework agnostic**: Supports Jest, Mocha, pytest, RSpec, etc.
- **Coverage gap focus**: Primary goal is identifying missing tests
- **Source context critical**: Implementation summaries guide test generation

View File

@@ -438,19 +438,35 @@ Task(
- Context Pattern: ${targetFiles.map(f => `@${f}`).join(' ')}
## Expected Deliverables
**MANDATORY**: Before generating any JSON output, read the template example first:
**MANDATORY**: Before generating any JSON output, read the schema first:
- Read: ~/.claude/workflows/cli-templates/schemas/review-dimension-results-schema.json
- Follow the exact structure and field naming from the example
- Extract and follow EXACT field names from schema
1. Dimension Results JSON: ${outputDir}/dimensions/${dimension}.json
- MUST follow example template: ~/.claude/workflows/cli-templates/schemas/review-dimension-results-schema.json
- MUST include: findings array with severity, file, line, description, recommendation
- MUST include: summary statistics (total findings, severity distribution)
- MUST include: cross_references to related findings
**⚠️ CRITICAL JSON STRUCTURE REQUIREMENTS**:
Root structure MUST be array: \`[{ ... }]\` NOT \`{ ... }\`
Required top-level fields:
- dimension, review_id, analysis_timestamp (NOT timestamp/analyzed_at)
- cli_tool_used (gemini|qwen|codex), model, analysis_duration_ms
- summary (FLAT structure), findings, cross_references
Summary MUST be FLAT (NOT nested by_severity):
\`{ "total_findings": N, "critical": N, "high": N, "medium": N, "low": N, "files_analyzed": N, "lines_reviewed": N }\`
Finding required fields:
- id: format \`{dim}-{seq}-{uuid8}\` e.g., \`sec-001-a1b2c3d4\` (lowercase)
- severity: lowercase only (critical|high|medium|low)
- snippet (NOT code_snippet), impact (NOT exploit_scenario)
- metadata, iteration (0), status (pending_remediation), cross_references
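For orientation, a minimal sketch of a compliant dimension-results file under the requirements above. All values are placeholders, `metadata` is left empty, and the schema file itself remains the authoritative source for field names and types:
```
[
  {
    "dimension": "security",
    "review_id": "rev-001",
    "analysis_timestamp": "2025-11-26T07:15:30Z",
    "cli_tool_used": "gemini",
    "model": "placeholder-model",
    "analysis_duration_ms": 42000,
    "summary": {
      "total_findings": 1,
      "critical": 0,
      "high": 1,
      "medium": 0,
      "low": 0,
      "files_analyzed": 12,
      "lines_reviewed": 3400
    },
    "findings": [
      {
        "id": "sec-001-a1b2c3d4",
        "severity": "high",
        "file": "src/example/file.ts",
        "line": 42,
        "description": "Placeholder description of the finding",
        "recommendation": "Placeholder recommendation",
        "snippet": "placeholder code excerpt",
        "impact": "Placeholder impact statement",
        "metadata": {},
        "iteration": 0,
        "status": "pending_remediation",
        "cross_references": []
      }
    ],
    "cross_references": []
  }
]
```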
2. Analysis Report: ${outputDir}/reports/${dimension}-analysis.md
- Human-readable summary with recommendations
- Grouped by severity: critical → high → medium → low
- Include file:line references for all findings
3. CLI Output Log: ${outputDir}/reports/${dimension}-cli-output.txt
- Raw CLI tool output for debugging
- Include full analysis text
@@ -512,17 +528,24 @@ Task(
- Mode: analysis (READ-ONLY)
## Expected Deliverables
**MANDATORY**: Before generating any JSON output, read the template example first:
**MANDATORY**: Before generating any JSON output, read the schema first:
- Read: ~/.claude/workflows/cli-templates/schemas/review-deep-dive-results-schema.json
- Follow the exact structure and field naming from the example
- Extract and follow EXACT field names from schema
1. Deep-Dive Results JSON: ${outputDir}/iterations/iteration-${iteration}-finding-${findingId}.json
- MUST follow example template: ~/.claude/workflows/cli-templates/schemas/review-deep-dive-results-schema.json
- MUST include: root_cause with summary, details, affected_scope, similar_patterns
- MUST include: remediation_plan with approach, steps[], estimated_effort, risk_level
- MUST include: impact_assessment with files_affected, tests_required, breaking_changes
- MUST include: reassessed_severity with severity_change_reason
- MUST include: confidence_score (0.0-1.0)
**⚠️ CRITICAL JSON STRUCTURE REQUIREMENTS**:
Root structure MUST be array: \`[{ ... }]\` NOT \`{ ... }\`
Required top-level fields:
- finding_id, dimension, iteration, analysis_timestamp
- cli_tool_used, model, analysis_duration_ms
- original_finding, root_cause, remediation_plan
- impact_assessment, reassessed_severity, confidence_score, cross_references
All nested objects must follow schema exactly - read schema for field names
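A minimal sketch of a compliant deep-dive result under the requirements above. The nested field names follow the descriptions earlier in this deliverable; the value types (arrays, boolean, the `severity` key inside `reassessed_severity`, the empty `original_finding`) are guesses, all values are placeholders, and the schema file remains authoritative:
```
[
  {
    "finding_id": "sec-001-a1b2c3d4",
    "dimension": "security",
    "iteration": 1,
    "analysis_timestamp": "2025-11-26T07:15:30Z",
    "cli_tool_used": "gemini",
    "model": "placeholder-model",
    "analysis_duration_ms": 42000,
    "original_finding": {},
    "root_cause": {
      "summary": "Placeholder summary",
      "details": "Placeholder details",
      "affected_scope": "placeholder scope",
      "similar_patterns": []
    },
    "remediation_plan": {
      "approach": "Placeholder approach",
      "steps": ["Placeholder step 1", "Placeholder step 2"],
      "estimated_effort": "placeholder estimate",
      "risk_level": "low"
    },
    "impact_assessment": {
      "files_affected": [],
      "tests_required": [],
      "breaking_changes": false
    },
    "reassessed_severity": {
      "severity": "medium",
      "severity_change_reason": "Placeholder reason"
    },
    "confidence_score": 0.8,
    "cross_references": []
  }
]
```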
2. Analysis Report: ${outputDir}/reports/deep-dive-${iteration}-${findingId}.md
- Detailed root cause analysis
- Step-by-step remediation plan

View File

@@ -442,19 +442,35 @@ Task(
- Mode: analysis (READ-ONLY)
## Expected Deliverables
**MANDATORY**: Before generating any JSON output, read the template example first:
**MANDATORY**: Before generating any JSON output, read the schema first:
- Read: ~/.claude/workflows/cli-templates/schemas/review-dimension-results-schema.json
- Follow the exact structure and field naming from the example
- Extract and follow EXACT field names from schema
1. Dimension Results JSON: ${outputDir}/dimensions/${dimension}.json
- MUST follow example template: ~/.claude/workflows/cli-templates/schemas/review-dimension-results-schema.json
- MUST include: findings array with severity, file, line, description, recommendation
- MUST include: summary statistics (total findings, severity distribution)
- MUST include: cross_references to related findings
**⚠️ CRITICAL JSON STRUCTURE REQUIREMENTS**:
Root structure MUST be array: \`[{ ... }]\` NOT \`{ ... }\`
Required top-level fields:
- dimension, review_id, analysis_timestamp (NOT timestamp/analyzed_at)
- cli_tool_used (gemini|qwen|codex), model, analysis_duration_ms
- summary (FLAT structure), findings, cross_references
Summary MUST be FLAT (NOT nested by_severity):
\`{ "total_findings": N, "critical": N, "high": N, "medium": N, "low": N, "files_analyzed": N, "lines_reviewed": N }\`
Finding required fields:
- id: format \`{dim}-{seq}-{uuid8}\` e.g., \`sec-001-a1b2c3d4\` (lowercase)
- severity: lowercase only (critical|high|medium|low)
- snippet (NOT code_snippet), impact (NOT exploit_scenario)
- metadata, iteration (0), status (pending_remediation), cross_references
2. Analysis Report: ${outputDir}/reports/${dimension}-analysis.md
- Human-readable summary with recommendations
- Grouped by severity: critical → high → medium → low
- Include file:line references for all findings
3. CLI Output Log: ${outputDir}/reports/${dimension}-cli-output.txt
- Raw CLI tool output for debugging
- Include full analysis text
@@ -517,17 +533,24 @@ Task(
- Mode: analysis (READ-ONLY)
## Expected Deliverables
**MANDATORY**: Before generating any JSON output, read the template example first:
**MANDATORY**: Before generating any JSON output, read the schema first:
- Read: ~/.claude/workflows/cli-templates/schemas/review-deep-dive-results-schema.json
- Follow the exact structure and field naming from the example
- Extract and follow EXACT field names from schema
1. Deep-Dive Results JSON: ${outputDir}/iterations/iteration-${iteration}-finding-${findingId}.json
- MUST follow example template: ~/.claude/workflows/cli-templates/schemas/review-deep-dive-results-schema.json
- MUST include: root_cause with summary, details, affected_scope, similar_patterns
- MUST include: remediation_plan with approach, steps[], estimated_effort, risk_level
- MUST include: impact_assessment with files_affected, tests_required, breaking_changes
- MUST include: reassessed_severity with severity_change_reason
- MUST include: confidence_score (0.0-1.0)
**⚠️ CRITICAL JSON STRUCTURE REQUIREMENTS**:
Root structure MUST be array: \`[{ ... }]\` NOT \`{ ... }\`
Required top-level fields:
- finding_id, dimension, iteration, analysis_timestamp
- cli_tool_used, model, analysis_duration_ms
- original_finding, root_cause, remediation_plan
- impact_assessment, reassessed_severity, confidence_score, cross_references
All nested objects must follow schema exactly - read schema for field names
2. Analysis Report: ${outputDir}/reports/deep-dive-${iteration}-${findingId}.md
- Detailed root cause analysis
- Step-by-step remediation plan